Blog Comments Sentiment Analysis for Estimating Filipino ISP Customer Satisfaction
Blog comments have become one of the most common means for people to express and exchange personal opinions. This study proposed and developed a system for automated opinion retrieval from blog comments to estimate customer satisfaction with three main Filipino Internet Service Providers (ISPs). Data were first gathered from comments on some of the most popular blog sites discussing the Filipino ISPs. Automatic word seeding, N-gram tokenization, stemming, and other Sentiment Analysis (SA) techniques were applied to extract useful information from the textual data. The collected data were manually labeled to establish ground truth. The study also experimented with bag-of-words and rule-based methods for identifying the polarity of sentences in the blog comments. In addition, Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers, N-gram features, and preprocessing features such as stopwords and a stemmer were applied and tested. The experimental results showed that N-grams have a significant positive effect on the performance of the proposed automated classifier. In contrast, preprocessing features such as stopwords and a stemmer did little to improve its performance. Applying the SVM learner with bi-gram features in the proposed model resulted in higher classification performance, far above the baseline classification performance reported in previous research. Adding the rule-based method can also help considerably in automatically determining the sentiment of blog sentences. These results indicate that interested parties can gain a sense of their customers' sentiments by applying automated sentiment analysis to blog comments.
Index Terms- Bag of Words, Blogs, Naïve Bayes, Rule-based, Sentiment Analysis, Support Vector Machines, Stopwords, Stemmer, Translator Machine.
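
As a rough illustration of two techniques named in the abstract, the following minimal Python sketch shows N-gram tokenization and a bag-of-words polarity score combined with a single negation rule. It is not the study's implementation: the word lists, the function names (`tokenize`, `ngrams`, `polarity`), and the negation rule are all assumptions made for this example, since the abstract does not specify the lexicon or the rules actually used.

```python
from typing import List

# Hypothetical seed lexicon, negators, and stopwords (illustration only;
# the study's actual word lists are not given in the abstract).
POSITIVE = {"fast", "reliable", "good"}
NEGATIVE = {"slow", "bad", "expensive"}
NEGATORS = {"not", "never", "no"}
STOPWORDS = {"the", "is", "a", "my", "and"}


def tokenize(sentence: str, remove_stopwords: bool = False) -> List[str]:
    """Lowercase, strip surrounding punctuation, optionally drop stopwords."""
    tokens = [w.strip(".,!?;:\"'()") for w in sentence.lower().split()]
    tokens = [t for t in tokens if t]
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return contiguous n-grams, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def polarity(sentence: str) -> str:
    """Bag-of-words score plus one assumed rule: a negator flips the next word."""
    tokens = tokenize(sentence)
    score = 0
    for i, tok in enumerate(tokens):
        value = int(tok in POSITIVE) - int(tok in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value  # e.g. "not reliable" counts as negative
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sentence = "The connection is not reliable"
    print(ngrams(tokenize(sentence), 2))
    print(polarity(sentence))
```

In a learning-based setup such as the NB and SVM classifiers mentioned above, the strings returned by `ngrams` would typically serve as features in place of, or alongside, single words. Keeping "not reliable" together as one bi-gram is one plausible reason N-grams could help more than stopword removal, although the abstract does not state the cause.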