Tweets Classification Using Contextual Knowledge and Boosting
Understanding the data is one of the key issues for successful text categorization. Text classification based on the bag-of-words representation usually yields sparse, incomplete, and noisy data, owing to the tedious nature of labeling and the difficulty of discovering knowledge from unstructured text. Unfortunately, low-quality features in the training data are likely to degrade the classification results. This work combines classification algorithms, namely TF-IDF, SVM, and Naïve Bayes, with contextual knowledge and boosting to identify tweets expressing fanaticism on Twitter. Our empirical results show that performance improves in terms of accuracy and recall.
Keywords - Boosting, Classification, Data Mining, Explanation Patterns, Text Mining