Rating prediction on Yelp Academic Dataset using Paragraph Vectors
This work studies the application of Paragraph Vectors to the reviews in the Yelp Academic Dataset in order to predict user ratings for different categories of businesses, such as auto repair shops, restaurants, and veterinarians. Paragraph Vectors is an embedding technique in which each word or piece of text is mapped to a vector in a continuous, low-dimensional space. Opinion mining, or sentiment analysis, is then treated as a classification task in which each user review is associated with a label (its rating), and a probabilistic model is built with a logistic classifier. Following the intuition that the semantic information present in textual user reviews is generally more complex and complete than the numeric rating itself, this work successfully applies Paragraph Vectors to the Yelp dataset and evaluates the results.
Index terms - Prediction, Paragraph Vectors, Learning-to-Rank, Dimensionality Reduction.
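To make the classification step described in the abstract concrete, the following is a minimal sketch of a logistic classifier that maps a fixed-length review vector to a star rating. It is not the implementation used in this work. It assumes a multinomial (softmax) formulation over ratings 1 to 5, trained by plain stochastic gradient descent, and it uses hand-made two-dimensional toy vectors in place of real Paragraph Vectors. The names `train_logistic`, `predict_proba`, `predict_rating`, `toy_vectors`, and `toy_ratings` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear score of vector x for every rating class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def predict_proba(weights, biases, x):
    """Probability of each rating (index 0 is rating 1) for vector x."""
    return softmax(class_scores(weights, biases, x))


def predict_rating(weights, biases, x):
    """Most probable rating, as an integer from 1 to n_classes."""
    probs = predict_proba(weights, biases, x)
    return probs.index(max(probs)) + 1


def train_logistic(vectors, ratings, n_classes=5, lr=0.5, epochs=200):
    """Fit softmax regression with stochastic gradient descent.

    Assumption: ratings are integers in 1..n_classes and every vector
    has the same length.
    """
    dim = len(vectors[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, rating in zip(vectors, ratings):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. score k.
                err = probs[k] - (1.0 if k == rating - 1 else 0.0)
                biases[k] -= lr * err
                for j in range(dim):
                    weights[k][j] -= lr * err * x[j]
    return weights, biases


# Toy stand-ins for review embeddings (NOT real Paragraph Vectors).
toy_vectors = [[-1.0, -0.2], [-0.9, 0.1],   # negative reviews
               [0.0, 1.0], [0.1, 0.9],      # neutral reviews
               [1.0, -0.1], [0.9, 0.2]]     # positive reviews
toy_ratings = [1, 1, 3, 3, 5, 5]

weights, biases = train_logistic(toy_vectors, toy_ratings)
predictions = [predict_rating(weights, biases, x) for x in toy_vectors]
```

In a real pipeline the toy vectors would be replaced by the vectors that a Paragraph Vectors model produces for each review, and the classifier would be evaluated on held-out reviews rather than on its own training data.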