Motivation: In recommendation and review systems, users often provide not only overall ratings but also written reviews, which carry richer information. In reviews, users express their sentiments, discuss the aspects they like or dislike, and explain the reasoning behind their ratings. Exploiting such implicit feedback can boost recommendation performance. To this end, the paper models ratings and the sentiments expressed in reviews at the aspect level, and proposes a probabilistic model based on topic modeling and collaborative filtering that achieves superior performance.
Fig. 1 Rating and review model in a per-aspect way. It consists of two parts: modeling ratings and modeling reviews.
Approaches: Given a user and a movie, the task is to predict (i) the observed rating and (ii) the review. (i) For ratings, they assume that any observed overall rating is generated from individual aspect-specific ratings, such as ratings for actors, plot, and visual effects. Different aspects (actors, plot, visual effects) may carry different importance weights; a larger weight means the user cares about that aspect and the movie emphasizes it. Simply put, a high overall rating arises from a good match between the user's expectations and the movie's quality on the aspects the user cares about most. (ii) When writing a review, the user may discuss movie-specific content and also express aspect-specific judgements (sentiments). To capture this variety, they assume the review language model contains five components (a toy sketch of this decomposition follows the list):
1. A background language component;
2. A background sentiment language component;
3. A movie-specific language component;
4. An aspect-specific language component;
5. An aspect-specific sentiment language component.
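The paper's actual generative process involves more machinery (switching variables, sentiment priors, inference by sampling), so the snippet below is only a minimal sketch of the two ideas above: the overall rating as an interest-weighted combination of aspect ratings, and review words drawn from a mixture of the five language components. All names and numbers (aspect/vocabulary sizes, mixing weights, the distributions themselves) are illustrative placeholders, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: K aspects, vocabulary of V words (sizes are illustrative only).
K, V = 5, 1000
user_interest = rng.dirichlet(np.ones(K))      # how much the user cares about each aspect
movie_quality = rng.normal(3.5, 1.0, size=K)   # movie's aspect-specific ratings

# (i) Overall rating: aspect ratings weighted by the user's interest.
# A high overall rating requires the movie to score well on the aspects the user cares about.
overall_rating = float(user_interest @ movie_quality)

# (ii) Review words: each word comes from one of the five language components.
components = {
    "background":      rng.dirichlet(np.ones(V)),
    "background_sent": rng.dirichlet(np.ones(V)),
    "movie_specific":  rng.dirichlet(np.ones(V)),
    "aspect":          rng.dirichlet(np.ones(V), size=K),  # one distribution per aspect
    "aspect_sent":     rng.dirichlet(np.ones(V), size=K),
}
switch_probs = np.array([0.3, 0.1, 0.2, 0.25, 0.15])       # mixing weights over the 5 sources

def generate_word():
    """Pick a component, then draw a word id from that component's distribution."""
    src = rng.choice(5, p=switch_probs)
    name = list(components)[src]
    dist = components[name]
    if dist.ndim == 2:                              # aspect-dependent component:
        aspect = rng.choice(K, p=user_interest)     # favored aspects come up more often
        dist = dist[aspect]
    return name, int(rng.choice(V, p=dist))

review = [generate_word() for _ in range(20)]
print(round(overall_rating, 2), review[:3])
```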
Fig. 1 illustrates the entire framework of their probabilistic model: the user's interests and the movie's relevance to each aspect are combined to model the final rating and the review content. Their model is called "JMARS".
Experiments: The experiments are conducted on a dataset collected from IMDb, a well-known movie review website; in total, 50k movies along with their reviews are crawled. They use 80% of the data for training, 10% for validation, and 10% for testing. Fig. 2 compares the models in terms of perplexity: with the factor size set to 5 or 10 (i.e., 5 or 10 aspects modeled per movie), JMARS consistently outperforms the HFT approach. Fig. 3 shows the MSE comparison, which also demonstrates JMARS's strong performance (a minimal sketch of these two metrics follows the figures).
Fig. 2 Comparison of models using perplexity.
Fig. 3 Comparison of models in terms of MSE.
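For reference, perplexity measures how well the learned review language model predicts held-out review text, and MSE measures rating-prediction error (lower is better for both). The sketch below only shows how such metrics are typically computed; the held-out word log-probabilities and ratings are placeholder values, not results from the paper.

```python
import numpy as np

def perplexity(word_log_probs):
    """exp of the negative mean log-likelihood per held-out word (lower is better)."""
    return float(np.exp(-np.mean(word_log_probs)))

def mse(predicted, actual):
    """Mean squared error between predicted and observed ratings (lower is better)."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return float(np.mean((predicted - actual) ** 2))

# Placeholder held-out data, just to show the call pattern.
word_log_probs = np.log(np.full(100, 1 / 2000))   # each word assigned probability 1/2000
print(perplexity(word_log_probs))                 # -> 2000.0
print(mse([3.8, 4.1, 2.5], [4.0, 4.0, 3.0]))      # -> 0.1
```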