Get Recced!

App Recommendations using Reviews and Ratings!

In this project we examined how recommender systems perform (better or worse) when they take advantage of review texts in addition to review ratings. We aimed to combine latent rating factors with latent review topics and analyze the results. Our assumption was that combining the review text with ratings would help the recommender system make better predictions. We therefore compared different models on the Amazon Apps dataset and computed their RMSE.


Background


Recommender systems mainly rely on human feedback in the form of ratings and reviews. As a result, they suffer from the cold-start problem: a new user or item has very little feedback available, which makes that initial feedback extremely valuable. Intuitively, one review gives a lot more information about a user than one rating does.

In spite of the wealth of research on modeling ratings, the other form of feedback present on review websites—namely, the reviews themselves—is typically ignored. In our opinion, ignoring this rich source of information is a major shortcoming of existing work on recommender systems. Indeed, if our goal is to understand (rather than merely predict) how users rate products, we ought to rely on reviews, whose very purpose is for users to explain why they rated a product the way they did.[1]




Dataset


Amazon Apps

We used the “Apps for Android” 5-core dataset, which contains both product ratings and reviews. The dataset contains 752,937 entries with 87,271 users and 13,209 products (mobile apps). The data is from Amazon.com and was collected by Julian McAuley, UCSD. Below is a snippet of the data:

{
    "reviewerID": "A1N4O8VOJZTDVB", 
    "asin": "B004A9SDD8", 
    "reviewerName": "Annette Yancey", 
    "helpful": [1, 1], 
    "reviewText": "Loves the song, so he really couldn't wait to play this. A little less interesting for him so he doesn't play long, but he is almost 3 and likes to play the older games, but really cute for a younger child.", 
    "overall": 3.0, 
    "summary": "Really cute", 
    "unixReviewTime": 1383350400, 
    "reviewTime": "11 2, 2013"
}

{
    "reviewerID": "A2HQWU6HUKIEC7", 
    "asin": "B004A9SDD8", 
    "reviewerName": "Audiobook lover \"Kathy\"", 
    "helpful": [0, 0], 
    "reviewText": "Oh, how my little grandson loves this app. He's always asking for \"Monkey.\" Grandma has tired of it long before he has. Finding the items on each page that he can touch and activate is endlessly entertaining for him, at least for now. Well worth the $.99.", "overall": 5.0, "summary": "2-year-old loves it", 
    "unixReviewTime": 1323043200, 
    "reviewTime": "12 5, 2011"
}
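
The file ships as one JSON object per line. As an illustration, a minimal loading sketch (assuming the 5-core file is named `reviews_Apps_for_Android_5.json`, as in McAuley's distribution) could look like this:

```python
import json

import pandas as pd

# Minimal loading sketch: the 5-core file contains one JSON object per line,
# so we parse it line by line into a DataFrame. The file name is an assumption.
def load_reviews(path="reviews_Apps_for_Android_5.json"):
    records = []
    with open(path) as f:
        for line in f:
            records.append(json.loads(line))
    return pd.DataFrame(records)

reviews = load_reviews()
print(reviews[["reviewerID", "asin", "overall"]].head())
print(reviews["reviewerID"].nunique(), "users,", reviews["asin"].nunique(), "apps")
```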


Analysis




Methodology


Preprocessing

Experimentation

We compared and evaluated the following recommendation models:

Global Average Model

We started with this approach to see what results a model as straightforward as this one would give. Here, we predict the global average rating for every user-item pair.

Formula:

$\hat{r}_{ui} = \mu = \frac{1}{|\mathcal{R}|} \sum_{(u,i) \in \mathcal{R}} r_{ui}$
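
As a small sketch (assuming the `reviews` DataFrame from the loading snippet above and a simple random train/test split, which may differ from our actual split), the global average predictor and its RMSE can be computed like this:

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical 80/20 split of the reviews DataFrame loaded earlier.
train, test = train_test_split(reviews, test_size=0.2, random_state=42)

# Global average model: predict the training-set mean rating for every pair.
mu = train["overall"].mean()
predictions = np.full(len(test), mu)
rmse = np.sqrt(mean_squared_error(test["overall"], predictions))
print(f"global mean = {mu:.3f}, RMSE = {rmse:.3f}")
```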


Baseline Model

This model is an improvement over the first one. In addition to the global average, we also consider a user bias and an item bias when making the prediction.

$\hat{r}_{ui} = \mu + b_u + b_i$
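
A minimal sketch of this baseline follows, assuming the same train/test split as above and estimating each bias as an average deviation from the global mean (a regularized least-squares fit is another common choice):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Simple bias baseline: b_u and b_i estimated as average deviations from mu.
# `train`, `test`, and `mu` carry over from the global-average sketch above.
user_bias = train.groupby("reviewerID")["overall"].mean() - mu
item_bias = train.groupby("asin")["overall"].mean() - mu

def predict_baseline(user, item):
    # Unseen users/items fall back to a bias of zero (cold start).
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)

preds = [predict_baseline(u, i) for u, i in zip(test["reviewerID"], test["asin"])]
rmse = np.sqrt(mean_squared_error(test["overall"], preds))
print(f"baseline RMSE = {rmse:.3f}")
```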


Collaborative Filtering

In this approach we considered user-user collaborative filtering. We used Pearson correlation to calculate the similarity between two users, and a 10-nearest-neighbor approach to predict ratings and make recommendations.

$\text{sim}(u,v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}}, \qquad \hat{r}_{ui} = \bar{r}_u + \frac{\sum_{v \in N_{10}(u)} \text{sim}(u,v)\,(r_{vi} - \bar{r}_v)}{\sum_{v \in N_{10}(u)} |\text{sim}(u,v)|}$
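
A simplified sketch of this user-user approach is shown below. It uses a dense pandas pivot table for readability; at the full dataset's scale (87,271 users × 13,209 apps) a sparse representation would be needed, so treat it purely as an illustration of the logic, not our exact implementation.

```python
# User-item rating matrix built from the training split used above.
ratings = train.pivot_table(index="reviewerID", columns="asin", values="overall")
user_means = ratings.mean(axis=1)

def predict_cf(user, item, k=10):
    if user not in ratings.index or item not in ratings.columns:
        return mu  # fall back to the global mean for cold-start cases
    raters = ratings[item].dropna()  # users who rated this item
    # Pearson correlation between the target user and each rater (co-rated items only).
    sims = ratings.loc[raters.index].T.corrwith(ratings.loc[user])
    sims = sims.dropna().drop(user, errors="ignore").nlargest(k)
    if sims.empty or sims.abs().sum() == 0:
        return user_means[user]
    # Weighted average of the neighbours' mean-centered ratings.
    centered = raters[sims.index] - user_means[sims.index]
    return user_means[user] + (sims * centered).sum() / sims.abs().sum()
```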


Latent Dirichlet Allocation (LDA)

Every word in a review text belongs to one or more topics. With LDA we can only observe the text and the words, not the topics themselves. We tried to find these hidden topics in the review texts; the distribution of these topics is shown below. Once we have the word distribution per topic, we calculate the topic distribution for each document. Here, all the reviews of one product are treated as one document; similarly, all the reviews given by one user are treated as one document.

[Figure: LDA topic distributions over the review texts]
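
As an illustration, the per-item topic model can be sketched with scikit-learn's `LatentDirichletAllocation` (gensim would work equally well); the choice of 5 topics and the preprocessing here are assumptions, not necessarily the exact settings we used:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# One "document" per app: concatenate all of its review texts (training split).
item_docs = train.groupby("asin")["reviewText"].apply(lambda s: " ".join(s.astype(str)))

# Bag-of-words features, then LDA with 5 topics (matching the 5 latent factors).
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X = vectorizer.fit_transform(item_docs)
lda = LatentDirichletAllocation(n_components=5, random_state=42)
item_topics = lda.fit_transform(X)  # per-item topic distributions
vocab = vectorizer.get_feature_names_out()

# Top words per topic, to inspect what each hidden topic is about.
for k, weights in enumerate(lda.components_):
    top = [vocab[j] for j in weights.argsort()[-10:][::-1]]
    print(f"topic {k}:", ", ".join(top))
```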


Hidden Factors as Topics (HFT)

The HFT model takes advantage of both ratings and reviews by combining a latent factor model with a Latent Dirichlet Allocation model. For each user and item we learn 5 latent factors, also known as user preferences and product properties. The weights for these factors are learnt by considering the reviews in addition to the global average rating and the biases.

$\hat{r}_{ui} = \alpha + \beta_u + \beta_i + \gamma_u \cdot \gamma_i, \qquad \theta_{i,k} = \frac{\exp(\kappa \gamma_{i,k})}{\sum_{k'} \exp(\kappa \gamma_{i,k'})}$
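
A small sketch of these two core HFT pieces from [1] is given below: the latent-factor rating prediction and the softmax transform that ties an item's factors to its topic distribution. The full model fits both jointly (alternating gradient steps on the rating objective with topic updates), which is omitted here.

```python
import numpy as np

K = 5  # number of latent factors / topics

def predict_hft(alpha, beta_u, beta_i, gamma_u, gamma_i):
    # rec(u, i) = alpha + beta_u + beta_i + gamma_u . gamma_i
    return alpha + beta_u + beta_i + np.dot(gamma_u, gamma_i)

def topic_distribution(gamma_i, kappa=1.0):
    # theta_{i,k} is proportional to exp(kappa * gamma_{i,k}), so the same item
    # parameters explain both the ratings and the review topics.
    z = np.exp(kappa * gamma_i)
    return z / z.sum()
```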


Reverse ASIN Lookup




Results


We compared our HFT model against 5 baseline methods. Each of the baseline methods used has been explained in detail in the [[Methodology]] section.

Evaluation Metric

Our metrics for evaluation are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), the formulas for which are given below:

RMSE:

$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{(u,i)} (\hat{r}_{ui} - r_{ui})^2}$

MAE:

$\text{MAE} = \frac{1}{N} \sum_{(u,i)} |\hat{r}_{ui} - r_{ui}|$
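
Both metrics can be computed directly with scikit-learn, for example:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

def evaluate(y_true, y_pred):
    """Return (RMSE, MAE) for lists of true and predicted ratings."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    mae = mean_absolute_error(y_true, y_pred)
    return rmse, mae

print(evaluate([5, 3, 4], [4.5, 3.5, 4.0]))
```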




Findings

[Table: RMSE and MAE results for all compared models]




Discussion


Challenges

Takeaways

Future Work

Ethical Considerations




Technologies Used


CSCE 670 - Information Storage and Retrieval is a graduate course in Computer Science and Engineering (CSE) at Texas A&M University, so writing code was naturally a required part of the submission.

Below we outline, at a high level, which programming languages we used, as well as which libraries we used and for what purpose.

Programming Languages

Python

C++

Libraries/Packages

| Library | Usage |
| --- | --- |
| Pandas | Reading and writing data in ‘dataframes’ in Python |
| SkLearn, SciPy, NumPy | Matrix operations; calculation of RMSE, MAE, Pearson correlation, etc. |
| Surprise | Implementation of the SVD approach (see sketch below) |
| Matplotlib | Plotting data for analysis |
| Requests | Hitting URLs for ASIN lookup |
| BeautifulSoup | HTML parsing |
| pyLDAvis | Visualization of LDA topics |
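
For completeness, here is a hedged sketch of how the SVD approach can be run with Surprise; the `reviews` DataFrame and the choice of 5 factors are assumptions carried over from the earlier snippets, not necessarily the exact settings we used.

```python
from surprise import SVD, Dataset, Reader, accuracy
from surprise.model_selection import train_test_split

# Ratings in this dataset are on a 1-5 scale.
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(reviews[["reviewerID", "asin", "overall"]], reader)
trainset, testset = train_test_split(data, test_size=0.2, random_state=42)

algo = SVD(n_factors=5)  # 5 latent factors, mirroring the HFT setup
algo.fit(trainset)
predictions = algo.test(testset)
accuracy.rmse(predictions)
accuracy.mae(predictions)
```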




Poster


References


[1] Julian McAuley and Jure Leskovec. 2013. Hidden factors and hidden topics: understanding rating dimensions with review text. In Proceedings of the 7th ACM conference on Recommender systems (RecSys ‘13). ACM, New York, NY, USA, 165-172. DOI=http://dx.doi.org/10.1145/2507157.2507163

[2] Yu, Kuifei, et al. “Towards personalized context-aware recommendation by mining context logs through topic models.” Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Berlin, Heidelberg, 2012.

[3] Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. 2009.

[4] Yehuda Koren. Factor in the neighbors: scalable and accurate collaborative filtering. 2010. URL: http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf.

[5] Breese, John S., David Heckerman, and Carl Kadie. “Empirical analysis of predictive algorithms for collaborative filtering.” Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc., 1998. URL: https://dl.acm.org/citation.cfm?id=2074100