2012-02-23

Similarity-based Recommendation Engines

I am currently participating in the Neo4j-Heroku Challenge. My entry is a -- as yet, unfinished -- beer rating and recommendation service called FrostyMug. All the major functionality is complete, except for the actual recommendations, which I am currently working on. I wanted to share some of my thoughts and methods for building the recommendation engine.

My starting point for the recommendation engine was a blog post by Marko Rodriguez, A Graph-Based Movie Recommender Engine. Marko's approach is pretty straightforward and he does an excellent job of explaining the subtleties of collaborative and content-based filtering. It was also a great way to get acquainted with the Gremlin graph traversal API, which I have been unwisely avoiding for fear of it being too complicated.

Basic collaborative filtering works like this: if I rate a beer, say, Lonerider Sweet Josie Brown Ale as 9 out of 10, and you rate that same beer as 8 out of 10, then I should be able to get recommendations based on other beers that you (and others who have rated that beer similarly to me) have rated highly.

I think we can improve on this. You and I happen have rated one beer the same way. What happens if that is the only beer we have rated similarly? What if I have rated Fullsteam Carver Sweet Potato Ale at 8 out of 10, but you have rated it 2 out of 10? You have shown that you have poor taste, so why should I get recommendations based on your other ratings, simply because we happened to rate a single beer similarly?

With a small modification, we can overcome this problem. Instead of basing recommendations off of one similar rating, I can calculate how similarly you and I rated all the things we have rated, and only get recommendations from you if I have determined we are similar enough in our tastes. Call this "similarity-based collaborative filtering."

Finding similar users works like this:
  1. Find all users who have rated the same beers as I have.
  2. For each rated beer, calculate the differences between my rating and that user's rating.
  3. For each user, average the differences.
  4. Return all users where the average difference is less than or equal to some tolerance (2 in the below example.)

Let's see what this looks like in Gremlin, a domain specific graph traversal language in Groovy. Gremlin is built on the concept of "pipes" where a single chain of operations runs from beginning to end on each element put into the pipe. The elements in our case are the user nodes, beer nodes and rating edges that connect them in our graph:
// We're going to use this again and again, so make a single "step" for it,
// to which we can pass a "similarity threshold"
Gremlin.defineStep('similarUser', [Vertex,Pipe], {Integer threshold ->

    // Capture the incoming pipe
    target = _();

    // Set up a table to hold ratings for averages
    m=[:].withDefault{[0,0]};

    // Find all beers rated by the target, and store each rating
    target.outE("RATED").sideEffect{w=it.rating}

    // For each rated beer, find all the users who have also rated that beer
    // and who are not the target user, then go back to their ratings
    .inV.inE("RATED").outV.except(target).back(2)

    // At this point in the pipeline, we are iterating over
    // only the ratings of beers that both users have rated.
    // For each rating, calculate the difference between
    // the target user's rating and the compared user's rating
    .sideEffect{diff=Math.abs(it.rating-w)}

    // For each compared user, store the number of beers rated in common
    // and the total difference for all common ratings. 
    .outV.sideEffect{ me=m[it.id]; me[0]++; me[1]+=diff; }.iterate();

    // Find the average difference for each user and filter
    // out any where the average is greater than two.
    m.findAll{it.value[1]/it.value[0] <= threshold}.collect{g.v(it.key)}._()
});

// Return all the users who are similar to our target
// with an average similarity of 2 or less
g.v(123).similarUser(2)

The result of this script is a list of all user nodes that are considered "similar" to the target user. Now we can get more accurate recommendations, based on users whose tastes are similar to ours. Given users who are similar to me, recommend beers that they rated highly:
  1. Find all beers rated by similar users.
  2. For each beer, average the similar users' ratings.
  3. Sort the beers by highest to lowest average and return the top 25.
r=[:].withDefault{[0,0]};

// Find similar users 
g.v(123).similarUser(2)

// Find all outgoing beer ratings by similar users
// and the beer they are rating
.outE("RATED").sideEffect{rating=it.rating}.inV

// Store the ratings in a map for averaging
.sideEffect{me=r[it.id]; me[0]++; me[1]+=rating; }

// Transform the map into a new map of beer id : average
r.collectEntries{key, value -> [key , value[1]/value[0]]}

// sort by highest average first, then take the top 25
.sort{a,b -> b.value <=> a.value}[0..24]

// Transform the map back into a mapping of beer node and rating
.collect{key,value -> [g.v(key), value]}
Now I only get recommendations from users whose tastes are similar to mine.

I can also calculate an estimated rating for any beer that I have not rated. To see how that might be useful, consider Netflix. When you look at any movie on Netflix that you have not already rated, they show you their best guess for how you would like that movie. This can be helpful in determining which of a set of movies you want to watch, or in our case, which beer you might want to try given limited pocket money. Again, we can use our similar users to figure out the estimated rating of a given beer:
  1. Find all ratings of the target beer by similar users.
  2. Return the average of their ratings of that beer.
// Grab our target beer
beer=g.v(987)

// Find similar users 
g.v(123).similarUser(2)

// For each similar user, find all that have rated the target beer
.outE("RATED").inV.filter{it==beer}

// Go back to their ratings, and average them
.back(2).rating.mean()
So now we have a simple recommendation engine based on ratings by users with similar tastes. There are a few improvements that could still be made:
  • Don't consider a user "similar" unless they have rated at least 10 beers in common (or some other threshold). This prevents users who have only rated a few beers in common from being falsely labeled as similar.
  • Calculate the similar users ahead of time, and store them with a "SIMILAR" relationship to the target user. Periodically recalculate the similar users to keep the list fresh.
  • Use random sampling or size limiting on the number of users and ratings checked when determining similarity, trading accuracy for performance.
Many thanks to Marko for his help on the mailing list, and with giving me some pointers in writing this, and for being a pretty awesome guy in general.

Update: Luanne Misquitta posted about how to do similarity-based recommendations with Cypher. Thanks, Luanne!