org.grouplens.lenskit.knn.item (LensKit 2.2.1)

Interface Summary
Interface	Description
ItemItemModelBackedGlobalScorer	The global scorer for global recommendation, backed by an item-item model.
ItemScoreAlgorithm	Algorithm for scoring items given an item-item model and neighborhood scorer.
ItemSimilarity	Compute the similarity between two items.
NeighborhoodScorer	Compute scores from neighborhoods and score vectors.

Class Summary
Class	Description
DefaultItemScoreAlgorithm	Default item scoring algorithm.
ItemItemGlobalScorer	Score items based on the basket of items using an item-item CF model.
ItemItemScorer	Score items using an item-item CF model.
ItemVectorSimilarity	Implementation of `ItemSimilarity` that delegates to a vector similarity.
SimilaritySumNeighborhoodScorer	Neighborhood scorer that computes the sum of neighborhood similarities.
WeightedAverageNeighborhoodScorer	Neighborhood scorer that computes the weighted average of neighbor scores.

Annotation Types Summary
Annotation Type	Description
ItemSimilarityThreshold	Qualifier for threshold applied to item similarities.
ModelSize	Number of neighbors to retain in the similarity matrix.

Package org.grouplens.lenskit.knn.item Description

Implementation of item-item collaborative filtering.

The item-item CF implementation is built from several pieces. The model builder takes the rating data and several parameters and components, such as the similarity function and model size, and computes the similarity matrix. The scorer uses this model to score items.

The basic idea of item-item CF is to compute similarities between items, typically based on the users that have rated them, and then recommend items similar to the items that a user likes. Once the similarities are computed, the model is truncated to save space: only the ModelSize most similar items are retained for each item. Neighborhoods are further truncated when doing recommendation; only the NeighborhoodSize most similar items that a user has rated are used to score any given item. ModelSize is typically larger than NeighborhoodSize, so that enough of an item's retained neighbors fall among the items a given user has rated; this improves the ability of the recommender to find neighbors.
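The two truncation steps can be illustrated with a minimal, self-contained sketch. It does not use the LensKit API: the class `ItemItemSketch`, its methods, and the plain `Map`-based model row are hypothetical names chosen for illustration, and the weighted-average scoring shown is only one plausible choice of neighborhood scorer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of LensKit.
final class ItemItemSketch {
    /** Keep only the modelSize most similar neighbors of one item (one model row). */
    static Map<Long, Double> truncateRow(Map<Long, Double> row, int modelSize) {
        List<Map.Entry<Long, Double>> entries = new ArrayList<>(row.entrySet());
        // Sort neighbors by decreasing similarity.
        entries.sort(Map.Entry.<Long, Double>comparingByValue().reversed());
        Map<Long, Double> kept = new LinkedHashMap<>(); // preserves the sorted order
        int limit = Math.min(modelSize, entries.size());
        for (Map.Entry<Long, Double> e : entries.subList(0, limit)) {
            kept.put(e.getKey(), e.getValue());
        }
        return kept;
    }

    /**
     * Score one item from its (sorted, truncated) model row, using at most
     * neighborhoodSize neighbors that the user has actually rated.
     * Returns NaN when no rated neighbor is available.
     */
    static double score(Map<Long, Double> sortedRow, Map<Long, Double> userRatings,
                        int neighborhoodSize) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        int used = 0;
        for (Map.Entry<Long, Double> e : sortedRow.entrySet()) {
            if (used >= neighborhoodSize) {
                break;
            }
            Double rating = userRatings.get(e.getKey());
            if (rating == null) {
                continue; // the user has not rated this neighbor; skip it
            }
            weightedSum += e.getValue() * rating;
            weightTotal += Math.abs(e.getValue());
            used++;
        }
        return weightTotal > 0 ? weightedSum / weightTotal : Double.NaN;
    }

    public static void main(String[] args) {
        Map<Long, Double> row = new HashMap<>();
        row.put(2L, 0.9);
        row.put(3L, 0.5);
        row.put(4L, 0.1);
        row.put(5L, 0.7);
        Map<Long, Double> ratings = new HashMap<>();
        ratings.put(3L, 4.0);
        ratings.put(5L, 2.0);
        System.out.println(score(truncateRow(row, 3), ratings, 2));
    }
}
```

In this sketch the most similar retained neighbor (item 2) is one the user has not rated, so it is skipped. This is the situation in which a model larger than the neighborhood helps, because the scorer can fall back on lower-ranked neighbors.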

When the similarity function is asymmetric (that is, \(s(i,j)=s(j,i)\) does not hold in general), some care is needed to make sure that the function is used in the correct direction. Following Deshpande and Karypis, we use the similarity function as \(s(j,i)\), where \(j\) is the item the user has purchased or rated and \(i\) is the item that is going to be scored. This value is then stored in row \(i\) and column \(j\) of the matrix. Rows are then truncated (so we retain the ModelSize most similar items for each \(i\)); this direction differs from Deshpande and Karypis, as row truncation is more efficient and simpler to write within LensKit's item-item algorithm structure, and performs better in offline tests against the MovieLens 1M data set (see writeup). Computation against a particular item the user has rated is done down that item's column.
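To make the storage direction concrete, here is a small sketch that fills a dense matrix so that entry \((i, j)\) holds \(s(j, i)\). Everything in it is an assumption made for illustration: the class `AsymmetricModelSketch`, the dense `double[][]` layout, and the toy similarity function (the fraction of the first item's users who also rated the second) are not taken from LensKit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of LensKit.
final class AsymmetricModelSketch {
    /**
     * Toy asymmetric similarity s(a, b): the fraction of a's users who also
     * rated b. In general s(a, b) != s(b, a).
     */
    static double similarity(Set<Long> usersOfA, Set<Long> usersOfB) {
        if (usersOfA.isEmpty()) {
            return 0.0;
        }
        int common = 0;
        for (Long user : usersOfA) {
            if (usersOfB.contains(user)) {
                common++;
            }
        }
        return (double) common / usersOfA.size();
    }

    /**
     * Build a matrix with m[i][j] = s(j, i): row i is the item to be scored,
     * column j is an item the user may have rated. The diagonal stays 0.
     */
    static double[][] buildMatrix(List<Set<Long>> itemUsers) {
        int n = itemUsers.size();
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i != j) {
                    m[i][j] = similarity(itemUsers.get(j), itemUsers.get(i));
                }
            }
        }
        return m;
    }

    public static void main(String[] args) {
        List<Set<Long>> itemUsers = Arrays.asList(
                new HashSet<Long>(Arrays.asList(1L, 2L, 3L, 4L)), // users of item 0
                new HashSet<Long>(Arrays.asList(1L, 2L)));        // users of item 1
        System.out.println(Arrays.deepToString(buildMatrix(itemUsers)));
    }
}
```

With this layout, truncating row \(i\) keeps the best candidate neighbors for scoring item \(i\), while all similarities involving a particular rated item \(j\) sit in column \(j\).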

The scorers and recommenders actually operate on a generic ItemItemModel, so the item-based scoring algorithm can be used against other sources of similarity, such as similarities stored in a database or text index.
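The value of scoring against a generic model can be sketched as follows. The interface `NeighborSource` below is a hypothetical stand-in and is not the signature of LensKit's ItemItemModel; `InMemoryNeighborSource` and `SimilaritySumSketch` are likewise illustrative names. The scoring method depends only on the interface, so an implementation backed by a database or text index could be substituted without changing it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of LensKit.
final class SimilaritySumSketch {
    /** Score an item as the sum of its similarities to the items in the basket. */
    static double score(NeighborSource source, long item, Set<Long> basket) {
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : source.neighbors(item).entrySet()) {
            if (basket.contains(e.getKey())) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Long, Double> row = new HashMap<>();
        row.put(1L, 0.5);
        row.put(2L, 0.25);
        Map<Long, Map<Long, Double>> rows = new HashMap<>();
        rows.put(10L, row);
        Set<Long> basket = new HashSet<>();
        basket.add(1L);
        System.out.println(score(new InMemoryNeighborSource(rows), 10L, basket));
    }
}

/** Hypothetical abstraction: any source that can list an item's neighbors. */
interface NeighborSource {
    /** Returns neighbor item IDs mapped to similarities; empty if the item is unknown. */
    Map<Long, Double> neighbors(long item);
}

/** Simple in-memory source; other implementations could query external storage. */
final class InMemoryNeighborSource implements NeighborSource {
    private final Map<Long, Map<Long, Double>> rows;

    InMemoryNeighborSource(Map<Long, Map<Long, Double>> rows) {
        this.rows = rows;
    }

    @Override
    public Map<Long, Double> neighbors(long item) {
        Map<Long, Double> row = rows.get(item);
        return row == null ? Collections.<Long, Double>emptyMap() : row;
    }
}
```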