Getting Started with the Evaluator
The LensKit evaluator lets you train algorithms over data sets, measure their performance, and cross-validate the results for robustness. This page describes how to get started using the evaluator for a simple experiment.
We will use the LensKit command line tool from the binary distribution to run the experiment. You can also run experiments from more sophisticated build tools such as Gradle.
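If you prefer to drive the evaluation from a build, a minimal sketch of a Gradle wrapper is shown below. It is not part of the LensKit distribution; it simply shells out to the same lenskit command used on this page, and assumes the lenskit script from the binary distribution is on your PATH (the task name evaluate is just an illustration):

// build.gradle -- hypothetical wrapper task around the LensKit command line tool
task evaluate(type: Exec) {
    description 'Run the LensKit evaluation defined in eval.groovy'
    // Assumes `lenskit` from the binary distribution is on the PATH.
    commandLine 'lenskit', 'eval'
}

The rest of this page invokes the command line tool directly.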
Prerequisites
To run the evaluator, you’ll need the following:
- Java 6 or later (Java 7 or later is best).
- The LensKit binary distribution.
- A tool for analyzing the results. For this example, we will use R with ggplot2.
- The MovieLens 100K data set.
Creating the Evaluation Script
The core of an experiment is the evaluation script, typically called eval.groovy:
import org.grouplens.lenskit.ItemScorer
import org.grouplens.lenskit.baseline.*
import org.grouplens.lenskit.eval.metrics.predict.*
import org.grouplens.lenskit.iterative.*
import org.grouplens.lenskit.knn.item.*
import org.grouplens.lenskit.mf.funksvd.*
import org.grouplens.lenskit.transform.normalize.*

trainTest {
    dataset crossfold("ml-100k") {
        source csvfile("ml-100k/u.data") {
            delimiter "\t"
            domain {
                minimum 1.0
                maximum 5.0
                precision 1.0
            }
        }
    }

    algorithm("PersMean") {
        bind ItemScorer to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
    }

    algorithm("ItemItem") {
        bind ItemScorer to ItemItemScorer
        bind UserVectorNormalizer to BaselineSubtractingUserVectorNormalizer
        within (UserVectorNormalizer) {
            bind (BaselineScorer, ItemScorer) to ItemMeanRatingItemScorer
        }
    }

    algorithm("FunkSVD") {
        bind ItemScorer to FunkSVDItemScorer
        bind UserVectorNormalizer to BaselineSubtractingUserVectorNormalizer
        bind (BaselineScorer, ItemScorer) to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
        set FeatureCount to 40
        set LearningRate to 0.002
        set IterationCount to 125
    }

    metric CoveragePredictMetric
    metric RMSEPredictMetric
    metric NDCGPredictMetric

    output "eval-results.csv"
}
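You can compare more configurations by adding further algorithm blocks to the same script; each one contributes its own rows to the output file. As a sketch (not part of the original script), an item-mean baseline could be added using only classes already imported above; the name "ItemMean" is just a label:

algorithm("ItemMean") {
    // Hypothetical extra configuration: score every item by its mean rating.
    bind ItemScorer to ItemMeanRatingItemScorer
}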
Unpack your MovieLens data set (your current directory should have an eval.groovy file and an ml-100k directory), and run the script using the lenskit program from the binary distribution [1]:
$ lenskit eval
This does a few things:
- Splits the MovieLens 100K data set into 5 partitions for cross-validation. These partitions are stored under ml-100k-crossfold. (A sketch of changing this split follows this list.)
- Generates predictions for test user/item pairs using three algorithms: personalized mean, item-item CF, and FunkSVD.
- Evaluates these three algorithms with three metric families: coverage, RMSE, and nDCG.
- Writes the evaluation results to eval-results.csv, one row for each combination of algorithm and fold.
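Five partitions is the crossfolder's default. If you want a different split, the crossfold block takes additional options; the sketch below assumes the eval DSL's partitions option (check the crossfold documentation for your LensKit version before relying on it):

dataset crossfold("ml-100k") {
    source csvfile("ml-100k/u.data") {
        delimiter "\t"
    }
    // Assumed DSL option: build 10 train/test partitions instead of the default 5.
    partitions 10
}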
Analyzing the Output
LensKit produces a CSV file containing the evaluation results. This can be analyzed with your choice of tool, for example R:
library(ggplot2)
eval.results = read.csv('eval-results.csv')
png('results.png')
qplot(Algorithm, RMSE.ByUser, data=eval.results, geom='boxplot')
dev.off()
This will produce a box plot of per-user RMSE, saved as results.png.
Further Reading
- This whole project can be cloned from GitHub.
- Walk through the eval script
- Example code integrating a LensKit evaluation with custom Java code
[1] Without any options, the eval LensKit command runs the evaluation defined in the file eval.groovy. If you want to use another file name, specify it with -f file.groovy, just like make. This is useful for having multiple different evaluations in the same directory.