Getting Started with the Evaluator
The LensKit evaluator lets you train algorithms over data sets, measure their performance, and cross-validate the results for robustness. This page describes how to get started using the evaluator for a simple experiment.
We will use the LensKit command line tool from the binary distribution to run the experiment. You can also run experiments from more sophisticated build tools such as Gradle.
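If you prefer to drive the evaluation from a build, a minimal sketch of a Gradle wrapper is shown below. It is not part of the LensKit distribution; it simply shells out to the same lenskit command used on this page, and assumes the lenskit script from the binary distribution is on your PATH (the task name evaluate is just an illustration):

// build.gradle -- hypothetical wrapper task around the LensKit command line tool
task evaluate(type: Exec) {
    description 'Run the LensKit evaluation defined in eval.groovy'
    // Assumes `lenskit` from the binary distribution is on the PATH.
    commandLine 'lenskit', 'eval'
}

The rest of this page invokes the command line tool directly.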
Prerequisites
To run the evaluator, you’ll need the following:
- Java 6 or later (Java 7 or later is best).
- The LensKit binary distribution.
- A tool for analyzing the results. For this example, we will use R with ggplot2.
- The MovieLens 100K data set.
Creating the Evaluation Script
The core of an experiment is the evaluation script, typically called eval.groovy:
import org.grouplens.lenskit.ItemScorer
import org.grouplens.lenskit.baseline.*
import org.grouplens.lenskit.eval.metrics.predict.*
import org.grouplens.lenskit.iterative.*
import org.grouplens.lenskit.knn.item.*
import org.grouplens.lenskit.mf.funksvd.*
import org.grouplens.lenskit.transform.normalize.*

trainTest {
    dataset crossfold("ml-100k") {
        source csvfile("ml-100k/u.data") {
            delimiter "\t"
            domain {
                minimum 1.0
                maximum 5.0
                precision 1.0
            }
        }
    }

    algorithm("PersMean") {
        bind ItemScorer to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
    }

    algorithm("ItemItem") {
        bind ItemScorer to ItemItemScorer
        bind UserVectorNormalizer to BaselineSubtractingUserVectorNormalizer
        within (UserVectorNormalizer) {
            bind (BaselineScorer, ItemScorer) to ItemMeanRatingItemScorer
        }
    }

    algorithm("FunkSVD") {
        bind ItemScorer to FunkSVDItemScorer
        bind UserVectorNormalizer to BaselineSubtractingUserVectorNormalizer
        bind (BaselineScorer, ItemScorer) to UserMeanItemScorer
        bind (UserMeanBaseline, ItemScorer) to ItemMeanRatingItemScorer
        set FeatureCount to 40
        set LearningRate to 0.002
        set IterationCount to 125
    }

    metric CoveragePredictMetric
    metric RMSEPredictMetric
    metric NDCGPredictMetric

    output "eval-results.csv"
}
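You can compare more configurations by adding further algorithm blocks to the same script; each one contributes its own rows to the output file. As a sketch (not part of the original script), an item-mean baseline could be added using only classes already imported above; the name "ItemMean" is just a label:

algorithm("ItemMean") {
    // Hypothetical extra configuration: score every item by its mean rating.
    bind ItemScorer to ItemMeanRatingItemScorer
}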
Unpack your MovieLens data set (your current directory should have an eval.groovy file and an ml-100k directory), and run the script using the lenskit program from the binary distribution [1]:
$ lenskit eval
This does a few things:
- Splits the MovieLens 100K data set into 5 partitions for cross-validation. These partitions are stored under ml-100k-crossfold. (A sketch of changing this split follows this list.)
- Generates predictions for test user/item pairs using three algorithms: personalized mean, item-item CF, and FunkSVD.
- Evaluates these three algorithms with three metric families: coverage, RMSE, and nDCG.
- Writes the evaluation results to eval-results.csv, one row for each combination of algorithm and fold.
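Five partitions is the crossfolder's default. If you want a different split, the crossfold block takes additional options; the sketch below assumes the eval DSL's partitions option (check the crossfold documentation for your LensKit version before relying on it):

dataset crossfold("ml-100k") {
    source csvfile("ml-100k/u.data") {
        delimiter "\t"
    }
    // Assumed DSL option: build 10 train/test partitions instead of the default 5.
    partitions 10
}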
Analyzing the Output
LensKit produces a CSV file containing the evaluation results. This can be analyzed with your choice of tool, for example R:
library(ggplot2)
eval.results = read.csv('eval-results.csv')
png('results.png')
qplot(Algorithm, RMSE.ByUser, data=eval.results, geom='boxplot')
dev.off()
This will produce a box plot of per-user RMSE, saved as results.png.
Further Reading
- This whole project can be cloned from GitHub.
- Walk through the eval script
- Example code integrating a LensKit evaluation with custom Java code
[1] Without any options, the eval LensKit command runs the evaluation defined in the file eval.groovy. If you want to use another file name, specify it with -f file.groovy, just like make. This is useful for having multiple different evaluations in the same directory.