| Constructor and Description |
|---|
| CrossfoldTask() |
| CrossfoldTask(String n) |

| Modifier and Type | Method and Description |
|---|---|
| protected void | createTTFiles()<br>Write train-test split files. |
| protected File[] | getFiles(String pattern)<br>Get the list of files satisfying the specified name pattern. |
| boolean | getForce() |
| Holdout | getHoldout() |
| boolean | getIsolate()<br>Query whether this task will produce isolated data sets. |
| CrossfoldMethod | getMethod()<br>Get the method to be used for crossfolding. |
| String | getName()<br>Get the visible name of this crossfold split. |
| int | getPartitionCount()<br>Get the number of folds. |
| int | getSampleSize() |
| DataSource | getSource()<br>Get the data source backing this crossfold manager. |
| String | getTestPattern() |
| String | getTrainPattern() |
| List<TTDataSet> | getTTFiles()<br>Get the train-test splits as data sets. |
| boolean | getWriteTimestamps()<br>Query whether timestamps will be written. |
| protected DataSource | makeDataSource(File file) |
| protected RatingWriter | makeWriter(File file) |
| List<TTDataSet> | perform()<br>Run the crossfold command. |
| CrossfoldTask | setCache(boolean on)<br>Configure whether the data sets created by the crossfold will have caching turned on. |
| CrossfoldTask | setForce(boolean force)<br>Set the force-running option of the command. |
| CrossfoldTask | setHoldout(int n)<br>Set holdout to a fixed number of items per user. |
| CrossfoldTask | setHoldoutFraction(double f)<br>Set holdout to a fraction of each user's profile. |
| CrossfoldTask | setIsolate(boolean on)<br>Configure whether the train-test data sets generated by this task will be isolated. |
| CrossfoldTask | setMethod(CrossfoldMethod m)<br>Set the crossfold method. |
| CrossfoldTask | setOrder(Order<Rating> o)<br>Set the order for the train-test splitting. |
| CrossfoldTask | setPartitions(int partition)<br>Set the number of partitions to generate. |
| CrossfoldTask | setRetain(int n)<br>Set the holdout by retaining a fixed number of items per user for training. |
| CrossfoldTask | setSampleSize(int n)<br>Set the sample size (number of users sampled per partition). |
| CrossfoldTask | setSource(DataSource source)<br>Set the input data source. |
| void | setSplitUsers(boolean splitUsers)<br>Deprecated. Use setMethod(CrossfoldMethod) instead. |
| CrossfoldTask | setTest(String pat)<br>Set the pattern for the test set files. |
| CrossfoldTask | setTrain(String pat)<br>Set the pattern for the training set files. |
| CrossfoldTask | setWriteTimestamps(boolean pack)<br>Configure whether to include timestamps in the output file. |
| protected Long2IntMap | splitUsers(UserDAO dao)<br>Split user IDs into n splits, where n is the partition count. |
| String | toString() |
| protected void | writeRating(TableWriter writer, Rating rating)<br>Write a rating event to the file using a table writer. |
| protected void | writeTTFilesByRatings(RatingWriter[] trainWriters, RatingWriter[] testWriters)<br>Write the split files by ratings from the DAO. |
| protected void | writeTTFilesByUsers(RatingWriter[] trainWriters, RatingWriter[] testWriters)<br>Write the split files by users from the DAO, using the specified holdout method. |

| Inherited methods |
|---|
| execute, getProject, setName, setProject |
| addListener, cancel, get, get, interruptTask, isCancelled, isDone, set, setException, wasInterrupted |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| addListener |

public CrossfoldTask()

public CrossfoldTask(String n)

public CrossfoldTask setPartitions(int partition)
Parameters: partition - The number of partitions.

public int getSampleSize()

public CrossfoldTask setSampleSize(int n)
Related crossfold method: CrossfoldMethod.SAMPLE_USERS.
Parameters: n - The number of users to sample for each partition.

public CrossfoldTask setTrain(String pat)
Parameters: pat - The training file name pattern.
See also: String.format(String, Object...)

public CrossfoldTask setTest(String pat)
Parameters: pat - The test file name pattern.
See also: setTrain(String)

public CrossfoldTask setOrder(Order<Rating> o)
Parameters: o - The sort order.
See also: RandomOrder, TimestampOrder, setHoldoutFraction(double), setHoldout(int)

public CrossfoldTask setHoldout(int n)
Related crossfold method: CrossfoldMethod.PARTITION_USERS.
Parameters: n - The number of items to hold out from each user's profile.

public CrossfoldTask setRetain(int n)
Related crossfold method: CrossfoldMethod.PARTITION_USERS.
Parameters: n - The number of items to keep in the training data set from each user's profile.

public CrossfoldTask setHoldoutFraction(double f)
Related crossfold method: CrossfoldMethod.PARTITION_USERS.
Parameters: f - The fraction of a user's ratings to hold out.

public CrossfoldTask setSource(DataSource source)
Parameters: source - The data source to use.

public CrossfoldTask setForce(boolean force)
Parameters: force - The force-running option.

@Deprecated public void setSplitUsers(boolean splitUsers)
Deprecated. Use setMethod(CrossfoldMethod) instead.
Parameters: splitUsers - true to split by users (CrossfoldMethod.PARTITION_USERS), false to split by rating (CrossfoldMethod.PARTITION_RATINGS).

public CrossfoldMethod getMethod()
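
The three holdout setters above (setHoldout(int), setRetain(int), setHoldoutFraction(double)) describe different ways to cut one user's profile into a training part and a held-out part. The sketch below illustrates that idea in plain Java and is not the library's implementation. It assumes the profile has already been ordered (for example by the order given to setOrder) and that held-out items are taken from the end of that order; both points are assumptions, and `HoldoutSketch`, `Split`, and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HoldoutSketch {
    /** One user's profile split into training items and held-out test items. */
    public static final class Split {
        public final List<Long> train;
        public final List<Long> test;

        Split(List<Long> train, List<Long> test) {
            this.train = train;
            this.test = test;
        }
    }

    // Assumption: hold out the LAST n items of the ordered profile.
    public static Split holdoutCount(List<Long> ordered, int n) {
        return cutAt(ordered, ordered.size() - n);
    }

    // Assumption: keep the FIRST n items for training, hold out the rest.
    public static Split retainCount(List<Long> ordered, int n) {
        return cutAt(ordered, n);
    }

    // Hold out a fraction f of the profile, rounded to the nearest item count.
    public static Split holdoutFraction(List<Long> ordered, double f) {
        int held = (int) Math.round(ordered.size() * f);
        return holdoutCount(ordered, held);
    }

    // Split at index 'cut', clamped to the valid range [0, size].
    private static Split cutAt(List<Long> ordered, int cut) {
        int c = Math.max(0, Math.min(ordered.size(), cut));
        return new Split(new ArrayList<>(ordered.subList(0, c)),
                         new ArrayList<>(ordered.subList(c, ordered.size())));
    }

    public static void main(String[] args) {
        List<Long> profile = Arrays.asList(10L, 11L, 12L, 13L, 14L);
        Split s = holdoutCount(profile, 2);
        System.out.println(s.train + " / " + s.test); // prints "[10, 11, 12] / [13, 14]"
    }
}
```

With a five-item profile, holding out 2 items and retaining 3 items give the same split under these assumptions; the two setters differ once profile sizes vary across users.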

public CrossfoldTask setMethod(CrossfoldMethod m)
Related crossfold method: CrossfoldMethod.PARTITION_USERS.
Parameters: m - The crossfold method to use.

public CrossfoldTask setCache(boolean on)
Parameters: on - Whether the returned data sets should cache.

public CrossfoldTask setIsolate(boolean on)
Parameters: on - true to produce isolated data sets.

public boolean getIsolate()
Returns: true if this task will produce isolated data sets.

public CrossfoldTask setWriteTimestamps(boolean pack)
Parameters: pack - true to include timestamps (the default), false otherwise.

public boolean getWriteTimestamps()
Returns: true if output will include timestamps.

public String getName()
getName in class AbstractTask<List<TTDataSet>>

public String getTrainPattern()

public String getTestPattern()
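
The training and test patterns are documented with a pointer to String.format(String, Object...), which suggests that each pattern is expanded into one file name per partition. The sketch below shows that expansion in plain Java. The pattern `train.%d.csv`, the zero-based partition index, and the `PatternSketch` class are all hypothetical; the actual argument passed to the pattern and its starting index are not stated in this reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PatternSketch {
    // Expand a String.format pattern once per partition.
    // Assumption: the pattern takes a single integer argument, the partition index.
    public static List<String> expand(String pattern, int partitions) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            names.add(String.format(Locale.ROOT, pattern, i));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical pattern, three partitions.
        System.out.println(expand("train.%d.csv", 3));
        // prints "[train.0.csv, train.1.csv, train.2.csv]"
    }
}
```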

public DataSource getSource()

public int getPartitionCount()

public Holdout getHoldout()

public boolean getForce()

public List<TTDataSet> perform() throws TaskExecutionException
perform in class AbstractTask<List<TTDataSet>>
Throws: TaskExecutionException

protected File[] getFiles(String pattern)
Parameters: pattern - The file name pattern.

protected void createTTFiles() throws IOException
Throws: IOException - if there is an error writing the files.

protected void writeTTFilesByUsers(RatingWriter[] trainWriters, RatingWriter[] testWriters) throws TaskExecutionException
Parameters: trainWriters - The table writers that write the train files.
testWriters - The table writers that write the test files.
Throws: TaskExecutionException

protected void writeTTFilesByRatings(RatingWriter[] trainWriters, RatingWriter[] testWriters) throws TaskExecutionException
Parameters: trainWriters - The table writers that write the train files.
testWriters - The table writers that write the test files.
Throws: TaskExecutionException

protected void writeRating(TableWriter writer, Rating rating) throws IOException
Parameters: writer - The table writer to output the rating to.
rating - The rating event to output.
Throws: IOException - if the writer encounters an I/O error.

protected Long2IntMap splitUsers(UserDAO dao)
Parameters: dao - The DAO of the source file.

public List<TTDataSet> getTTFiles()

protected RatingWriter makeWriter(File file) throws IOException
Throws: IOException

protected DataSource makeDataSource(File file)
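
splitUsers(UserDAO) is described as splitting user IDs into n splits, where n is the partition count, and it returns a map from user ID to an int. One plausible way to build such a map is sketched below: shuffle the IDs, then assign them round-robin. This is an illustration only. The shuffle-then-round-robin strategy, the `seed` parameter, and the use of `Map<Long, Integer>` in place of Long2IntMap are assumptions made to keep the example dependency-free, and `SplitUsersSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class SplitUsersSketch {
    // Map each user ID to a partition index in [0, partitionCount).
    // Assumption: random shuffle followed by round-robin assignment.
    public static Map<Long, Integer> splitUsers(List<Long> userIds, int partitionCount, long seed) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        List<Long> shuffled = new ArrayList<>(userIds);
        Collections.shuffle(shuffled, new Random(seed));

        Map<Long, Integer> assignment = new HashMap<>();
        for (int i = 0; i < shuffled.size(); i++) {
            assignment.put(shuffled.get(i), i % partitionCount);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<Long> users = new ArrayList<>();
        for (long u = 1; u <= 10; u++) {
            users.add(u);
        }
        // Ten users over three partitions: sizes differ by at most one.
        System.out.println(splitUsers(users, 3, 42L));
    }
}
```

Round-robin assignment after shuffling keeps partition sizes within one user of each other, which is a common property to want from a crossfold split.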