public class MutualInformationVectorSimilarity extends Object implements VectorSimilarity, Serializable
Similarity function that assumes the two vectors are paired samples from two correlated random variables. Under this assumption, it estimates the mutual information between the two variables.
Note, this uses the naive estimator of mutual information, which can be heavily biased when the two vectors have little overlap.
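As a rough illustration of the naive (plug-in) estimator mentioned above, the sketch below quantizes the values on the keys two vectors share and sums p(x,y) * log(p(x,y) / (p(x) * p(y))) over the empirical frequencies. It is not the library's implementation: the class `NaiveMutualInformation`, the equal-width `quantize` helper, the use of `Map<Long, Double>` in place of `SparseVector`, and the choice of natural logarithms (nats) are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper class; not part of the documented API.
final class NaiveMutualInformation {
    // Assumed equal-width quantizer: maps a value in [min, max] to a bin in [0, numBins).
    static int quantize(double value, double min, double max, int numBins) {
        if (value <= min) return 0;
        if (value >= max) return numBins - 1;
        return (int) ((value - min) / (max - min) * numBins);
    }

    // Plug-in estimate of I(X;Y) in nats from paired, already-quantized samples.
    static double mutualInformation(int[] xs, int[] ys, int numBins) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("samples must be paired");
        }
        int n = xs.length;
        if (n == 0) return 0.0; // no shared observations: report 0
        int[][] joint = new int[numBins][numBins];
        int[] countX = new int[numBins];
        int[] countY = new int[numBins];
        for (int i = 0; i < n; i++) {
            joint[xs[i]][ys[i]]++;
            countX[xs[i]]++;
            countY[ys[i]]++;
        }
        double mi = 0.0;
        for (int a = 0; a < numBins; a++) {
            for (int b = 0; b < numBins; b++) {
                if (joint[a][b] == 0) continue; // 0 * log(0) is treated as 0
                double pxy = (double) joint[a][b] / n;
                // p(x,y) / (p(x) p(y)) written in terms of counts
                double ratio = (double) joint[a][b] * n / ((double) countX[a] * countY[b]);
                mi += pxy * Math.log(ratio);
            }
        }
        return Math.max(0.0, mi); // guard against tiny negative rounding error
    }

    // Quantize the values on keys present in both maps, then estimate MI.
    static double similarity(Map<Long, Double> v1, Map<Long, Double> v2,
                             double min, double max, int numBins) {
        int[] xs = new int[Math.min(v1.size(), v2.size())];
        int[] ys = new int[xs.length];
        int n = 0;
        for (Map.Entry<Long, Double> e : v1.entrySet()) {
            Double other = v2.get(e.getKey());
            if (other != null) {
                xs[n] = quantize(e.getValue(), min, max, numBins);
                ys[n] = quantize(other, min, max, numBins);
                n++;
            }
        }
        return mutualInformation(Arrays.copyOf(xs, n), Arrays.copyOf(ys, n), numBins);
    }
}
```

With only two shared keys whose quantized values happen to differ, for example `mutualInformation(new int[]{0, 1}, new int[]{1, 0}, 2)`, this sketch already reports log 2 nats, the maximum for two bins. That is the kind of bias the note above warns about when vectors overlap very little.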
| Constructor and Description |
|---|
| MutualInformationVectorSimilarity(Quantizer quantizer) - Construct a new mutual information similarity. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | isSparse() - Query whether this similarity function is sparse (returns 0 for vectors with disjoint key sets). |
| boolean | isSymmetric() - Query whether this similarity function is symmetric. |
| double | similarity(SparseVector vec1, SparseVector vec2) - Compute similarity using mutual information. |
@Inject public MutualInformationVectorSimilarity(Quantizer quantizer)
Construct a new mutual information similarity.
Parameters:
quantizer - A quantizer to allow discrete mutual information to be computed.
public double similarity(SparseVector vec1, SparseVector vec2)
Compute similarity using mutual information.
Note that this similarity function measures the absolute correlation between two vectors. It therefore ranges over [0, inf), not [-1, 1] as specified by the VectorSimilarity interface. Before using this similarity function, make sure the code that consumes its results accepts values in this range.
Specified by:
similarity in interface VectorSimilarity
Parameters:
vec1 - The first vector.
vec2 - The second vector.
See Also:
VectorSimilarity.similarity(SparseVector, SparseVector)
public boolean isSparse()
Description copied from interface: VectorSimilarity
Query whether this similarity function is sparse (returns 0 for vectors with disjoint key sets).
Specified by:
isSparse in interface VectorSimilarity
Returns:
true iff VectorSimilarity.similarity(SparseVector, SparseVector) will always return 0 when applied to two vectors with no keys in common.
public boolean isSymmetric()
Description copied from interface: VectorSimilarity
Query whether this similarity function is symmetric. Symmetric similarity functions return the same result when called on (A,B) and (B,A).
Specified by:
isSymmetric in interface VectorSimilarity
Returns:
true if the function is symmetric.