nachos.similarity_functions package

Submodules

nachos.similarity_functions.SimilarityFunctions module

class nachos.similarity_functions.SimilarityFunctions.SimilarityFunctions(fns, weights)[source]

Bases: object

classmethod build(conf)[source]
__init__(fns, weights)[source]
__call__(u, v, n=None)[source]
Summary:

This method is overloaded to handle several kinds of input. It can compute the similarity between two data points or between a data point and a dataset, and either comparison can be restricted to a single factor, n.

Parameters
  • u (Dataset) – A data point (defined by the Dataset class)

  • v (Dataset) – A data point or a dataset to compare against u

  • n (Optional[int]) – The index of the factor with respect to which similarity is computed. If None, the similarity is summed over all factors

Returns

The similarity score

Return type

float
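The dispatch described above can be illustrated with a minimal sketch. This is not the nachos implementation: the class name WeightedSimilarity, the representation of a data point as a tuple with one value per factor, the use of a list to stand for a dataset, the weighting of the single-factor score, and the choice to sum over the members of a set are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Sequence


class WeightedSimilarity:
    """Hypothetical stand-in for SimilarityFunctions (not the nachos source)."""

    def __init__(self, fns: Sequence[Callable[[Any, Any], float]],
                 weights: Sequence[float]) -> None:
        if len(fns) != len(weights):
            raise ValueError("expected one weight per similarity function")
        self.fns = list(fns)
        self.weights = list(weights)

    def score(self, u: Sequence[Any], v: Sequence[Any],
              n: Optional[int] = None) -> float:
        """Point-to-point similarity; u and v hold one value per factor."""
        if n is not None:
            # Restrict the comparison to factor n only.
            return self.weights[n] * float(self.fns[n](u[n], v[n]))
        # n is None: sum the weighted similarity over all factors.
        return sum(
            w * float(fn(a, b))
            for fn, w, a, b in zip(self.fns, self.weights, u, v)
        )

    def score_set(self, u, vs, n: Optional[int] = None) -> float:
        """Point-to-set similarity (assumed here to be a sum over members)."""
        return sum(self.score(u, v, n) for v in vs)

    def __call__(self, u, v, n: Optional[int] = None) -> float:
        # Assumption: a list stands for a dataset, a tuple for a single point.
        if isinstance(v, list):
            return self.score_set(u, v, n)
        return self.score(u, v, n)


# Two illustrative factors: a categorical label and a set of tags.
same = lambda a, b: a == b
overlap = lambda a, b: len(set(a) & set(b))
sim = WeightedSimilarity([same, overlap], [1.0, 0.5])

u = ("spk1", {"a", "b"})
v = ("spk1", {"b", "c"})
print(sim(u, v))       # similarity over all factors
print(sim(u, v, n=1))  # similarity on factor 1 only
print(sim(u, [v, v]))  # similarity between a point and a dataset
```

Keeping score and score_set as separate methods mirrors the listing here, where __call__ is the single entry point and the two methods handle the point and dataset cases.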

score(u, v, n)[source]
Return type

float

score_set(u, v, n=None)[source]
Return type

float

nachos.similarity_functions.abstract_similarity module

class nachos.similarity_functions.abstract_similarity.AbstractSimilarity[source]

Bases: abc.ABC

abstract classmethod build(conf)[source]
__init__()[source]
abstract __call__(f, g)[source]

Call self as a function.

Return type

float

nachos.similarity_functions.boolean module

class nachos.similarity_functions.boolean.Boolean[source]

Bases: nachos.similarity_functions.abstract_similarity.AbstractSimilarity

Summary:

This class defines the boolean similarity between points. It assumes the points are categorical. It is overloaded to accept sets of inputs, in which case the similarity is True if any element of one set equals any element of the other set, and False otherwise.

classmethod build(conf)[source]
__call__(f, g)[source]
Summary:

Computes the similarity between f and g. The similarity is a binary value. f and g can be any objects, though the intention is for them to be categorical values that can be compared for equality.

Parameters
  • f (Any) – a value (categorical) to be compared

  • g (Any) – a value (categorical) to be compared

Returns

the boolean similarity between f and g

Return type

bool
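As a rough illustration of the behavior described for this class, the following sketch compares single categorical values or collections of them. The function name boolean_similarity and the helper _as_set are hypothetical, and treating strings as single values rather than as collections of characters is an assumption.

```python
def _as_set(x):
    """Wrap a single value in a set; convert common collections to a set."""
    # Strings are deliberately treated as single categorical values.
    if isinstance(x, (set, frozenset, list, tuple)):
        return set(x)
    return {x}


def boolean_similarity(f, g) -> bool:
    """True if f and g share at least one equal element, else False."""
    return not _as_set(f).isdisjoint(_as_set(g))


print(boolean_similarity("male", "male"))
print(boolean_similarity("male", "female"))
print(boolean_similarity({"en", "fr"}, {"fr", "de"}))
```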

nachos.similarity_functions.cosine module

class nachos.similarity_functions.cosine.Cosine(t)[source]

Bases: nachos.similarity_functions.abstract_similarity.AbstractSimilarity

Summary:

Defines the (thresholded) cosine similarity between two points. Each point is expected to be an ndarray. The cosine similarity is computed using the sklearn pairwise metrics package. If all pairwise similarities are desired, the ndarray can have shape N x d, where N is the number of data points and d is the dimension of each point.

Using N > 1 is useful when defining similarities on sets, which this similarity function supports directly. It returns the largest pairwise similarity between any elements of the sets being compared.

classmethod build(conf)[source]
__init__(t)[source]
__call__(f, g)[source]
Summary:

Computes the thresholded cosine similarity between inputs f, g. f, g are assumed to be real valued vectors, generally representing embeddings which have been whitened.

Parameters
  • f (set) – an ndarray representing a set of vectors to compare

  • g (set) – an ndarray representing a set of vectors to compare

Returns

The similarity score

Return type

float
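The set-valued behavior described for this class (largest pairwise cosine similarity, then a threshold) can be sketched as follows. The sketch uses plain Python lists instead of ndarrays and does not call sklearn, so that it has no dependencies. The function names are hypothetical, and because this page does not say how the threshold t is applied, the sketch assumes that scores below t are mapped to 0.0.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def max_cosine_similarity(f: Sequence[Sequence[float]],
                          g: Sequence[Sequence[float]],
                          t: float = 0.0) -> float:
    """Largest pairwise cosine similarity between the rows of f and g."""
    best = max(cosine(a, b) for a in f for b in g)
    # Assumption: similarities below the threshold t count as no similarity.
    return best if best >= t else 0.0


f = [[1.0, 0.0], [0.0, 1.0]]  # a set of two 2-dimensional vectors
g = [[1.0, 1.0]]              # a set containing one vector
print(max_cosine_similarity(f, g, t=0.5))
print(max_cosine_similarity(f, g, t=0.9))
```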

nachos.similarity_functions.gaussian module

class nachos.similarity_functions.gaussian.Gaussian(t)[source]

Bases: nachos.similarity_functions.abstract_similarity.AbstractSimilarity

classmethod build(conf)[source]
__init__(t)[source]
__call__(f, g)[source]
Summary:

Computes the thresholded similarity score between inputs f, g. f, g are assumed to be real valued scalars, and the similarity is the Gaussian similarity between the values assuming unit variance.

Parameters
  • f (float) – a float representing a real value to compare

  • g (float) – a float representing a real value to compare

Returns

The similarity score

Return type

float
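A Gaussian similarity with unit variance is commonly written as exp(-(f - g)^2 / 2), and the sketch below uses that form. Both the exact formula and the way the threshold t is applied are assumptions, since neither is spelled out here. The function name gaussian_similarity is hypothetical.

```python
import math


def gaussian_similarity(f: float, g: float, t: float = 0.0) -> float:
    """Unit-variance Gaussian similarity between two scalars."""
    sim = math.exp(-0.5 * (f - g) ** 2)
    # Assumption: similarities below the threshold t count as no similarity.
    return sim if sim >= t else 0.0


print(gaussian_similarity(1.0, 1.0))         # identical values
print(gaussian_similarity(0.0, 1.0))         # one unit apart
print(gaussian_similarity(0.0, 3.0, t=0.1))  # far apart, below threshold
```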

nachos.similarity_functions.set_intersection module

class nachos.similarity_functions.set_intersection.SetIntersection[source]

Bases: nachos.similarity_functions.abstract_similarity.AbstractSimilarity

classmethod build(conf)[source]
__call__(f, g)[source]
Summary:

Computes the similarity between inputs f and g. f and g are assumed to be multi-valued objects, i.e., they represent sets of values. The similarity is the size of the intersection of their elements.

Parameters
  • f (Union[Any, Iterable], i.e., a set or something that can be converted to a set) – a value to compare

  • g (Union[Any, Iterable], i.e., a set or something that can be converted to a set) – a value to compare

Returns

The similarity score

Return type

float
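The intersection-size similarity described above can be sketched in a few lines. The names set_intersection_similarity and _to_set are hypothetical, and the rule for converting inputs to sets (strings and other scalars become one-element sets) is an assumption.

```python
def _to_set(x):
    """Convert common collections to a set; wrap anything else as one element."""
    if isinstance(x, (set, frozenset, list, tuple)):
        return set(x)
    return {x}


def set_intersection_similarity(f, g) -> float:
    """Number of elements shared by f and g, returned as a float."""
    return float(len(_to_set(f) & _to_set(g)))


print(set_intersection_similarity({"a", "b", "c"}, ["b", "c", "d"]))
print(set_intersection_similarity("a", {"a"}))
```

Unlike the boolean similarity, this score grows with the amount of overlap, so two points that share many values count as more similar than two points that share only one.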

Module contents

nachos.similarity_functions.register(name)[source]
nachos.similarity_functions.build_similarity_functions(conf)[source]