class pclTextBox::Distance
sys::Obj pclTextBox::Distance
A collection of algorithms for measuring the distance between two strings.
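The edit-based measures (hamming, levenshtein, optimalStringAlignment) count the changes needed to make the sequence of characters in one string match that in the other; they differ in which edit operations they allow. The similarity measures (jaccardSimilarity, sorensonDiceSimilarity and their N variants) instead compare the n-grams of characters that the two strings share.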
- hamming
static Int hamming(Str word1, Str word2)
Returns the number of positions at which the corresponding characters of the two words differ. The two words must be the same size.
An Err is raised if the two words are not the same size.
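The definition above can be sketched in a few lines. This is an illustrative Python sketch of the measure, not the library's Fantom implementation; the function name and the use of `ValueError` in place of Fantom's `Err` are assumptions made for the example.

```python
def hamming(word1: str, word2: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(word1) != len(word2):
        # Stand-in for the Err described above (assumption: exact error type differs).
        raise ValueError("words must be the same size")
    # Compare characters pairwise and count the mismatches.
    return sum(c1 != c2 for c1, c2 in zip(word1, word2))


print(hamming("karolin", "kathrin"))  # 3
```

The three mismatches are at the third, fourth and fifth characters ("r/t", "o/h", "l/r").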
- jaccardSimilarity
static Float jaccardSimilarity(Str word1, Str word2)
Returns the Jaccard similarity measure based on 2-grams (pairs of adjacent characters) in the two words.
- jaccardSimilarityN
static Float jaccardSimilarityN(Str word1, Str word2, Int n)
Returns the Jaccard similarity measure based on n-grams (runs of n adjacent characters) in the two words.
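As a rough illustration, Jaccard similarity is the size of the intersection of the two n-gram collections divided by the size of their union. The Python sketch below is not the library's code. It assumes the n-grams are treated as a set (duplicates ignored) and that two words with no n-grams at all score 1.0; the library may handle both cases differently. The helper name `ngrams` is hypothetical.

```python
def ngrams(word: str, n: int) -> set:
    """Return the set of all runs of n adjacent characters in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def jaccard_similarity_n(word1: str, word2: str, n: int) -> float:
    """Jaccard similarity: |A & B| / |A | B| over the two n-gram sets."""
    a, b = ngrams(word1, n), ngrams(word2, n)
    if not a and not b:
        return 1.0  # assumption: two words without n-grams count as identical
    return len(a & b) / len(a | b)


# "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}:
# one shared bigram out of seven distinct bigrams, so the result is 1/7.
print(jaccard_similarity_n("night", "nacht", 2))
```

Calling the sketch with `n = 2` corresponds to the bigram case described for jaccardSimilarity.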
- levenshtein
static Int levenshtein(Str word1, Str word2)
Returns the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
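One common way to compute this value is the Wagner-Fischer dynamic programme, sketched below in Python. It illustrates the measure only; whether the library uses this particular algorithm is not stated here, and the function name is chosen for the example.

```python
def levenshtein(word1: str, word2: str) -> int:
    """Minimum number of insertions, deletions or substitutions."""
    # previous[j] holds the distance between the processed prefix of word1
    # and the first j characters of word2.
    previous = list(range(len(word2) + 1))
    for i, c1 in enumerate(word1, start=1):
        current = [i]  # distance from word1[:i] to the empty string
        for j, c2 in enumerate(word2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute c1 with c2 (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of the second word rather than with the product of both lengths.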
- optimalStringAlignment
static Int optimalStringAlignment(Str word1, Str word2)
Returns the minimum number of single-character edits (insertions, deletions, substitutions, or transpositions of two adjacent characters) required to change one word into the other.
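The standard optimal string alignment recurrence extends the Levenshtein table with one extra case for swapping two adjacent characters. The Python sketch below follows that textbook recurrence and is an illustration, not the library's implementation; the function name is chosen for the example.

```python
def optimal_string_alignment(word1: str, word2: str) -> int:
    """Edit distance allowing adjacent transpositions (restricted form)."""
    rows, cols = len(word1) + 1, len(word2) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of word1[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of word2[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if word1[i - 1] == word2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition: the last two characters are swapped.
            if (i > 1 and j > 1
                    and word1[i - 1] == word2[j - 2]
                    and word1[i - 2] == word2[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(optimal_string_alignment("ab", "ba"))   # 1 (one transposition)
print(optimal_string_alignment("ca", "abc"))  # 3
```

The second call shows the restriction that distinguishes this measure from the unrestricted Damerau-Levenshtein distance: a transposed pair cannot be edited again, so "ca" to "abc" costs 3 rather than 2.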
- sorensonDiceSimilarity
static Float sorensonDiceSimilarity(Str word1, Str word2)
Returns the Sørensen-Dice index based on 2-grams (pairs of adjacent characters) in the two words.
- sorensonDiceSimilarityN
static Float sorensonDiceSimilarityN(Str word1, Str word2, Int n)
Returns the Sørensen-Dice index based on n-grams (runs of n adjacent characters) in the two words.
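For illustration, the Sørensen-Dice index is twice the number of shared n-grams divided by the total number of n-grams in both words. The Python sketch below is not the library's code. It assumes set semantics for the n-grams (duplicates ignored) and a score of 1.0 when neither word has any n-grams; the library may differ on both points. The helper name `ngrams` is hypothetical.

```python
def ngrams(word: str, n: int) -> set:
    """Return the set of all runs of n adjacent characters in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def sorensen_dice_n(word1: str, word2: str, n: int) -> float:
    """Sørensen-Dice index: 2 * |A & B| / (|A| + |B|) over n-gram sets."""
    a, b = ngrams(word1, n), ngrams(word2, n)
    if not a and not b:
        return 1.0  # assumption: two words without n-grams count as identical
    return 2 * len(a & b) / (len(a) + len(b))


# "night" and "nacht" each have four bigrams and share only "ht":
# 2 * 1 / (4 + 4) = 0.25
print(sorensen_dice_n("night", "nacht", 2))  # 0.25
```

For the same pair of words this index is never lower than the Jaccard similarity, because shared n-grams are counted twice in the numerator.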