This is the darts.lib.sequence-metrics Reference Manual, version 0.1, generated automatically by Declt version 4.0 beta 2 "William Riker" on Mon Feb 26 16:11:57 2024 GMT+0.
darts.lib.sequence-metrics/darts.lib.sequence-metrics.asd
darts.lib.sequence-metrics/src/package.lisp
darts.lib.sequence-metrics/src/types.lisp
darts.lib.sequence-metrics/src/levenshtein.lisp
darts.lib.sequence-metrics/src/hamming.lisp
darts.lib.sequence-metrics/src/lcs.lisp
darts.lib.sequence-metrics/src/jaro-winkler.lisp
darts.lib.sequence-metrics/src/ngrams.lisp
The main system appears first, followed by any subsystem dependency.
darts.lib.sequence-metrics
Provides various distance metrics on sequences
Author and maintainer: Dirk Esser
License: MIT
Version: 0.1
src
(module).
Modules are listed depth-first from the system components tree.
darts.lib.sequence-metrics/src
darts.lib.sequence-metrics
(system).
package.lisp
(file).
types.lisp
(file).
levenshtein.lisp
(file).
hamming.lisp
(file).
lcs.lisp
(file).
jaro-winkler.lisp
(file).
ngrams.lisp
(file).
Files are sorted by type and then listed depth-first from the systems components trees.
darts.lib.sequence-metrics/darts.lib.sequence-metrics.asd
darts.lib.sequence-metrics
(system).
darts.lib.sequence-metrics/src/types.lisp
package.lisp
(file).
src
(module).
array-index
(type).
sequence-function
(type).
string-function
(type).
darts.lib.sequence-metrics/src/levenshtein.lisp
types.lisp
(file).
src
(module).
levenshtein-distance
(function).
string-levenshtein-distance
(function).
darts.lib.sequence-metrics/src/hamming.lisp
types.lisp
(file).
src
(module).
hamming-distance
(function).
string-hamming-distance
(function).
darts.lib.sequence-metrics/src/lcs.lisp
types.lisp
(file).
src
(module).
longest-common-subsequence*-length
(function).
longest-common-subsequence-length
(function).
longest-common-subsequences*
(function).
longest-common-substring-length
(function).
longest-common-substrings
(function).
darts.lib.sequence-metrics/src/jaro-winkler.lisp
types.lisp
(file).
src
(module).
jaro-distance
(function).
jaro-winkler-distance
(function).
string-jaro-distance
(function).
string-jaro-winkler-distance
(function).
jaro-winkler-min-prefix-length
(constant).
jaro-winkler-prefix-adjustment-scale
(constant).
darts.lib.sequence-metrics/src/ngrams.lisp
types.lisp
(file).
src
(module).
do-ngrams
(macro).
list-ngrams
(function).
map-ngrams
(function).
Packages are listed by definition order.
darts.lib.sequence-metrics
This package exports various forms of metric functions
on sequences. Among the ones provided are:
- Levenshtein distance
- Jaro and Jaro/Winkler distance
- Hamming distance
Most distances are provided in a very general form, working on arbitrary
sequences. However, since these distance functions are usually
applied to strings, optimized string versions are provided for some
frequently used metrics.
This package also exports a few other utility functions which, strictly speaking, do not really belong here, such as the n-gram helpers. They live in this package for historical reasons: they have been here since the dawn of time.
Use list: common-lisp.
do-ngrams
(macro).
hamming-distance
(function).
jaro-distance
(function).
jaro-winkler-distance
(function).
levenshtein-distance
(function).
list-ngrams
(function).
longest-common-subsequence*-length
(function).
longest-common-subsequence-length
(function).
longest-common-subsequences*
(function).
longest-common-substring-length
(function).
longest-common-substrings
(function).
map-ngrams
(function).
string-hamming-distance
(function).
string-jaro-distance
(function).
string-jaro-winkler-distance
(function).
string-levenshtein-distance
(function).
array-index
(type).
jaro-winkler-min-prefix-length
(constant).
jaro-winkler-prefix-adjustment-scale
(constant).
sequence-function
(type).
string-function
(type).
Definitions are sorted by export status, category, package, and then by lexicographic order.
jaro-distance SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE
levenshtein-distance S1 S2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => NUMBER
Computes the Levenshtein distance between sequences S1 and S2. The result
value NUMBER is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where the allowed operations are:
- insert a single element at some position
- delete a single element at some position
- substitute a single element at some position by another one
The Levenshtein distance measures how dissimilar two sequences are. If
two sequences have a distance of 0, they are equal under the test in use.
The TEST function is used to compare sequence elements. The default test
function is eql.
This function is a generalization of string-levenshtein-distance
to arbitrary sequence types. Use string-levenshtein-distance if you
need a version optimized for use with strings.
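The three edit operations described above can be illustrated with a small dynamic-programming sketch. This is not the library's implementation: the name sketch-levenshtein is hypothetical, the bounding keywords and TEST-NOT/KEY are omitted, and only TEST is supported.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(defun sketch-levenshtein (s1 s2 &key (test #'eql))
  "Return the edit distance between the sequences S1 and S2."
  (let* ((a (coerce s1 'simple-vector))
         (b (coerce s2 'simple-vector))
         (n (length b))
         ;; PREV is the table row for the previous prefix of A,
         ;; CURR the row currently being filled in.
         (prev (make-array (1+ n)))
         (curr (make-array (1+ n))))
    ;; Turning the empty prefix of A into the first j elements of B
    ;; takes j insertions.
    (dotimes (j (1+ n))
      (setf (aref prev j) j))
    (dotimes (i (length a))
      ;; Turning the first i+1 elements of A into the empty sequence
      ;; takes i+1 deletions.
      (setf (aref curr 0) (1+ i))
      (dotimes (j n)
        (setf (aref curr (1+ j))
              (min (1+ (aref prev (1+ j)))   ; delete a[i]
                   (1+ (aref curr j))        ; insert b[j]
                   (+ (aref prev j)          ; substitute, or keep if equal
                      (if (funcall test (aref a i) (aref b j)) 0 1)))))
      ;; The row just filled becomes the previous row.
      (rotatef prev curr))
    (aref prev n)))

(sketch-levenshtein "kitten" "sitting")   ; => 3
(sketch-levenshtein '(1 2 3) #(1 3))      ; => 1
```

With the library loaded, the corresponding call would presumably be (levenshtein-distance "kitten" "sitting"), following the signature documented above.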
longest-common-subsequence*-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LENGTH
Returns the length of the (or: a) longest common contiguous subsequence
of sequences SEQ1 and SEQ2. This problem is usually called the longest
common 'substring' problem, but in order to avoid confusion, we use the
term subsequence* here to refer to contiguous subsequences.
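As an illustration of what "longest common contiguous subsequence" means, the following sketch computes the length with a rolling table row. The name sketch-longest-common-run-length is hypothetical, only the TEST keyword is modelled, and nothing here is taken from the library's source.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(defun sketch-longest-common-run-length (seq1 seq2 &key (test #'eql))
  "Return the length of a longest contiguous run shared by SEQ1 and SEQ2."
  (let* ((a (coerce seq1 'simple-vector))
         (b (coerce seq2 'simple-vector))
         (n (length b))
         ;; Entry j+1 of a row is the length of the common run that ends
         ;; at the current element of A and at b[j]. Entry 0 stays 0.
         (prev (make-array (1+ n) :initial-element 0))
         (curr (make-array (1+ n) :initial-element 0))
         (best 0))
    (dotimes (i (length a) best)
      (dotimes (j n)
        (setf (aref curr (1+ j))
              (if (funcall test (aref a i) (aref b j))
                  (1+ (aref prev j))   ; extend the run ending one step back
                  0))                  ; elements differ: the run is broken
        (setf best (max best (aref curr (1+ j)))))
      (rotatef prev curr))))

(sketch-longest-common-run-length "xabcy" "zabcw")   ; => 3
```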
longest-common-subsequences* SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LIST
Returns a list containing all contiguous longest common subsequences
of SEQ1 and SEQ2. This problem is usually called the longest common
'substring' problem, but in order to avoid confusion, we use the
term subsequence* here to refer to contiguous subsequences.
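A deliberately simple (cubic-time) sketch of collecting all longest contiguous runs is shown below, using only the standard function mismatch. The name sketch-longest-common-runs is hypothetical. Whether the library removes duplicate runs, and in which order it returns them, is not stated above; this sketch assumes duplicates are removed and runs are listed in order of first appearance in SEQ1.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(defun sketch-longest-common-runs (seq1 seq2 &key (test #'eql))
  "Return a list of all longest contiguous runs shared by SEQ1 and SEQ2."
  (let ((best 0)
        (runs '()))
    (dotimes (i (length seq1))
      (dotimes (j (length seq2))
        ;; MISMATCH returns the index in SEQ1 where the two suffixes stop
        ;; agreeing, or NIL if they agree up to their common end.
        (let* ((stop (or (mismatch seq1 seq2 :start1 i :start2 j :test test)
                         (length seq1)))
               (len (- stop i)))
          (cond ((zerop len))                       ; no common run here
                ((> len best)                       ; strictly longer: restart
                 (setf best len
                       runs (list (subseq seq1 i stop))))
                ((= len best)                       ; tie: keep it if new
                 (pushnew (subseq seq1 i stop) runs
                          :test (lambda (x y)
                                  (not (mismatch x y :test test)))))))))
    (nreverse runs)))

(sketch-longest-common-runs "abcd" "cdab")   ; => ("ab" "cd")
```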
map-ngrams FUNCTION N LIST &key START-PADDING END-PADDING => UNDEFINED
Calls FUNCTION for each n-gram of size N constructed from the elements of the given LIST. In the first call, the value of START-PADDING fills the first N - 1 arguments. When the list is exhausted, the value of END-PADDING is used to pad to a size of N. FUNCTION must accept N arguments.
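The padding behaviour can be sketched as follows. This is one reading of the description, not the library's code: the name sketch-map-ngrams is hypothetical, and the sketch assumes that N - 1 padding values are added at both ends, so that every element of LIST appears in every argument position exactly once.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(defun sketch-map-ngrams (function n list &key start-padding end-padding)
  "Call FUNCTION with N arguments for each padded n-gram of LIST."
  (let ((padded (append (make-list (1- n) :initial-element start-padding)
                        list
                        (make-list (1- n) :initial-element end-padding))))
    (loop for tail on padded
          ;; Stop once fewer than N elements remain in the window.
          while (nthcdr (1- n) tail)
          do (apply function (subseq tail 0 n)))
    ;; The documented result is undefined, so return no values.
    (values)))

;; Under the assumption above, this calls the function with
;; (:s x), (x y), (y z) and (z :e), in that order.
(sketch-map-ngrams (lambda (a b) (print (list a b)))
                   2 '(x y z)
                   :start-padding :s :end-padding :e)
```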
string-jaro-distance STR1 STR2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE
string-levenshtein-distance S1 S2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE
Computes the Levenshtein distance between strings S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:
- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one
The Levenshtein distance measures how dissimilar two strings are. If
two strings have a distance of 0, they are equal.
If CASE-SENSITIVE is true (which is the default), characters are
compared in a case-sensitive way, distinguishing between lower
and upper case letters. If CASE-SENSITIVE is false, this function
does not distinguish between lower case and upper case letters.
See function levenshtein-distance for a generalization of this function to arbitrary sequences.
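The effect of CASE-SENSITIVE can be pictured with the standard character predicates char= and char-equal. That the library selects between exactly these two predicates is an assumption; the helper sketch-character-test is hypothetical.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(defun sketch-character-test (case-sensitive)
  "Return a two-argument character predicate matching the flag."
  (if case-sensitive
      #'char=        ; distinguishes #\a from #\A
      #'char-equal)) ; ignores case differences

(funcall (sketch-character-test t)   #\a #\A)   ; => NIL
(funcall (sketch-character-test nil) #\a #\A)   ; => T
```

Under this reading, calling string-levenshtein-distance with :case-sensitive nil would treat "Kitten" and "KITTEN" as equal and report a distance of 0.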
array-index
Integer type large enough to hold an index into an arbitrary array (in particular, into a string).
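A type like this is commonly written in terms of the standard constant array-dimension-limit. The definition below is a guess at the general shape, under a hypothetical name, and not the library's actual definition of array-index.

```lisp
;; Hypothetical sketch, not part of darts.lib.sequence-metrics.
(deftype sketch-array-index ()
  "Non-negative integer strictly below ARRAY-DIMENSION-LIMIT."
  `(integer 0 (,array-dimension-limit)))

(typep 0 'sketch-array-index)    ; => T
(typep -1 'sketch-array-index)   ; => NIL
```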
sequence-function
Type of sequence metric. This is the function type shared by most metric functions exposed by this package that operate on two generic sequences.
string-function
Type of string metric. This is the function type shared by most metric functions exposed by this package that operate on two actual strings.