This is the darts.lib.sequence-metrics Reference Manual, version 0.1, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 11:40:12 2018 GMT+0.
• Introduction: What darts.lib.sequence-metrics is all about
• Systems: The systems documentation
• Modules: The modules documentation
• Files: The files documentation
• Packages: The packages documentation
• Definitions: The symbols documentation
• Indexes: Concepts, functions, variables and data types
Distance metrics on sequences in general and strings in particular.
jaro-winkler-min-prefix-length
6
jaro-winkler-prefix-adjustment-scale
1/10
hamming-distance s1 s2 &key start1 end1 start2 end2 test test-not key
=> distance
jaro-distance s1 s2 &key start1 end1 start2 end2 test test-not key
=> distance
jaro-winkler-distance s1 s2 &key start1 end1 start2 end2 test test-not key prefix-length adjustment-scale
=> distance
levenshtein-distance s1 s2 &key start1 end1 start2 end2 test test-not key
=> distance
list-ngrams size list &key start-padding end-padding transform
=> list
longest-common-subsequence*-length s1 s2 &key start1 end1 start2 end2 test test-not key
=> length
longest-common-subsequence-length s1 s2 &key start1 end1 start2 end2 test test-not key
=> length
longest-common-subsequences* s1 s2 &key start1 end1 start2 end2 test test-not key
=> list
longest-common-substring-length s1 s2 &key start1 end1 start2 end2 case-sensitive
=> length
longest-common-substrings s1 s2 &key start1 end1 start2 end2 case-sensitive
=> list
map-ngrams function size list &key start-padding end-padding
string-hamming-distance s1 s2 &key start1 end1 start2 end2 case-sensitive
=> distance
string-jaro-distance s1 s2 &key start1 end1 start2 end2 case-sensitive
=> distance
string-jaro-winkler-distance s1 s2 &key start1 end1 start2 end2 case-sensitive prefix-length adjustment-scale
=> distance
string-levenshtein-distance s1 s2 &key start1 end1 start2 end2 case-sensitive
=> distance
do-ngrams (&rest vars) (list-form &key start-padding end-padding) &body body
=> undefined
The main system appears first, followed by any subsystem dependency.
• The darts.lib.sequence-metrics system
Maintainer: Dirk Esser
Author: Dirk Esser
License: MIT
Description: Provides various distance metrics on sequences
Version: 0.1
Component: src (module)
Modules are listed depth-first from the system components tree.
• The darts.lib.sequence-metrics/src module
darts.lib.sequence-metrics (system)
src/
Files are sorted by type and then listed depth-first from the systems components trees.
• Lisp files
darts.lib.sequence-metrics.asd
darts.lib.sequence-metrics (system)
src (module)
src/package.lisp
package.lisp (file)
src (module)
src/types.lisp
types.lisp (file)
src (module)
src/levenshtein.lisp
types.lisp (file)
src (module)
src/hamming.lisp
types.lisp (file)
src (module)
src/lcs.lisp
types.lisp (file)
src (module)
src/jaro-winkler.lisp
types.lisp (file)
src (module)
src/ngrams.lisp
Packages are listed by definition order.
• The darts.asdf package
• The darts.lib.sequence-metrics package
darts.lib.sequence-metrics.asd
This package exports various forms of metric functions
on sequences. Among the ones provided are:
- Levenshtein distance
- Jaro and Jaro/Winkler distance
- Hamming distance
Most distances are provided in a very general form, working on arbitrary
sequences. However, since most of these distance functions are usually
applied to strings, for some frequently used metrics, optimized string
versions are provided.
This package also exports a few other utility functions which, strictly speaking, do not belong here, such as the n-gram helpers. They live in this package because they have been here since the dawn of time.
package.lisp (file)
common-lisp
Definitions are sorted by export status, category, package, and then by lexicographic order.
• Exported definitions
• Internal definitions
• Exported macros
• Exported functions
ngrams.lisp (file)
hamming.lisp (file)
jaro-distance SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE
jaro-winkler.lisp (file)
jaro-winkler.lisp (file)
levenshtein-distance S1 S2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE
Computes the Levenshtein distance between sequences S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:
- insert a single element at some position
- delete a single element at some position
- substitute a single element at some position by another one
The Levenshtein distance is a measure of similarity between sequences. If
two sequences have a distance of 0, they are equal. The TEST function is
used to compare sequence elements. The default test function is eql.
This function is a generalization of string-levenshtein-distance
to arbitrary sequence types. Use string-levenshtein-distance if you
need a version optimized for use with strings.
levenshtein.lisp (file)
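The edit-distance definition above can be sketched with the usual two-row dynamic programming table. This is an illustration only, not the code in levenshtein.lisp: the name sketch-levenshtein is hypothetical, and the sketch leaves out the START1/END1/START2/END2, TEST-NOT and KEY arguments of the real function.

```lisp
;; Illustrative sketch only; NOT the implementation from levenshtein.lisp.
;; SKETCH-LEVENSHTEIN is a hypothetical name chosen for this example.
(defun sketch-levenshtein (s1 s2 &key (test #'eql))
  "Return the minimum number of single-element insertions, deletions
and substitutions needed to turn sequence S1 into sequence S2."
  (let* ((n (length s1))
         (m (length s2))
         ;; PREV holds the distances for the previous prefix of S1,
         ;; CURR the distances for the prefix currently being extended.
         (prev (make-array (1+ m)))
         (curr (make-array (1+ m))))
    ;; Turning the empty prefix of S1 into the first J elements of S2
    ;; takes J insertions.
    (dotimes (j (1+ m))
      (setf (aref prev j) j))
    (dotimes (i n)
      ;; Turning the first I+1 elements of S1 into the empty sequence
      ;; takes I+1 deletions.
      (setf (aref curr 0) (1+ i))
      (dotimes (j m)
        (let ((cost (if (funcall test (elt s1 i) (elt s2 j)) 0 1)))
          (setf (aref curr (1+ j))
                (min (1+ (aref prev (1+ j)))      ; deletion
                     (1+ (aref curr j))           ; insertion
                     (+ (aref prev j) cost)))))   ; substitution or match
      (rotatef prev curr))
    ;; After the final ROTATEF, PREV holds the last computed row.
    (aref prev m)))
```

Passing :test #'char-equal to the sketch gives case-insensitive comparison on strings, which mirrors the role TEST plays in the description above.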
ngrams.lisp (file)
longest-common-subsequence*-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LENGTH
Returns the length of the (or: a) longest common contiguous subsequence
of sequences SEQ1 and SEQ2. This problem is usually called the longest
common ‘substring’ problem, but in order to avoid confusion, we use the
term subsequence* here to refer to contiguous subsequences.
lcs.lisp (file)
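To make the "contiguous" requirement concrete, here is a minimal sketch of the textbook dynamic programming approach to the longest common substring problem. It is not taken from lcs.lisp: sketch-common-run-length is a hypothetical name, and the bounding-index, TEST-NOT and KEY arguments are omitted.

```lisp
;; Illustrative sketch only; NOT the implementation from lcs.lisp.
;; SKETCH-COMMON-RUN-LENGTH is a hypothetical name.
(defun sketch-common-run-length (seq1 seq2 &key (test #'eql))
  "Return the length of a longest run of elements that occurs
contiguously in both SEQ1 and SEQ2."
  (let* ((m (length seq2))
         ;; Entry J+1 of a row is the length of the common run that
         ;; ends at the current element of SEQ1 and element J of SEQ2.
         (prev (make-array (1+ m) :initial-element 0))
         (curr (make-array (1+ m) :initial-element 0))
         (best 0))
    (dotimes (i (length seq1))
      (dotimes (j m)
        (setf (aref curr (1+ j))
              (if (funcall test (elt seq1 i) (elt seq2 j))
                  (1+ (aref prev j))   ; extend the run ending one step back
                  0))                  ; a mismatch breaks the run
        (setf best (max best (aref curr (1+ j)))))
      (rotatef prev curr))
    best))
```

For "abcde" and "ace" the sketch returns 1, because no two adjacent elements are shared, even though the non-contiguous common subsequence "ace" has length 3.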
lcs.lisp (file)
longest-common-subsequences* SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LIST
Returns a list containing all contiguous longest common subsequences
of SEQ1 and SEQ2. This problem is usually called the longest common
‘substring’ problem, but in order to avoid confusion, we use the
term subsequence* here to refer to contiguous subsequences.
lcs.lisp (file)
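Collecting every longest contiguous common subsequence, rather than only its length, could look roughly like the sketch below. It is an assumption-laden illustration, not the code in lcs.lisp: sketch-common-runs is a hypothetical name, the results are copied out of SEQ1, and duplicates are dropped, which the real function may or may not do.

```lisp
;; Illustrative sketch only; NOT the implementation from lcs.lisp.
;; SKETCH-COMMON-RUNS is a hypothetical name. Dropping duplicate runs
;; is an assumption of this sketch, not documented library behaviour.
(defun sketch-common-runs (seq1 seq2 &key (test #'eql))
  "Return a list of all longest runs that occur contiguously in both
SEQ1 and SEQ2, in order of first appearance in SEQ1."
  (let* ((m (length seq2))
         (prev (make-array (1+ m) :initial-element 0))
         (curr (make-array (1+ m) :initial-element 0))
         (best 0)
         (runs '()))
    (flet ((same-run-p (a b)
             (and (= (length a) (length b))
                  (every test a b))))
      (dotimes (i (length seq1))
        (dotimes (j m)
          (let ((len (if (funcall test (elt seq1 i) (elt seq2 j))
                         (1+ (aref prev j))
                         0)))
            (setf (aref curr (1+ j)) len)
            (cond ((zerop len))          ; no common run ends here
                  ((> len best)          ; strictly longer: start over
                   (setf best len
                         runs (list (subseq seq1 (- (1+ i) len) (1+ i)))))
                  ((= len best)          ; another run of the best length
                   (pushnew (subseq seq1 (- (1+ i) len) (1+ i)) runs
                            :test #'same-run-p)))))
        (rotatef prev curr)))
    (nreverse runs)))
```

With "abXcd" and "abYcd" the sketch yields two runs, "ab" and "cd", since both have the maximal length 2.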
lcs.lisp (file)
lcs.lisp (file)
map-ngrams FUNCTION N LIST &key START-PADDING END-PADDING => UNDEFINED
Calls FUNCTION for each n-gram of size N constructed from the elements in the given LIST. Initially, the value of START-PADDING is used to fill the first N - 1 elements in the call. When the list is exhausted, the value of END-PADDING is used to pad to a size of N. The function must accept N arguments.
ngrams.lisp (file)
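One plausible reading of the padding rules above is a window of size N sliding over the list after N - 1 copies of START-PADDING have been put in front and N - 1 copies of END-PADDING behind. The sketch below follows that reading; it is not the code in ngrams.lisp, sketch-map-ngrams is a hypothetical name, and the behaviour for edge cases such as an empty list is an assumption.

```lisp
;; Illustrative sketch only; NOT the implementation from ngrams.lisp.
;; SKETCH-MAP-NGRAMS is a hypothetical name, and padding both ends with
;; N - 1 elements is this sketch's reading of the description.
(defun sketch-map-ngrams (function n list &key start-padding end-padding)
  "Call FUNCTION with N arguments for every window of N consecutive
elements of LIST, padded at both ends. N must be at least 1."
  (let* ((pad (1- n))
         (padded (append (make-list pad :initial-element start-padding)
                         list
                         (make-list pad :initial-element end-padding))))
    (loop for tail on padded
          ;; Stop as soon as fewer than N elements remain.
          while (nthcdr pad tail)
          do (apply function (subseq tail 0 n)))
    ;; The return value is unspecified, so return no values.
    (values)))
```

Under these assumptions, calling the sketch with N = 2 on the list (x y z), with :s and :e as paddings, invokes the function on (:s x), (x y), (y z) and (z :e), in that order.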
hamming.lisp (file)
string-jaro-distance STR1 STR2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE
jaro-winkler.lisp (file)
jaro-winkler.lisp (file)
string-levenshtein-distance S1 S2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE
Computes the Levenshtein distance between strings S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:
- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one
The Levenshtein distance is a measure of similarity between strings. If
two strings have a distance of 0, they are equal.
If CASE-SENSITIVE is true (the default), characters are compared
case-sensitively, distinguishing between lower and upper case letters.
If CASE-SENSITIVE is false, lower case and upper case letters are
treated as equal.
See function levenshtein-distance for a generalization of this function to arbitrary sequences.
levenshtein.lisp (file)
• Internal constants
• Internal types
jaro-winkler.lisp (file)
jaro-winkler.lisp (file)
Integer type, which is large enough to hold an index into some arbitrary array (in particular, into a string).
types.lisp (file)
Type of sequence metric. This is the function type shared by most functions in this package that operate on two generic sequences.
types.lisp (file)
Type of string metric. This is the function type shared by most functions in this package that operate on two actual strings.
types.lisp (file)