The darts.lib.sequence-metrics Reference Manual

This is the darts.lib.sequence-metrics Reference Manual, version 0.1, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 11:40:12 2018 GMT+0.


1 Introduction

DartsCLSequenceMetrics

Distance metrics on sequences in general and strings in particular.


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 darts.lib.sequence-metrics

Maintainer

Dirk Esser

Author

Dirk Esser

License

MIT

Description

Provides various distance metrics on sequences

Version

0.1

Source

darts.lib.sequence-metrics.asd (file)

Component

src (module)


3 Modules

Modules are listed depth-first from the system components tree.


3.1 darts.lib.sequence-metrics/src

Parent

darts.lib.sequence-metrics (system)

Location

src/

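Components

package.lisp (file)
types.lisp (file)
levenshtein.lisp (file)
hamming.lisp (file)
lcs.lisp (file)
jaro-winkler.lisp (file)
ngrams.lisp (file)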

4 Files

Files are sorted by type and then listed depth-first from the systems components trees.


4.1 Lisp


4.1.1 darts.lib.sequence-metrics.asd

Location

darts.lib.sequence-metrics.asd

Systems

darts.lib.sequence-metrics (system)

Packages

darts.asdf


4.1.2 darts.lib.sequence-metrics/src/package.lisp

Parent

src (module)

Location

src/package.lisp

Packages

darts.lib.sequence-metrics


4.1.3 darts.lib.sequence-metrics/src/types.lisp

Dependency

package.lisp (file)

Parent

src (module)

Location

src/types.lisp

Internal Definitions

array-index (type)
sequence-function (type)
string-function (type)

4.1.4 darts.lib.sequence-metrics/src/levenshtein.lisp

Dependency

types.lisp (file)

Parent

src (module)

Location

src/levenshtein.lisp

Exported Definitions

levenshtein-distance (function)
string-levenshtein-distance (function)

4.1.5 darts.lib.sequence-metrics/src/hamming.lisp

Dependency

types.lisp (file)

Parent

src (module)

Location

src/hamming.lisp

Exported Definitions

hamming-distance (function)
string-hamming-distance (function)

4.1.6 darts.lib.sequence-metrics/src/lcs.lisp

Dependency

types.lisp (file)

Parent

src (module)

Location

src/lcs.lisp

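Exported Definitions

longest-common-subsequence*-length (function)
longest-common-subsequence-length (function)
longest-common-subsequences* (function)
longest-common-substring-length (function)
longest-common-substrings (function)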

4.1.7 darts.lib.sequence-metrics/src/jaro-winkler.lisp

Dependency

types.lisp (file)

Parent

src (module)

Location

src/jaro-winkler.lisp

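Exported Definitions

jaro-distance (function)
jaro-winkler-distance (function)
string-jaro-distance (function)
string-jaro-winkler-distance (function)

Internal Definitions

jaro-winkler-min-prefix-length (constant)
jaro-winkler-prefix-adjustment-scale (constant)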

4.1.8 darts.lib.sequence-metrics/src/ngrams.lisp

Dependency

types.lisp (file)

Parent

src (module)

Location

src/ngrams.lisp

Exported Definitions

do-ngrams (macro)
list-ngrams (function)
map-ngrams (function)

5 Packages

Packages are listed by definition order.


5.1 darts.asdf

Source

darts.lib.sequence-metrics.asd

Use List

5.2 darts.lib.sequence-metrics

This package exports various forms of metric functions
on sequences. Among the ones provided are:

- Levenshtein distance
- Jaro and Jaro/Winkler distance
- Hamming distance

Most distances are provided in a very general form that works on arbitrary sequences. However, since these distance functions are usually applied to strings, optimized string versions are provided for some frequently used metrics.

This package also exports a few other utility functions that, strictly speaking, do not belong here, such as the n-gram utilities. They remain in this package for historical reasons.

Source

package.lisp (file)

Use List

common-lisp

Exported Definitions
Internal Definitions

6 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


6.1 Exported definitions


6.1.1 Macros

Macro: do-ngrams (&rest VARS) (LIST-FORM &key START-PADDING END-PADDING) &body BODY
Package

darts.lib.sequence-metrics

Source

ngrams.lisp (file)


6.1.2 Functions

Function: hamming-distance SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY NORMALIZED
Package

darts.lib.sequence-metrics

Source

hamming.lisp (file)

Function: jaro-distance STR1 STR2 &key START1 END1 START2 END2 TEST TEST-NOT KEY

jaro-distance SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE

Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)

Function: jaro-winkler-distance STR1 STR2 &key START1 END1 START2 END2 TEST TEST-NOT KEY PREFIX-LENGTH ADJUSTMENT-SCALE
Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)

Function: levenshtein-distance S1 S2 &key START1 END1 START2 END2 TEST TEST-NOT KEY

levenshtein-distance S1 S2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE

Computes the Levenshtein distance between sequences S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:

- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one

The Levenshtein distance is a measure of similarity between sequences. If
two sequences have a distance of 0, they are equal under TEST. The TEST
function is used to compare sequence elements. The default test function
is eql.

This function is a generalization of string-levenshtein-distance
for arbitrary sequence types. Use string-levenshtein-distance if you
need a version optimized for use with strings.
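
The edit-distance definition above can be illustrated with a small dynamic-programming sketch in plain Common Lisp. This is not the library's implementation: the name sketch-levenshtein is hypothetical, only the TEST keyword is modelled, and the START/END, TEST-NOT and KEY arguments are left out for brevity.

```lisp
;; Illustrative sketch only; SKETCH-LEVENSHTEIN is a hypothetical name
;; and not part of darts.lib.sequence-metrics.
(defun sketch-levenshtein (s1 s2 &key (test #'eql))
  "Return the minimum number of insertions, deletions and substitutions
needed to turn sequence S1 into sequence S2."
  (let* ((n (length s1))
         (m (length s2))
         ;; PREV and CURR hold two adjacent rows of the distance table.
         (prev (make-array (1+ m)))
         (curr (make-array (1+ m))))
    ;; Row 0: the empty prefix of S1 becomes S2[0..j) with j insertions.
    (dotimes (j (1+ m))
      (setf (aref prev j) j))
    (dotimes (i n)
      ;; Column 0: S1[0..i+1) becomes the empty prefix with i+1 deletions.
      (setf (aref curr 0) (1+ i))
      (dotimes (j m)
        (let ((cost (if (funcall test (elt s1 i) (elt s2 j)) 0 1)))
          (setf (aref curr (1+ j))
                (min (1+ (aref prev (1+ j)))      ; delete from S1
                     (1+ (aref curr j))           ; insert into S1
                     (+ (aref prev j) cost)))))   ; substitute or match
      (rotatef prev curr))
    (aref prev m)))

(sketch-levenshtein "kitten" "sitting")   ; => 3
(sketch-levenshtein '(1 2 3) #(1 3))      ; => 1
```

The sketch uses elt so that lists and vectors can be mixed; on long lists this is slow, which is one reason a production version would likely treat sequence types separately.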

Package

darts.lib.sequence-metrics

Source

levenshtein.lisp (file)

Function: list-ngrams SIZE LIST &key START-PADDING END-PADDING TRANSFORM
Package

darts.lib.sequence-metrics

Source

ngrams.lisp (file)

Function: longest-common-subsequence*-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY

longest-common-subsequence*-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LENGTH

Returns the length of a longest common contiguous subsequence of
sequences SEQ1 and SEQ2 (there may be more than one of that length).
This problem is usually called the longest common ‘substring’ problem,
but in order to avoid confusion, we use the term subsequence* here to
refer to contiguous subsequences.
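
As an illustration of what "contiguous" means here, the following sketch computes the length with the usual dynamic-programming table of common-suffix lengths. It is an assumption-laden example, not the library's code: sketch-lcs*-length is a hypothetical name, and only the TEST keyword is supported.

```lisp
;; Illustrative sketch only; SKETCH-LCS*-LENGTH is a hypothetical name.
(defun sketch-lcs*-length (seq1 seq2 &key (test #'eql))
  "Return the length of a longest run of elements that occurs
contiguously in both SEQ1 and SEQ2."
  (let* ((v1 (coerce seq1 'vector))
         (v2 (coerce seq2 'vector))
         (m (length v2))
         ;; Entry j+1 of a row is the length of the longest common
         ;; suffix of V1[0..i] and V2[0..j]; entry 0 always stays 0.
         (prev (make-array (1+ m) :initial-element 0))
         (curr (make-array (1+ m) :initial-element 0))
         (best 0))
    (dotimes (i (length v1) best)
      (dotimes (j m)
        (setf (aref curr (1+ j))
              (if (funcall test (aref v1 i) (aref v2 j))
                  (1+ (aref prev j))   ; extend the run ending at i-1, j-1
                  0))                  ; mismatch: the run is broken
        (setf best (max best (aref curr (1+ j)))))
      (rotatef prev curr))))

(sketch-lcs*-length "abcdef" "zcdemf")   ; => 3, the run "cde"
```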

Package

darts.lib.sequence-metrics

Source

lcs.lisp (file)

Function: longest-common-subsequence-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY
Package

darts.lib.sequence-metrics

Source

lcs.lisp (file)

Function: longest-common-subsequences* SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY

longest-common-subsequences* SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LIST

Returns a list containing all contiguous longest common subsequences
of SEQ1 and SEQ2. This problem is usually called the longest common
‘substring’ problem, but in order to avoid confusion, we use the
term subsequence* here to refer to contiguous subsequences.
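
A minimal sketch of collecting every longest contiguous run, rather than only its length, might look as follows. The name sketch-lcs* is hypothetical, only TEST is supported, and the result order and the handling of runs with equal content at different positions are assumptions of this example; the library's behavior in those respects is not documented here.

```lisp
;; Illustrative sketch only; SKETCH-LCS* is a hypothetical name.
(defun sketch-lcs* (seq1 seq2 &key (test #'eql))
  "Return a list of the longest runs that occur contiguously in both
SEQ1 and SEQ2, taken from SEQ1 in order of position. Runs at different
positions with equal content are all reported."
  (let* ((v1 (coerce seq1 'vector))
         (v2 (coerce seq2 'vector))
         (m (length v2))
         (prev (make-array (1+ m) :initial-element 0))
         (curr (make-array (1+ m) :initial-element 0))
         (best 0)
         (ends '()))   ; exclusive end positions in V1 of the best runs
    (dotimes (i (length v1))
      (dotimes (j m)
        (let ((len (if (funcall test (aref v1 i) (aref v2 j))
                       (1+ (aref prev j))
                       0)))
          (setf (aref curr (1+ j)) len)
          (cond ((> len best)
                 ;; A strictly longer run replaces everything found so far.
                 (setf best len
                       ends (list (1+ i))))
                ((and (plusp len) (= len best))
                 (pushnew (1+ i) ends)))))
      (rotatef prev curr))
    (mapcar (lambda (end) (subseq v1 (- end best) end))
            (reverse ends))))

(sketch-lcs* "abcxyz" "xyzabc")   ; => ("abc" "xyz")
```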

Package

darts.lib.sequence-metrics

Source

lcs.lisp (file)

Function: longest-common-substring-length SEQ1 SEQ2 &key START1 END1 START2 END2 CASE-SENSITIVE
Package

darts.lib.sequence-metrics

Source

lcs.lisp (file)

Function: longest-common-substrings SEQ1 SEQ2 &key START1 END1 START2 END2 CASE-SENSITIVE
Package

darts.lib.sequence-metrics

Source

lcs.lisp (file)

Function: map-ngrams FUNCTION SIZE LIST &key START-PADDING END-PADDING

map-ngrams FUNCTION N LIST &key START-PADDING END-PADDING => UNDEFINED

Calls FUNCTION for each n-gram of size N constructed from the elements of the given LIST. At the start, the value of START-PADDING fills the first N - 1 positions of the call. Once the list is exhausted, the value of END-PADDING pads the n-gram to a size of N. FUNCTION must accept N arguments.
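
The padding behavior can be pictured with the following sketch. It assumes that N - 1 copies of each padding value are placed before and after the list and that a window of N elements then slides over the result; the description above does not pin down every edge case, so treat this as one plausible reading. The name sketch-map-ngrams is hypothetical.

```lisp
;; Illustrative sketch only; SKETCH-MAP-NGRAMS is a hypothetical name,
;; and the exact padding scheme is an assumption of this example.
(defun sketch-map-ngrams (function n list &key start-padding end-padding)
  "Call FUNCTION with N arguments for each window of N consecutive
elements of LIST, padded with N - 1 padding values on either side."
  (let* ((pad (max 0 (1- n)))
         (padded (append (make-list pad :initial-element start-padding)
                         list
                         (make-list pad :initial-element end-padding))))
    ;; Slide a window of N elements over the padded list.
    (loop for tail on padded
          while (nthcdr (1- n) tail)   ; at least N elements remain
          do (apply function (subseq tail 0 n)))
    (values)))

;; Collect the bigrams of (a b c) with explicit padding markers.
(let ((acc '()))
  (sketch-map-ngrams (lambda (x y) (push (list x y) acc))
                     2 '(a b c)
                     :start-padding :s :end-padding :e)
  (reverse acc))
;; => ((:S A) (A B) (B C) (C :E))
```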

Package

darts.lib.sequence-metrics

Source

ngrams.lisp (file)

Function: string-hamming-distance SEQ1 SEQ2 &key START1 END1 START2 END2 CASE-SENSITIVE NORMALIZED
Package

darts.lib.sequence-metrics

Source

hamming.lisp (file)

Function: string-jaro-distance STR1 STR2 &key START1 END1 START2 END2 CASE-SENSITIVE

string-jaro-distance STR1 STR2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE

Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)

Function: string-jaro-winkler-distance ()
Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)

Function: string-levenshtein-distance S1 S2 &key START1 END1 START2 END2 CASE-SENSITIVE

string-levenshtein-distance S1 S2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE

Computes the Levenshtein distance between strings S1 and S2. The result value DISTANCE is the minimum number of edit operations required to transform S1 into S2 (or vice versa), where allowed operations are:

- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one

The Levenshtein distance is a measure of similarity between strings. If two strings have a distance of 0, they are equal.

If CASE-SENSITIVE is true (the default), characters are compared case-sensitively, so lower-case and upper-case letters are distinct. If CASE-SENSITIVE is false, letter case is ignored.

See function levenshtein-distance for a generalization of this function to arbitrary sequences.
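
To show how CASE-SENSITIVE changes the result, here is a string-only sketch that keeps a single table row in memory. It is not the library's optimized implementation: sketch-string-levenshtein is a hypothetical name, the START/END keywords are omitted, and mapping CASE-SENSITIVE onto char= and char-equal is an assumption of this example.

```lisp
;; Illustrative sketch only; SKETCH-STRING-LEVENSHTEIN is a hypothetical name.
(defun sketch-string-levenshtein (s1 s2 &key (case-sensitive t))
  "Return the Levenshtein distance between strings S1 and S2."
  (let* ((test (if case-sensitive #'char= #'char-equal))
         (m (length s2))
         (row (make-array (1+ m))))
    (dotimes (j (1+ m))
      (setf (aref row j) j))
    (dotimes (i (length s1))
      ;; DIAG is the previous row's value one column to the left.
      (let ((diag (aref row 0)))
        (setf (aref row 0) (1+ i))
        (dotimes (j m)
          (let ((up (aref row (1+ j))))   ; previous row, same column
            (setf (aref row (1+ j))
                  (min (1+ up)                 ; deletion
                       (1+ (aref row j))       ; insertion
                       (+ diag                 ; substitution or match
                          (if (funcall test (char s1 i) (char s2 j)) 0 1))))
            (setf diag up)))))
    (aref row m)))

(sketch-string-levenshtein "Hello" "hello")                      ; => 1
(sketch-string-levenshtein "Hello" "hello" :case-sensitive nil)  ; => 0
```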

Package

darts.lib.sequence-metrics

Source

levenshtein.lisp (file)


6.2 Internal definitions


6.2.1 Constants

Constant: jaro-winkler-min-prefix-length
Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)

Constant: jaro-winkler-prefix-adjustment-scale
Package

darts.lib.sequence-metrics

Source

jaro-winkler.lisp (file)


6.2.2 Types

Type: array-index ()

Integer type, which is large enough to hold an index into some arbitrary array (in particular, into a string).
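
The library's actual definition is not reproduced in this manual. One common way to express such a type in portable Common Lisp, shown here under the hypothetical name sketch-array-index, bounds it by the standard constant array-dimension-limit:

```lisp
;; Illustrative sketch only; SKETCH-ARRAY-INDEX is a hypothetical name
;; and may differ from the definition in types.lisp.
(deftype sketch-array-index ()
  ;; Valid indexes run from 0 up to, but excluding, ARRAY-DIMENSION-LIMIT.
  `(integer 0 (,array-dimension-limit)))

(typep 0 'sketch-array-index)    ; => true
(typep -1 'sketch-array-index)   ; => false
```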

Package

darts.lib.sequence-metrics

Source

types.lisp (file)

Type: sequence-function RESULT &rest ADDITIONAL-KEYS

Type of sequence metric. This is the basic function type shared by most functions exported from this package that operate on two generic sequences.

Package

darts.lib.sequence-metrics

Source

types.lisp (file)

Type: string-function RESULT &rest ADDITIONAL-KEYS

Type of string metric. This is the basic function type shared by most functions exported from this package that operate on two actual strings.

Package

darts.lib.sequence-metrics

Source

types.lisp (file)


Appendix A Indexes


A.1 Concepts

Index Entry  Section

D
darts.lib.sequence-metrics.asd: The darts.lib.sequence-metrics.asd file
darts.lib.sequence-metrics/src: The darts.lib.sequence-metrics/src module
darts.lib.sequence-metrics/src/hamming.lisp: The darts.lib.sequence-metrics/src/hamming.lisp file
darts.lib.sequence-metrics/src/jaro-winkler.lisp: The darts.lib.sequence-metrics/src/jaro-winkler.lisp file
darts.lib.sequence-metrics/src/lcs.lisp: The darts.lib.sequence-metrics/src/lcs.lisp file
darts.lib.sequence-metrics/src/levenshtein.lisp: The darts.lib.sequence-metrics/src/levenshtein.lisp file
darts.lib.sequence-metrics/src/ngrams.lisp: The darts.lib.sequence-metrics/src/ngrams.lisp file
darts.lib.sequence-metrics/src/package.lisp: The darts.lib.sequence-metrics/src/package.lisp file
darts.lib.sequence-metrics/src/types.lisp: The darts.lib.sequence-metrics/src/types.lisp file

F
File, Lisp, darts.lib.sequence-metrics.asd: The darts.lib.sequence-metrics.asd file
File, Lisp, darts.lib.sequence-metrics/src/hamming.lisp: The darts.lib.sequence-metrics/src/hamming.lisp file
File, Lisp, darts.lib.sequence-metrics/src/jaro-winkler.lisp: The darts.lib.sequence-metrics/src/jaro-winkler.lisp file
File, Lisp, darts.lib.sequence-metrics/src/lcs.lisp: The darts.lib.sequence-metrics/src/lcs.lisp file
File, Lisp, darts.lib.sequence-metrics/src/levenshtein.lisp: The darts.lib.sequence-metrics/src/levenshtein.lisp file
File, Lisp, darts.lib.sequence-metrics/src/ngrams.lisp: The darts.lib.sequence-metrics/src/ngrams.lisp file
File, Lisp, darts.lib.sequence-metrics/src/package.lisp: The darts.lib.sequence-metrics/src/package.lisp file
File, Lisp, darts.lib.sequence-metrics/src/types.lisp: The darts.lib.sequence-metrics/src/types.lisp file

L
Lisp File, darts.lib.sequence-metrics.asd: The darts.lib.sequence-metrics.asd file
Lisp File, darts.lib.sequence-metrics/src/hamming.lisp: The darts.lib.sequence-metrics/src/hamming.lisp file
Lisp File, darts.lib.sequence-metrics/src/jaro-winkler.lisp: The darts.lib.sequence-metrics/src/jaro-winkler.lisp file
Lisp File, darts.lib.sequence-metrics/src/lcs.lisp: The darts.lib.sequence-metrics/src/lcs.lisp file
Lisp File, darts.lib.sequence-metrics/src/levenshtein.lisp: The darts.lib.sequence-metrics/src/levenshtein.lisp file
Lisp File, darts.lib.sequence-metrics/src/ngrams.lisp: The darts.lib.sequence-metrics/src/ngrams.lisp file
Lisp File, darts.lib.sequence-metrics/src/package.lisp: The darts.lib.sequence-metrics/src/package.lisp file
Lisp File, darts.lib.sequence-metrics/src/types.lisp: The darts.lib.sequence-metrics/src/types.lisp file

M
Module, darts.lib.sequence-metrics/src: The darts.lib.sequence-metrics/src module

A.2 Functions

Index Entry  Section

D
do-ngrams: Exported macros

F
Function, hamming-distance: Exported functions
Function, jaro-distance: Exported functions
Function, jaro-winkler-distance: Exported functions
Function, levenshtein-distance: Exported functions
Function, list-ngrams: Exported functions
Function, longest-common-subsequence*-length: Exported functions
Function, longest-common-subsequence-length: Exported functions
Function, longest-common-subsequences*: Exported functions
Function, longest-common-substring-length: Exported functions
Function, longest-common-substrings: Exported functions
Function, map-ngrams: Exported functions
Function, string-hamming-distance: Exported functions
Function, string-jaro-distance: Exported functions
Function, string-jaro-winkler-distance: Exported functions
Function, string-levenshtein-distance: Exported functions

H
hamming-distance: Exported functions

J
jaro-distance: Exported functions
jaro-winkler-distance: Exported functions

L
levenshtein-distance: Exported functions
list-ngrams: Exported functions
longest-common-subsequence*-length: Exported functions
longest-common-subsequence-length: Exported functions
longest-common-subsequences*: Exported functions
longest-common-substring-length: Exported functions
longest-common-substrings: Exported functions

M
Macro, do-ngrams: Exported macros
map-ngrams: Exported functions

S
string-hamming-distance: Exported functions
string-jaro-distance: Exported functions
string-jaro-winkler-distance: Exported functions
string-levenshtein-distance: Exported functions

A.3 Variables

Index Entry  Section

C
Constant, jaro-winkler-min-prefix-length: Internal constants
Constant, jaro-winkler-prefix-adjustment-scale: Internal constants

J
jaro-winkler-min-prefix-length: Internal constants
jaro-winkler-prefix-adjustment-scale: Internal constants

A.4 Data types

Index Entry  Section

A
array-index: Internal types

D
darts.asdf: The darts.asdf package
darts.lib.sequence-metrics: The darts.lib.sequence-metrics system
darts.lib.sequence-metrics: The darts.lib.sequence-metrics package

P
Package, darts.asdf: The darts.asdf package
Package, darts.lib.sequence-metrics: The darts.lib.sequence-metrics package

S
sequence-function: Internal types
string-function: Internal types
System, darts.lib.sequence-metrics: The darts.lib.sequence-metrics system

T
Type, array-index: Internal types
Type, sequence-function: Internal types
Type, string-function: Internal types
