The vas-string-metrics Reference Manual

This is the vas-string-metrics Reference Manual, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 12:44:32 2018 GMT+0.



1 Introduction

vas-string-metrics provides the Jaro, Jaro-Winkler, Soerensen-Dice,
Levenshtein, and normalized Levenshtein string distance and
similarity metrics.

The Jaro (function jaro-distance), Jaro-Winkler (function
jaro-winkler-distance), Soerensen-Dice (function
soerensen-dice-coefficient) and normalized Levenshtein
(function normalized-levenshtein-distance) algorithms return a
number in the range 0 to 1 indicating how similar two given strings
are, where 0 indicates no similarity and 1 indicates a perfect match.

The Jaro-Winkler metric is a heuristic suited to shorter strings
(such as place and personal names). The Levenshtein distance
(function levenshtein-distance) is the minimum number of
single-character insertions, deletions, or substitutions needed to
transform one string into the other.
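
The edit-distance definition above can be illustrated with a short,
language-neutral sketch. The Python below is not the library's source;
it is a minimal dynamic-programming version (two rows of the usual
table), and the name levenshtein_distance is chosen here only for
illustration.

```python
def levenshtein_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn s1 into s2 (illustrative sketch)."""
    # previous[j] = distance between the processed prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Keeping only two rows bounds memory by the length of the second
string, which is a common choice when only the distance (and not the
edit script) is needed.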

The Soerensen-Dice coefficient is a statistic suitable for heterogeneous
data sets and gives less weight to outliers[1].
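
As a rough illustration of how a Soerensen-Dice coefficient can be
computed for strings, the Python sketch below compares sets of
character bigrams using 2|X n Y| / (|X| + |Y|). This is an assumption
made for the example, not a statement of what the library does: the
use of sets rather than multisets, the handling of strings shorter
than two characters, and the names bigrams and dice_coefficient are
all hypothetical.

```python
def bigrams(s: str) -> set:
    """Set of adjacent character pairs in s (assumed tokenization)."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice_coefficient(s1: str, s2: str) -> float:
    """Soerensen-Dice coefficient over bigram sets, in the range 0 to 1."""
    b1, b2 = bigrams(s1), bigrams(s2)
    if not b1 and not b2:
        # Neither string has a bigram; fall back to plain equality
        # (a choice made for this sketch only).
        return 1.0 if s1 == s2 else 0.0
    return 2 * len(b1 & b2) / (len(b1) + len(b2))


print(dice_coefficient("night", "nacht"))  # 0.25
```

In the example, "night" and "nacht" each have four bigrams and share
only "ht", giving 2 * 1 / (4 + 4) = 0.25.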

The code is distributed under the terms of the LLGPLv3 (see LICENSE
for details), except for the unit tests, which are in the public
domain.

[1] https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient#Applications



2 Systems

The main system appears first, followed by any subsystem dependency.



2.1 vas-string-metrics

Author

Vladimir Sedach <vsedach@gmail.com>

License

LLGPLv3

Description

Jaro-Winkler and Levenshtein string distance algorithms.

Source

vas-string-metrics.asd (file)

Components

package.lisp (file)
levenshtein.lisp (file)
jaro-winkler.lisp (file)
soerensen-dice.lisp (file)


3 Files

Files are sorted by type and then listed depth-first from the systems' component trees.



3.1 Lisp



3.1.1 vas-string-metrics.asd

Location

vas-string-metrics.asd

Systems

vas-string-metrics (system)



3.1.2 vas-string-metrics/package.lisp

Parent

vas-string-metrics (system)

Location

package.lisp

Packages

vas-string-metrics



3.1.3 vas-string-metrics/levenshtein.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

levenshtein.lisp

Exported Definitions

levenshtein-distance (function)
normalized-levenshtein-distance (function)


3.1.4 vas-string-metrics/jaro-winkler.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

jaro-winkler.lisp

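Exported Definitions

jaro-distance (function)
jaro-winkler-distance (function)

Internal Definitions

matching-char-list (function)
prefix-length (function)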


3.1.5 vas-string-metrics/soerensen-dice.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

soerensen-dice.lisp

Exported Definitions

soerensen-dice-coefficient (function)

Internal Definitions

bigrams (function)
seq-cdr (function)


4 Packages

Packages are listed by definition order.



4.1 vas-string-metrics

Source

package.lisp (file)

Use List

common-lisp

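Exported Definitions

jaro-distance (function)
jaro-winkler-distance (function)
levenshtein-distance (function)
normalized-levenshtein-distance (function)
soerensen-dice-coefficient (function)

Internal Definitions

bigrams (function)
matching-char-list (function)
prefix-length (function)
seq-cdr (function)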


5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.



5.1 Exported definitions



5.1.1 Functions

Function: jaro-distance S1 S2

Finds the Jaro distance (measure of similarity) from string s1 to string s2. Returns a value in the range from 0 (no similarity) to 1 (exact match).

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)
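
For readers unfamiliar with the metric, the following Python sketch
shows the textbook Jaro similarity. It is an illustration under
stated assumptions, not the library's implementation: the matching
window of max(len) // 2 - 1, the handling of empty strings, and the
name jaro_similarity are choices made for this example.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity: 0 for no similarity, 1 for exact match."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and no further apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        start = max(0, i - window)
        end = min(len2, i + window + 1)
        for j in range(start, end):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


print(round(jaro_similarity("MARTHA", "MARHTA"), 4))  # 0.9444
```

For "MARTHA" and "MARHTA" all six characters match and one pair (T, H)
is transposed, so the score is (1 + 1 + 5/6) / 3.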

Function: jaro-winkler-distance S1 S2

Finds the Jaro-Winkler distance (measure of similarity) from string s1 to string s2. Returns a value in the range from 0 (no similarity) to 1 (exact match).

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)
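
The Winkler adjustment is usually described as boosting a Jaro score
according to the length of the prefix the two strings share. The
Python sketch below illustrates that arithmetic only. It takes an
already computed Jaro score as input, and the scaling factor 0.1, the
prefix cap of 4, and the names common_prefix_length and
jaro_winkler_from_jaro are assumptions for this example; the manual
does not state which values the library uses.

```python
def common_prefix_length(s1: str, s2: str, limit: int) -> int:
    """Length of the shared prefix of s1 and s2, capped at limit."""
    length = 0
    for a, b in zip(s1, s2):
        if a != b or length == limit:
            break
        length += 1
    return length


def jaro_winkler_from_jaro(jaro: float, s1: str, s2: str,
                           scaling: float = 0.1, max_prefix: int = 4) -> float:
    """Apply the Winkler prefix boost to an existing Jaro score."""
    prefix = common_prefix_length(s1, s2, max_prefix)
    # The boost shrinks the remaining gap to 1 in proportion to the prefix.
    return jaro + prefix * scaling * (1.0 - jaro)


# Jaro("MARTHA", "MARHTA") is 17/18; the shared prefix "MAR" has length 3.
print(round(jaro_winkler_from_jaro(17 / 18, "MARTHA", "MARHTA"), 4))  # 0.9611
```

With a scaling factor of at most 0.25 and a cap of 4, the result stays
within the range 0 to 1.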

Function: levenshtein-distance S1 S2

Finds the Levenshtein distance (minimum number of edits) from string s1 to string s2.

Package

vas-string-metrics

Source

levenshtein.lisp (file)

Function: normalized-levenshtein-distance S1 S2

Finds the normalized Levenshtein distance (from 0 for no similarity to 1 for exact match) from string s1 to string s2.

Package

vas-string-metrics

Source

levenshtein.lisp (file)
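
The manual does not give the normalization formula. One common
convention that yields 1 for an exact match and 0 for no similarity is
1 - d / max(len(s1), len(s2)), where d is the raw edit distance. The
Python sketch below illustrates that convention only; whether the
library uses this exact formula is an assumption, and the function
name and parameters are hypothetical.

```python
def normalized_levenshtein(distance: int, len1: int, len2: int) -> float:
    """Map a raw edit distance to a similarity in the range 0 to 1.

    Assumes normalization by the longer string's length."""
    longest = max(len1, len2)
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - distance / longest


# "kitten" (6 characters) and "sitting" (7 characters) are 3 edits apart.
print(round(normalized_levenshtein(3, 6, 7), 4))  # 0.5714
```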

Function: soerensen-dice-coefficient STRING1 STRING2

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)



5.2 Internal definitions



5.2.1 Functions

Function: bigrams STRING

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)

Function: matching-char-list S1 S2

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)

Function: prefix-length S1 S2

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)

Function: seq-cdr SEQUENCE

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)



Appendix A Indexes



A.1 Concepts

Index Entry  Section

F
File, Lisp, vas-string-metrics.asd: The vas-string-metrics.asd file
File, Lisp, vas-string-metrics/jaro-winkler.lisp: The vas-string-metrics/jaro-winkler.lisp file
File, Lisp, vas-string-metrics/levenshtein.lisp: The vas-string-metrics/levenshtein.lisp file
File, Lisp, vas-string-metrics/package.lisp: The vas-string-metrics/package.lisp file
File, Lisp, vas-string-metrics/soerensen-dice.lisp: The vas-string-metrics/soerensen-dice.lisp file

L
Lisp File, vas-string-metrics.asd: The vas-string-metrics.asd file
Lisp File, vas-string-metrics/jaro-winkler.lisp: The vas-string-metrics/jaro-winkler.lisp file
Lisp File, vas-string-metrics/levenshtein.lisp: The vas-string-metrics/levenshtein.lisp file
Lisp File, vas-string-metrics/package.lisp: The vas-string-metrics/package.lisp file
Lisp File, vas-string-metrics/soerensen-dice.lisp: The vas-string-metrics/soerensen-dice.lisp file

V
vas-string-metrics.asd: The vas-string-metrics.asd file
vas-string-metrics/jaro-winkler.lisp: The vas-string-metrics/jaro-winkler.lisp file
vas-string-metrics/levenshtein.lisp: The vas-string-metrics/levenshtein.lisp file
vas-string-metrics/package.lisp: The vas-string-metrics/package.lisp file
vas-string-metrics/soerensen-dice.lisp: The vas-string-metrics/soerensen-dice.lisp file



A.2 Functions

Index Entry  Section

B
bigrams: Internal functions

F
Function, bigrams: Internal functions
Function, jaro-distance: Exported functions
Function, jaro-winkler-distance: Exported functions
Function, levenshtein-distance: Exported functions
Function, matching-char-list: Internal functions
Function, normalized-levenshtein-distance: Exported functions
Function, prefix-length: Internal functions
Function, seq-cdr: Internal functions
Function, soerensen-dice-coefficient: Exported functions

J
jaro-distance: Exported functions
jaro-winkler-distance: Exported functions

L
levenshtein-distance: Exported functions

M
matching-char-list: Internal functions

N
normalized-levenshtein-distance: Exported functions

P
prefix-length: Internal functions

S
seq-cdr: Internal functions
soerensen-dice-coefficient: Exported functions



A.3 Variables



A.4 Data types

Index Entry  Section

P
Package, vas-string-metrics: The vas-string-metrics package

S
System, vas-string-metrics: The vas-string-metrics system

V
vas-string-metrics: The vas-string-metrics system
vas-string-metrics: The vas-string-metrics package
