# The vas-string-metrics Reference Manual

This is the vas-string-metrics Reference Manual, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 12:44:32 2018 GMT+0.

## 1 Introduction

```
vas-string-metrics provides the Jaro, Jaro-Winkler, Soerensen-Dice,
Levenshtein, and normalized Levenshtein string distance/similarity
metric algorithms.

The Jaro (function jaro-distance), Jaro-Winkler (function
jaro-winkler-distance), Soerensen-Dice (function
soerensen-dice-coefficient) and normalized Levenshtein
(function normalized-levenshtein-distance) algorithms return a
number in the range 0 to 1 indicating how similar two given strings
are, where 0 indicates no similarity and 1 indicates a perfect match.

The Jaro-Winkler metric is a heuristic suitable for shorter strings
(such as place and people names), while the Levenshtein distance is
computed as the minimum number of insertions, deletions, or
substitutions needed to transform one string into the other (function
levenshtein-distance).

The Soerensen-Dice coefficient is a statistic suitable for heterogeneous
data sets and gives less weight to outliers[1].

vas-string-metrics is licensed under the LLGPLv3, except for the unit
tests, which are in the public domain.

[1] https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient#Applications

```

## 2 Systems

The main system appears first, followed by any subsystem dependency.

### 2.1 vas-string-metrics

License

LLGPLv3

Description

Jaro-Winkler and Levenshtein string distance algorithms.

Source

vas-string-metrics.asd (file)

Components

package.lisp (file)

levenshtein.lisp (file)

jaro-winkler.lisp (file)

soerensen-dice.lisp (file)

## 3 Files

Files are sorted by type and then listed depth-first from the systems' component trees.

### 3.1 Lisp

#### 3.1.1 vas-string-metrics.asd

Location

vas-string-metrics.asd

Systems

vas-string-metrics (system)

#### 3.1.2 vas-string-metrics/package.lisp

Parent

vas-string-metrics (system)

Location

package.lisp

Packages

vas-string-metrics

#### 3.1.3 vas-string-metrics/levenshtein.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

levenshtein.lisp

Exported Definitions

levenshtein-distance (function)

normalized-levenshtein-distance (function)

#### 3.1.4 vas-string-metrics/jaro-winkler.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

jaro-winkler.lisp

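Exported Definitions

jaro-distance (function)

jaro-winkler-distance (function)

Internal Definitions

matching-char-list (function)

prefix-length (function)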

#### 3.1.5 vas-string-metrics/soerensen-dice.lisp

Dependency

package.lisp (file)

Parent

vas-string-metrics (system)

Location

soerensen-dice.lisp

Exported Definitions

soerensen-dice-coefficient (function)

Internal Definitions

bigrams (function)

seq-cdr (function)

## 4 Packages

Packages are listed by definition order.

### 4.1 vas-string-metrics

Source

package.lisp (file)

Use List

common-lisp

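Exported Definitions

jaro-distance (function)

jaro-winkler-distance (function)

levenshtein-distance (function)

normalized-levenshtein-distance (function)

soerensen-dice-coefficient (function)

Internal Definitions

bigrams (function)

matching-char-list (function)

prefix-length (function)

seq-cdr (function)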

## 5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.

### 5.1 Exported definitions

#### 5.1.1 Functions

Function: jaro-distance S1 S2

Finds the Jaro distance (measure of similarity) from string s1 to string s2. Returns a value in the range from 0 (no similarity) to 1 (exact match).

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)
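The following is a minimal sketch of the textbook Jaro similarity, included to illustrate what a score between 0 and 1 means here. It is not the library's source: `sketch-jaro` is a hypothetical name, and the match window and transposition counting follow the commonly published definition, which `jaro-distance` may or may not implement in exactly this way.

```lisp
;; Illustrative sketch only. SKETCH-JARO is a hypothetical name and
;; is not part of vas-string-metrics.
(defun sketch-jaro (s1 s2)
  "Return the textbook Jaro similarity of strings S1 and S2 as a float
between 0 (no matching characters) and 1 (identical strings)."
  (let* ((len1 (length s1))
         (len2 (length s2))
         ;; Characters may match only if they are at most WINDOW
         ;; positions apart.
         (window (max 0 (1- (floor (max len1 len2) 2))))
         (matched1 (make-array len1 :initial-element nil))
         (matched2 (make-array len2 :initial-element nil))
         (matches 0)
         (out-of-order 0))
    ;; Pass 1: pair each character of S1 with the first unused equal
    ;; character of S2 inside the window.
    (dotimes (i len1)
      (loop for j from (max 0 (- i window))
              below (min len2 (+ i window 1))
            when (and (not (aref matched2 j))
                      (char= (char s1 i) (char s2 j)))
              do (setf (aref matched1 i) t
                       (aref matched2 j) t)
                 (incf matches)
                 (return)))
    (if (zerop matches)
        0.0
        (progn
          ;; Pass 2: read the matched characters of both strings in
          ;; order and count the positions where they differ.
          (let ((j 0))
            (dotimes (i len1)
              (when (aref matched1 i)
                (loop until (aref matched2 j) do (incf j))
                (unless (char= (char s1 i) (char s2 j))
                  (incf out-of-order))
                (incf j))))
          (let ((m (float matches))
                ;; Each transposition is counted twice in pass 2.
                (transpositions (/ out-of-order 2)))
            (/ (+ (/ m len1) (/ m len2) (/ (- m transpositions) m))
               3))))))
```

Under these assumptions, `(sketch-jaro "MARTHA" "MARHTA")` finds six matching characters and one transposition. Note that this sketch returns 0.0 when either string is empty, which is a choice made for the example rather than documented library behavior.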

Function: jaro-winkler-distance S1 S2

Finds the Jaro-Winkler distance (measure of similarity) from string s1 to string s2. Returns a value in the range from 0 (no similarity) to 1 (exact match).

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)
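As a rough illustration of how the Winkler variant differs from plain Jaro, the sketch below applies the commonly published prefix boost to an already computed Jaro score. The names `sketch-common-prefix-length` and `sketch-winkler-boost`, the prefix cap of 4, and the scaling factor of 0.1 are assumptions taken from the usual description of the metric, not from this library's source; the library's internal `prefix-length` is undocumented here and may behave differently.

```lisp
;; Illustrative sketch only. Both function names are hypothetical and
;; are not part of vas-string-metrics.
(defun sketch-common-prefix-length (s1 s2 &optional (limit 4))
  "Count the leading characters that S1 and S2 share, capped at LIMIT."
  ;; MISMATCH returns NIL when the two sequences are identical.
  (let ((end (or (mismatch s1 s2) (length s1))))
    (min end limit)))

(defun sketch-winkler-boost (jaro-score s1 s2 &optional (scaling 0.1))
  "Raise JARO-SCORE toward 1 in proportion to the prefix shared by S1 and S2."
  (let ((prefix (sketch-common-prefix-length s1 s2)))
    (+ jaro-score (* prefix scaling (- 1 jaro-score)))))
```

With a Jaro score of about 0.944 for "MARTHA" and "MARHTA", which share the prefix "MAR", the boost adds 3 × 0.1 × (1 − 0.944). Some descriptions of the metric apply the boost only above a score threshold; that refinement is omitted here.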

Function: levenshtein-distance S1 S2

Finds the Levenshtein distance (minimum number of edits) from string s1 to string s2.

Package

vas-string-metrics

Source

levenshtein.lisp (file)
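To make "minimum number of edits" concrete, here is a minimal sketch of the standard two-row dynamic-programming formulation. `sketch-levenshtein` is a hypothetical name used only for this example; the library's `levenshtein-distance` may be implemented differently.

```lisp
;; Illustrative sketch only. SKETCH-LEVENSHTEIN is a hypothetical name
;; and is not the library's LEVENSHTEIN-DISTANCE.
(defun sketch-levenshtein (s1 s2)
  "Return the minimum number of single-character insertions, deletions,
or substitutions needed to turn string S1 into string S2."
  (let* ((n (length s2))
         ;; PREV is the previous row of the edit-distance table and
         ;; CURR is the row being filled in.
         (prev (make-array (1+ n)))
         (curr (make-array (1+ n))))
    ;; Row 0: turning "" into the first j characters of S2 takes j insertions.
    (dotimes (j (1+ n))
      (setf (aref prev j) j))
    (dotimes (i (length s1))
      ;; Turning the first i+1 characters of S1 into "" takes i+1 deletions.
      (setf (aref curr 0) (1+ i))
      (dotimes (j n)
        (setf (aref curr (1+ j))
              (min (1+ (aref prev (1+ j)))   ; deletion
                   (1+ (aref curr j))        ; insertion
                   (+ (aref prev j)          ; substitution or match
                      (if (char= (char s1 i) (char s2 j)) 0 1)))))
      ;; The row just filled becomes the previous row.
      (rotatef prev curr))
    (aref prev n)))
```

For example, `(sketch-levenshtein "kitten" "sitting")` counts two substitutions and one insertion.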

Function: normalized-levenshtein-distance S1 S2

Finds the normalized Levenshtein distance (from 0 for no similarity to 1 for exact match) from string s1 to string s2.

Package

vas-string-metrics

Source

levenshtein.lisp (file)
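The manual does not say how the raw edit count is normalized. One common convention, assumed here purely for illustration, divides the distance by the length of the longer string and subtracts the result from 1. `sketch-normalized-score` is a hypothetical helper and takes the raw distance as an argument so that the example stands alone.

```lisp
;; Illustrative sketch only. SKETCH-NORMALIZED-SCORE is a hypothetical
;; name, and the formula is an assumed convention, not necessarily the
;; one used by NORMALIZED-LEVENSHTEIN-DISTANCE.
(defun sketch-normalized-score (distance s1 s2)
  "Map a raw edit DISTANCE between S1 and S2 onto the range 0 to 1,
where 1 means the strings are identical."
  (let ((longest (max (length s1) (length s2))))
    (if (zerop longest)
        1.0   ; two empty strings are treated as identical
        (- 1.0 (/ distance longest)))))
```

Under this convention, an edit distance of 3 between "kitten" and "sitting" (longest length 7) maps to 1 − 3/7.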

Function: soerensen-dice-coefficient STRING1 STRING2

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)
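This function has no docstring in the manual. The sketch below shows the usual bigram formulation of the Sørensen-Dice coefficient, twice the number of shared bigrams divided by the total number of bigrams in both strings. The names `sketch-bigrams` and `sketch-dice` are hypothetical, and treating bigrams as a multiset is an assumption; the library's internal `bigrams` and its exported function may differ in detail.

```lisp
;; Illustrative sketch only. Both function names are hypothetical and
;; are not part of vas-string-metrics.
(defun sketch-bigrams (string)
  "Return the list of adjacent two-character substrings of STRING."
  (loop for i from 0 below (1- (length string))
        collect (subseq string i (+ i 2))))

(defun sketch-dice (s1 s2)
  "Return 2 * shared bigrams / total bigrams for strings S1 and S2."
  (let* ((b1 (sketch-bigrams s1))
         (b2 (sketch-bigrams s2))
         ;; TOTAL is computed before B2 is consumed below.
         (total (+ (length b1) (length b2)))
         (shared 0))
    ;; Count shared bigrams as a multiset: each bigram of S2 can be
    ;; paired with at most one bigram of S1.
    (dolist (b b1)
      (when (member b b2 :test #'string=)
        (incf shared)
        (setf b2 (remove b b2 :test #'string= :count 1))))
    (if (zerop total)
        0.0   ; neither string is long enough to contain a bigram
        (/ (* 2.0 shared) total))))
```

For "night" and "nacht", each string yields four bigrams and only "ht" is shared, so the sketch returns 2 × 1 / 8.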

### 5.2 Internal definitions

#### 5.2.1 Functions

Function: bigrams STRING

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)

Function: matching-char-list S1 S2

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)

Function: prefix-length S1 S2

Package

vas-string-metrics

Source

jaro-winkler.lisp (file)

Function: seq-cdr SEQUENCE

Package

vas-string-metrics

Source

soerensen-dice.lisp (file)

## Appendix A Indexes

### A.1 Concepts
