# The edit-distance Reference Manual

## Table of Contents

Next: , Previous: , Up: (dir)   [Contents][Index]

# The edit-distance Reference Manual

This is the edit-distance Reference Manual, version 1.0.0, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 11:04:12 2018 GMT+0.

Next: , Previous: , Up: Top   [Contents][Index]

# edit-distance

## Using

Computes the edit distance (aka Levenshtein distance) between two sequences. This is a common distance measure.

For example, to compute the distance between two sequences:

``````(edit-distance:distance '(1 2 3 4 5) '(2 3 4 5 6))
``````

To compute the sequence of edit operations needed to convert sequence 1 into sequence two, use the `diff` function

``````(edit-distance:diff '(1 2 3 4 5) '(2 3 4 5 6))
``````

That will return two values a structure, as follows, and the distance.

``````((:DELETION 1 NIL) (:MATCH 2 2) (:MATCH 3 3) (:MATCH 4 4) (:MATCH 5 5) (:INSERTION NIL 6))
2
``````

That structure can be printed more readibly with the `FORMAT-DIFF` function

``````(edit-distance:format-diff *path*)
``````

Or, you can compute the diff and print it readably together by calling `PRINT-DIFF`:

``````(edit-distance:print-diff '(1 2 3 4 5) '(2 3 4 5 6))
``````

which will print a result like this:

``````seq1: 1   2 3 4 5 *** []
seq2: *** 2 3 4 5 6   []
``````

Several options are available to the `FORMAT-DIFF` and `PRINT-DIFF` to print a prefix and suffix for each line. Note that displaying substitutions relys on captialization and so substitutions are not visible for non-alphabetic sequence elements.

Additionally, other equality functions can be used, so this evaluates to a distance of zero:

`````` (edit-distance:distance '("ONE" "TWO" "THREE") '("one" "two"
"three") :test 'string-equal)
``````

Any type of sequence may be used, but for speed the input sequences are copied into new arrays. If speed is a major concern make sure to provide simple vectors as your input sequences.

## Testing

To test with sbcl, run:

``````sbcl --eval "(asdf:load-system 'edit-distance-test)" --eval "(unless (lisp-unit:run-tests :all :edit-distance-tests) (uiop:quit 1))"
``````

cl-edit-distance by Ben Lambert is licensed under a Creative Commons Attribution 4.0 International License.

Next: , Previous: , Up: Top   [Contents][Index]

## 2 Systems

The main system appears first, followed by any subsystem dependency.

Previous: , Up: Systems   [Contents][Index]

### 2.1 edit-distance

Author

Ben Lambert <belambert@mac.com>

License

CC-BY-4.0

Description

Compute edit distance between sequences.

Version

1.0.0

Source

edit-distance.asd (file)

Component

src (module)

Next: , Previous: , Up: Top   [Contents][Index]

## 3 Modules

Modules are listed depth-first from the system components tree.

Previous: , Up: Modules   [Contents][Index]

### 3.1 edit-distance/src

Parent

edit-distance (system)

Location

src/

Components

Next: , Previous: , Up: Top   [Contents][Index]

## 4 Files

Files are sorted by type and then listed depth-first from the systems components trees.

Previous: , Up: Files   [Contents][Index]

### 4.1 Lisp

Next: , Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.1 edit-distance.asd

Location

edit-distance.asd

Systems

edit-distance (system)

Next: , Previous: , Up: Lisp files   [Contents][Index]

Parent

src (module)

Location

src/package.lisp

Packages

#### 4.1.3 edit-distance/src/distance.lisp

Dependency

package.lisp (file)

Parent

src (module)

Location

src/distance.lisp

Internal Definitions

#### 4.1.4 edit-distance/src/interface.lisp

Dependency

distance.lisp (file)

Parent

src (module)

Location

src/interface.lisp

Exported Definitions
Internal Definitions

Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.5 edit-distance/src/print.lisp

Dependency

interface.lisp (file)

Parent

src (module)

Location

src/print.lisp

Internal Definitions

print-differences (function)

Next: , Previous: , Up: Top   [Contents][Index]

## 5 Packages

Packages are listed by definition order.

Previous: , Up: Packages   [Contents][Index]

### 5.1 edit-distance

Source

package.lisp (file)

Use List

common-lisp

Exported Definitions
Internal Definitions

Next: , Previous: , Up: Top   [Contents][Index]

## 6 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.

Next: , Previous: , Up: Definitions   [Contents][Index]

### 6.1 Exported definitions

Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.1 Functions

Function: diff S1 S2 &key TEST
Package
Source

interface.lisp (file)

Function: distance S1 S2 &key TEST
Package
Source

interface.lisp (file)

Function: format-diff PATH &key FILE-STREAM PREFIX1 PREFIX2 SUFFIX1 SUFFIX2

Print a human readable ’diff’ of the two sequences to the given stream (standard out by default). Optionally prefixes and suffixes of each line can be printed for easier identification and analysis.

Package
Source

interface.lisp (file)

Function: insertions-and-deletions SEQ1 SEQ2 &key TEST
Package
Source

interface.lisp (file)

Function: print-diff SEQ1 SEQ2 &key FILE-STREAM TEST PREFIX1 PREFIX2 SUFFIX1 SUFFIX2

Print a human readable ’diff’ of the two sequences to the given stream (standard out by default). Optionally prefixes and suffixes of each line can be printed for easier identification and analysis.

Package
Source

interface.lisp (file)

Previous: , Up: Definitions   [Contents][Index]

### 6.2 Internal definitions

Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.1 Functions

Function: compute-alignment S1 S2 &key TEST
Package
Source

interface.lisp (file)

Function: compute-insertions-and-deletions SEQ1 SEQ2 &key TEST

Return a cons that points to two lists. The first is a list of the insertions; The second is a list of deletions.

Package
Source

interface.lisp (file)

Function: edit-distance S1 S2 &key TEST RETURN-PATH
Package
Source

interface.lisp (file)

Function: get-path-from-bp-table BP WIDTH HEIGHT

Given a back-pointer table, return a representation of the shortest path.

Package
Source

distance.lisp (file)

Function: levenshtein-distance S1 S2 &key TEST RETURN-PATH

Compute the Levenshtein distance between two sequences. If return path is t, returns the return path. O/w just returns the distance.

Package
Source

distance.lisp (file)

Function: levenshtein-distance-fast SEQ1 SEQ2 &key TEST

Just like the previous version, but also return the number of *matches*.
This is pretty fast now... It looks like most of the time taken is in applying the test (e.g. string-equal) some non-negligible part of it is taken by the MAKE-ARRAY calls and GC’ing those...

Package
Source

distance.lisp (file)

Function: print-differences PATH &key FILE-STREAM PREFIX1 PREFIX2 SUFFIX1 SUFFIX2

Given a ’path’ as produced by the above function LEVENSTEIN-DISTANCE function above, print the differences in the format of the ’align’ program.
The prefix and suffix args allow the caller to supply a string to print before, and after each.

Package
Source

print.lisp (file)

Function: sequence-diff SEQ1 SEQ2 &key FILE-STREAM TEST PREFIX1 PREFIX2 SUFFIX1 SUFFIX2

Print a human readable ’diff’ of the two sequences to the given stream (standard out by default). Optionally prefixes and suffixes of each line can be printed for easier identification and analysis.

Package
Source

interface.lisp (file)

Previous: , Up: Top   [Contents][Index]

## Appendix A Indexes

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.1 Concepts

Jump to: E   F   L   M
Jump to: E   F   L   M

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.2 Functions

Jump to: C   D   E   F   G   I   L   P   S
Jump to: C   D   E   F   G   I   L   P   S

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.3 Variables

Previous: , Up: Indexes   [Contents][Index]

### A.4 Data types

Jump to: E   P   S
Jump to: E   P   S