# The cl-string-match Reference Manual

Next: , Previous: , Up: (dir)   [Contents][Index]

# The cl-string-match Reference Manual

This is the cl-string-match Reference Manual, version 2018.1.2, generated automatically by Declt version 2.4 "Will Decker" on Wed Jun 20 11:24:52 2018 GMT+0.

Next: , Previous: , Up: Top   [Contents][Index]

## 1 Introduction

CL-STRING-MATCH aims at providing robust implementations of string matching algorithms. These algorithms are also called "substring search" or "subsequence search" algorithms.

Currently it provides implementations of the following string matching algorithms (see wiki for details):

• Brute-force (also known as naïve algorithm)
• Boyer-Moore (with mismatched character heuristic and good suffix shift)
• Boyer-Moore-Horspool algorithm
• Knuth-Morris-Pratt algorithm
• Rabin-Karp algorithm
• Shift-OR algorithm (single pattern)
• Aho-Corasick algorithm (with finite set of patterns, forward transition and fail function)
• A simple backtracking regular expressions engine re similar to that of the Lua programming language. At the moment it significantly underperforms compared to the CL-PPCRE.

Some string processing algorithms are also implemented:

• Simple (naїve) suffix tree construction algorithm
• Ukkonen's suffix tree construction algorithm

Data structures:

• Prefix trie
• Suffix tree

Utilities:

• Testing whether a string has the given suffix or prefix (starts with or ends with the pattern)

Some algorithms (Brute-force, Boyer-Moore-Horspool) have parametric implementations (code templates) making it possible to declare specific implementations for application-specific custom data types and data structures.

This library is routinely tested on Steel Bank CL, Clozure CL, Embeddable CL and Armed Bear CL. Chances are really high that it will work on other platforms without problems (check its status on CL-TEST-GRID).

Check the API Reference for more details.

# RATIONALE

Since the standard `search` function is working fine, one might ask: why do we need a yet another implementation? Answer is simple: advanced algorithms offer different benefits compared to the standard implementation that is based on the brute-force algorithm.

Benchmarks show that depending on environment and pattern of application, a Boyer-Moore-Horspool algorithm implementation can outperform standard search function in SBCL by almost 18 times! Check the code in the `bench` folder for further details.

# USAGE

CL-STRING-MATCH is supported by Quicklisp and is known by its system name:

``````(ql:quickload :cl-string-match)
``````

CL-STRING-MATCH exports functions in `cl-string-match` package (that is also nicknamed as `sm`).

Shortcut functions search given pattern `pat` in text `txt`. They are usually much slower (because they build index structures every time they are called) but are easier to use:

• `string-contains-brute` pat txt — Brute-force
• `string-contains-bm` pat txt — Boyer-Moore
• `string-contains-bmh` pat txt — Boyer-Moore-Horspool
• `string-contains-kmp` pat txt — Knuth-Morris-Pratt
• `string-contains-ac` pat txt — Aho-Corasick
• `string-contains-rk` pat txt — Rabin-Karp

A more robust approach is to use pre-calculated index data that is processed by a pair of `initialize` and `search` functions:

• `initialize-bm` pat and `search-bm` bm txt
• `initialize-bmh` pat and `search-bmh` bm txt
• `initialize-bmh8` pat and `search-bmh8` bm txt
• `initialize-rk` pat and `search-rk` rk txt
• `initialize-kmp` pat and `search-kmp` kmp txt
• `initialize-ac` pat and `search-ac` ac txt. `initialize-ac` can accept a list of patterns that are compiled into a trie.

Brute-force algorithm does not use pre-calculated data and has no "initialize" function.

Boyer-Moore-Horspool implementation (the `-BMH` and `-BMH8` functions) also accepts `:start2` and `:end2` keywords for the "search" and "contains" functions.

Following example looks for a given substring pat in a given line of text txt using Boyer-Moore-Horspool algorithm implementation:

``````(let ((idx (initialize-bmh "abc")))
(search-bmh idx "ababcfbgsldkj"))
``````

Counting all matches of a given pattern in a string:

``````(loop with str = "____abc____abc____ab"
with pat = "abc"
with idx = (sm:initialize-bmh8 pat)
with z = 0 with s = 0 while s do
(when (setf s (sm:search-bmh8 idx str :start2 s))
(incf z) (incf s (length pat)))
finally (return z))
``````

It should be noted that Boyer-Moore-Horspool (`bmh`) implementation can offer an order of magnitude boost to performance compared to the standard `search` function.

However, some implementations create a "jump table" that can be the size of the alphabet (over 1M CHAR-CODE-LIMIT on implementations supporting Unicode) and thus consume a significant chunk of memory. There are different solutions to this problem and at the moment a version for the ASCII strings is offered: `initialize-bmh8` pat and `search-bmh8` bm txt as well as `string-contains-bmh8` pat txt work for strings with characters inside the 256 char code limit.

# CONTRIB

This project also contains code that is not directly invloved with the pattern search algorithms but nevertheless might be found useful for text handling/processing. Check the contrib folder in the repository for more details. Currently it contains:

• `ascii-strings.lisp` aims to provide single-byte strings functionality for Unicode-enabled Common Lisp implementations. Another goal is to reduce memory footprint and boost performance of the string-processing tasks, i.e. `read-line`.

• `simple-scanf` implements a subset of the original POSIX standard `scanf(3)` function features.

# TODO

The project still lacks some important features and is under constant development. Any kind of contributions or feedback are welcome.

Please take a look at the list of open issues or the Project Roadmap.

Next: , Previous: , Up: Top   [Contents][Index]

## 2 Systems

The main system appears first, followed by any subsystem dependency.

Next: , Previous: , Up: Systems   [Contents][Index]

### 2.1 cl-string-match

Author

Vityok https://bitbucket.org/vityok

BSD

Description

Provides implementations of the standard sub-string search (string matching) algorithms: brute-force, Boyer-Moore, Rabin-Karp, etc.

Version

2018.1.2

Dependencies
• alexandria
• ascii-strings (system)
• yacc
• jpl-queues
• iterate
• mgl-pax
Source

cl-string-match.asd (file)

Component

src (module)

Previous: , Up: Systems   [Contents][Index]

### 2.2 ascii-strings

Author

Vityok https://bitbucket.org/vityok

BSD

Description

Operations on ASCII strings. Essentially this can be any kind of single-byte encoded strings.

Version

2015.9.16

Dependencies
• alexandria
• babel
Source

ascii-strings.asd (file)

Component

contrib (module)

Next: , Previous: , Up: Top   [Contents][Index]

## 3 Modules

Modules are listed depth-first from the system components tree.

Next: , Previous: , Up: Modules   [Contents][Index]

### 3.1 cl-string-match/src

Parent

cl-string-match (system)

Location

src/

Components

Previous: , Up: Modules   [Contents][Index]

### 3.2 ascii-strings/contrib

Parent

ascii-strings (system)

Location

contrib/

Component

ascii-strings.lisp (file)

Next: , Previous: , Up: Top   [Contents][Index]

## 4 Files

Files are sorted by type and then listed depth-first from the systems components trees.

Previous: , Up: Files   [Contents][Index]

### 4.1 Lisp

Next: , Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.1 cl-string-match.asd

Location

cl-string-match.asd

Systems

cl-string-match (system)

Packages

Next: , Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.2 ascii-strings.asd

Location

ascii-strings.asd

Systems

ascii-strings (system)

Packages

Next: , Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.3 cl-string-match/src/package.lisp

Parent

src (module)

Location

src/package.lisp

Packages
Internal Definitions

#### 4.1.4 cl-string-match/src/utils.lisp

Dependency

package.lisp (file)

Parent

src (module)

Location

src/utils.lisp

Exported Definitions

#### 4.1.5 cl-string-match/src/brute-force.lisp

Dependency

utils.lisp (file)

Parent

src (module)

Location

src/brute-force.lisp

Exported Definitions

#### 4.1.6 cl-string-match/src/boyer-moore.lisp

Dependency

brute-force.lisp (file)

Parent

src (module)

Location

src/boyer-moore.lisp

Exported Definitions
Internal Definitions

#### 4.1.7 cl-string-match/src/boyer-moore-horspool.lisp

Dependency

boyer-moore.lisp (file)

Parent

src (module)

Location

src/boyer-moore-horspool.lisp

Exported Definitions
Internal Definitions

#### 4.1.8 cl-string-match/src/rabin-karp.lisp

Dependency
Parent

src (module)

Location

src/rabin-karp.lisp

Exported Definitions
Internal Definitions

#### 4.1.9 cl-string-match/src/knuth-morris-pratt.lisp

Dependency

rabin-karp.lisp (file)

Parent

src (module)

Location

src/knuth-morris-pratt.lisp

Exported Definitions
Internal Definitions

#### 4.1.10 cl-string-match/src/shift-or.lisp

Dependency

knuth-morris-pratt.lisp (file)

Parent

src (module)

Location

src/shift-or.lisp

Exported Definitions
Internal Definitions

#### 4.1.11 cl-string-match/src/aho-corasick.lisp

Dependency

shift-or.lisp (file)

Parent

src (module)

Location

src/aho-corasick.lisp

Exported Definitions
Internal Definitions

#### 4.1.12 cl-string-match/src/compiled-aho-corasick.lisp

Dependency

aho-corasick.lisp (file)

Parent

src (module)

Location

src/compiled-aho-corasick.lisp

Exported Definitions
Internal Definitions

make-lambda-ac (function)

#### 4.1.13 cl-string-match/src/suffix-tree.lisp

Dependency
Parent

src (module)

Location

src/suffix-tree.lisp

Exported Definitions
Internal Definitions

#### 4.1.14 cl-string-match/src/pre.lisp

Dependency

suffix-tree.lisp (file)

Parent

src (module)

Location

src/pre.lisp

Exported Definitions
Internal Definitions

#### 4.1.15 cl-string-match/src/doc.lisp

Dependency

pre.lisp (file)

Parent

src (module)

Location

src/doc.lisp

Exported Definitions
Internal Definitions

Previous: , Up: Lisp files   [Contents][Index]

#### 4.1.16 ascii-strings/contrib/ascii-strings.lisp

Parent

contrib (module)

Location

contrib/ascii-strings.lisp

Packages
Exported Definitions
Internal Definitions

Next: , Previous: , Up: Top   [Contents][Index]

## 5 Packages

Packages are listed by definition order.

Next: , Previous: , Up: Packages   [Contents][Index]

### 5.1 clstringmatch.system

Source

cl-string-match.asd

Use List
• asdf/interface
• common-lisp

Next: , Previous: , Up: Packages   [Contents][Index]

### 5.2 cl-string-match

String matching functions and methods.

Source

package.lisp (file)

Nickname

sm

Use List
Exported Definitions
Internal Definitions

Next: , Previous: , Up: Packages   [Contents][Index]

### 5.3 ascii-strings.system

Source

ascii-strings.asd

Use List
• asdf/interface
• common-lisp

Previous: , Up: Packages   [Contents][Index]

### 5.4 ascii-strings

This library implements functions and data types
similar to the standard Common Lisp functions and types but prefixed with ub- to avoid naming conflicts.

This package aims at providing single-byte strings functionality for Unicode-enabled Common Lisp implementations. Another aim is to reduce memory footprint and boost performance of the string-processing algorithms.

There are similar libraries/packages with slight differences. Check, for instance, com.informatimago.common-lisp.cesarum.ascii.

Please note, that while ASCII uses 7-bits per character, this library works with octets, using 8-bits per character.

Source

ascii-strings.lisp (file)

Nickname

ascii

Use List
• alexandria.0.dev
• common-lisp
Used By List
Exported Definitions
Internal Definitions

Next: , Previous: , Up: Top   [Contents][Index]

## 6 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.

Next: , Previous: , Up: Definitions   [Contents][Index]

### 6.1 Exported definitions

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.1 Constants

Constant: +infinity+
Package
Source

suffix-tree.lisp (file)

Constant: ub-char-code-limit

Maximum number of UB-CHARs is limited by a single byte. That is: just 256 characters are possible.

Package
Source

ascii-strings.lisp (file)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.2 Special variables

Special Variable: @aho-corasick-section
Package
Source

aho-corasick.lisp (file)

Special Variable: @boyer-moore-horspool-section
Package
Source
Special Variable: @boyer-moore-section
Package
Source

boyer-moore.lisp (file)

Special Variable: @brute-force-section
Package
Source

brute-force.lisp (file)

Special Variable: @cl-string-match-manual
Package
Source

doc.lisp (file)

Special Variable: @knuth-morris-pratt-section
Package
Source

knuth-morris-pratt.lisp (file)

Special Variable: @multi-pattern-search
Package
Source

doc.lisp (file)

Special Variable: @pre-basic-matching-section
Package
Source

pre.lisp (file)

Special Variable: @pre-compiling-patterns-section
Package
Source

pre.lisp (file)

Special Variable: @pre-groups-section
Package
Source

pre.lisp (file)

Special Variable: @pre-pattern-replacing-section
Package
Source

pre.lisp (file)

Special Variable: @pre-pattern-scanning-section
Package
Source

pre.lisp (file)

Special Variable: @pre-pattern-splitting-section
Package
Source

pre.lisp (file)

Special Variable: @pre-regexp-section
Package
Source

pre.lisp (file)

Special Variable: @pre-with-re-match-section
Package
Source

pre.lisp (file)

Special Variable: @rabin-karp-section
Package
Source

rabin-karp.lisp (file)

Special Variable: @regexp-pattern-search
Package
Source

doc.lisp (file)

Special Variable: @shift-or-section
Package
Source

shift-or.lisp (file)

Special Variable: @single-pattern-search
Package
Source

doc.lisp (file)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.3 Macros

Macro: define-bmh-matcher VARIANT-TAG &key KEY-GET KEY-CODE KEY-CMP= EMPTY-PAT ALPHABET-SIZE DATA-TYPE
Package
Source
Macro: define-brute-matcher VARIANT-TAG &key KEY-GET KEY-CMP/= DATA-TYPE
Package
Source

brute-force.lisp (file)

Macro: with-re (RE PATTERN) &body BODY

Compile pattern if it’s not a RE object and execute body.

Package
Source

pre.lisp (file)

Macro: with-re-match (MATCH MATCH-EXPR &key NO-MATCH) &body BODY

Intern match symbols to execute a body.

Package
Source

pre.lisp (file)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.4 Functions

Function: ascii-char-code CHAR

This is like the standard CHAR-CODE but returns an octet.

Package
Source

ascii-strings.lisp (file)

Function: bmh8-p OBJECT
Package
Source
Function: bmh8-pat INSTANCE
Function: (setf bmh8-pat) VALUE INSTANCE
Package
Source
Function: build-suffix-tree-simple STR

Build a Suffix tree for the given string STR using naive (brute-force) algorithm.

Naive algorithm takes O(m²) time to build a suffix tree where m is the
given string length.

Essentially it operates as follows:

while suffices remain:
add next shortest suffix to the tree

Algorithm first ads a suffix Str[1..m] (entire string) to the
tree. Then it proceeds adding suffices Str[i..m] (i=2..m) to the tree.

Package
Source

suffix-tree.lisp (file)

Function: build-suffix-tree-ukkonen STR

Build a Suffix tree for the given string STR using Ukkonen’s algorithm.

Ukkonen’s algorithm takes O(m) time to build a suffix tree where m is the given string length.

Ukkonen’s algorithm is an on-line algorithm and can operate on a stream of characters, adding one character at a time.

Package
Source

suffix-tree.lisp (file)

Function: compile-re PATTERN

Create a regular expression from a pattern string.

Package
Source

pre.lisp (file)

Function: empty-trie ()

Creates a new instance and returns an empty trie.

Package
Source

aho-corasick.lisp (file)

Function: find-re PATTERN S &key START END OFFSET ALL

Find a regexp pattern match somewhere in a string. Run from an offset.

Package
Source

pre.lisp (file)

Function: initialize-ac ()

Returns a Trie that is used to look for the given patterns in the text. It can deal either with a single pattern or a list of patterns.

Package
Source

aho-corasick.lisp (file)

Function: initialize-bm ()
Package
Source

boyer-moore.lisp (file)

Function: initialize-bmh ()

Preprocess the needle.
Initialize the table to default value.

Package
Source
Function: initialize-bmh8 ()

Preprocess the needle.
Initialize the table to default value.

Package
Source
Function: initialize-kmp ()
Package
Source

knuth-morris-pratt.lisp (file)

Function: initialize-rk ()
Package
Source

rabin-karp.lisp (file)

Function: initialize-sor ()
Package
Source

shift-or.lisp (file)

Function: initialize-tabac ()

Returns a tabular FDA that is used to look for the given patterns in the text. It can deal either with a single pattern or a list of patterns.

Package
Source

aho-corasick.lisp (file)

Function: make-substitution-table SUBST
Package
Source

ascii-strings.lisp (file)

Function: make-suffix-node &key (START START) (END END) (PARENT PARENT) (CHILDREN CHILDREN) (ID ID)
Package
Source

suffix-tree.lisp (file)

Function: make-suffix-tree &key (STR STR) (ROOT ROOT) (NODES-COUNT NODES-COUNT)
Package
Source

suffix-tree.lisp (file)

Function: make-ub-buffer SIZE

Allocate an ub-buffer of the given size.

Package
Source

ascii-strings.lisp (file)

Function: make-ub-line-reader &key (STREAM STREAM) (BUFFER BUFFER) (FILL FILL) (POS POS) (EOF EOF)
Package
Source

ascii-strings.lisp (file)

Function: make-ub-string SIZE
Package
Source

ascii-strings.lisp (file)

Function: make-ukk-node &key (START START) (END END) (PARENT PARENT) (CHILDREN CHILDREN) (ID ID) (SUFFIX SUFFIX)
Package
Source

suffix-tree.lisp (file)

Function: match-re PATTERN S &key START END OFFSET EXACT

Test a pattern re against a string.

Package
Source

pre.lisp (file)

Function: octets-to-ub VEC

Convert a simple vector into an octets vector (ub-string).

Package
Source

ascii-strings.lisp (file)

Function: prefixed-with TXT PREF

Returns T if the given string ‘TXT‘ is prefixed (starts with) the given prefix ‘PREF‘.

Package
Source

utils.lisp (file)

Function: replace-re PATTERN WITH S &key START END OFFSET ALL

Replace patterns found within a string with a new value.

Package
Source

pre.lisp (file)

Function: search-ac ()

Looks for patterns that are indexed in the given trie and returns two values: start position of the first matching pattern and its index.

Package
Source

aho-corasick.lisp (file)

Function: search-bm ()

Search for pattern bm in txt.

Package
Source

boyer-moore.lisp (file)

Function: search-bmh ()

Search for pattern defined in the IDX in TXT.

Package
Source
Function: search-bmh8 ()

Search for pattern defined in the IDX in TXT.

Package
Source
Function: search-compiled-ac SEARCH-FUNCTION TXT
Package
Source
Function: search-kmp ()
Package
Source

knuth-morris-pratt.lisp (file)

Function: search-rk ()

Implementation of the Rabin-Karp substring search algorithm.

Package
Source

rabin-karp.lisp (file)

Function: search-sor ()
Package
Source

shift-or.lisp (file)

Function: search-tabac ()
Package
Source

aho-corasick.lisp (file)

Function: split-re PATTERN S &key START END OFFSET ALL COALESCE-SEPS

Split a string into one or more strings by pattern match.

Package
Source

pre.lisp (file)

Function: string-contains-ac ()

Looks for the given pattern in the text and returns index of the first occurence.

Package
Source

aho-corasick.lisp (file)

Function: string-contains-bm ()
Package
Source

boyer-moore.lisp (file)

Function: string-contains-bmh ()
Package
Source
Function: string-contains-bmh8 ()
Package
Source
Function: string-contains-brute ()

A Brute-force substring search implementation.

Brute-force substring search requires O(N x M) character compares to search for a pattern of length M in a text of length N, in the worst case.

Algorithm described in: Chapter 5, p. 760 in
’Algorithms’, Robert Sedgewick and Kevin Wayne. 4th

Package
Source

brute-force.lisp (file)

Function: string-contains-brute-ub ()

A Brute-force substring search implementation.

Brute-force substring search requires O(N x M) character compares to search for a pattern of length M in a text of length N, in the worst case.

Algorithm described in: Chapter 5, p. 760 in
’Algorithms’, Robert Sedgewick and Kevin Wayne. 4th

Package
Source

brute-force.lisp (file)

Function: string-contains-kmp ()
Package
Source

knuth-morris-pratt.lisp (file)

Function: string-contains-rk ()
Package
Source

rabin-karp.lisp (file)

Function: string-contains-sor ()
Package
Source

shift-or.lisp (file)

Function: string-contains-tabac ()

Looks for the given pattern in the text and returns index of the first occurence.

Package
Source

aho-corasick.lisp (file)

Function: string-to-ub STR

Convert a standard Lisp String into an octets vector.

Package
Source

ascii-strings.lisp (file)

Function: suffix-node-add-child TREE NODE START END

Adds a child node to the given NODE. Depending on the type of the given node creates either Ukkonens suffix tree or simple suffix tree node.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-children INSTANCE
Function: (setf suffix-node-children) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-end INSTANCE
Function: (setf suffix-node-end) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-equals NODE-A NODE-B
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-leafp NODE

Retruns T is the given node is a leaf (has no children), return NIL otherwise.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-map-over-children NODE FUNCTION

Applies the given function to every child of the given node.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-start INSTANCE
Function: (setf suffix-node-start) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-str TREE NODE

Returns the string that corresponds to the edge pointing to this node.

Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-build-from-sexp STR SEXP

Build the suffix tree from the given sexp representation. Sexp is in form (begin end (children)).

Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-char TREE I
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-equals TREE-A TREE-B

Checks all the nodes in the given trees and returns T if trees are equal or NIL otherwise.

Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-root INSTANCE
Function: (setf suffix-tree-root) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-str INSTANCE
Function: (setf suffix-tree-str) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-walk TREE FUNCTION

Applies the given FUNCTION to every node in the given TREE.

Package
Source

suffix-tree.lisp (file)

Function: suffixed-with TXT SUFF

Returns T if the given string ‘TXT‘ is suffixed (ends with) the given suffix ‘SUFF‘.

Package
Source

utils.lisp (file)

Function: trie->compiled-ac TRIE

Returns a compiled function for the given Aho-Corasick trie.

Package
Source
Function: trie->tabular-ac TRIE

Given an Aho-Corasick trie and transforms it into a table-based DFA for the Aho-Corasick automata.

Package
Source

aho-corasick.lisp (file)

Starting at the root, follow the path labeled by chars of Pi

If the path ends before Pi, continue it by adding new edges and nodes for the remaining characters of Pi.

Store identifier i of Pi at the terminal node of the path.

Package
Source

aho-corasick.lisp (file)

Function: trie-build ()

Builds a Trie based on the given list of patterns.

Package
Source

aho-corasick.lisp (file)

Function: trie-contains ()

Returns T if the given TRIE contains the given string S.

Package
Source

aho-corasick.lisp (file)

Function: trie-contains-prefix ()

Checks if the given TRIE contains some prefix of the given string S.

Returns the length of the matched prefix.

Package
Source

aho-corasick.lisp (file)

Function: trie-pprint TRIE &key PADDING ROOT STREAM

Traverse the given trie and pretty-print it to the given stream.

ROOT is the root node (optional).

Package
Source

aho-corasick.lisp (file)

Function: trie-traverse-bfo TRIE HANDLER

Traverse the trie in the Breadth-First-Order and call the given handler function on each node..

Package
Source

aho-corasick.lisp (file)

Function: trie-traverse-dfo TRIE HANDLER

Traverse the trie in the Depth-First-Order and call the given handler function on each node.

Package
Source

aho-corasick.lisp (file)

Function: ub-alpha-char-p C
Package
Source

ascii-strings.lisp (file)

Function: ub-alphanumericp C

Returns true if character is an alphabetic character or a numeric character; otherwise, returns false.

Package
Source

ascii-strings.lisp (file)

Function: ub-both-case-p ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char STRING INDEX
Package
Source

ascii-strings.lisp (file)

Function: ub-char-code CHAR

Returns a code of the ub-char. Since ub-char is a number (an octet), it is essentially an identity function.

Package
Source

ascii-strings.lisp (file)

Function: ub-char-downcase C
Package
Source

ascii-strings.lisp (file)

Function: ub-char-equal ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-greaterp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-int CHAR
Package
Source

ascii-strings.lisp (file)

Function: ub-char-lessp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-name ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-not-equal ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-not-greaterp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-not-lessp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-char-upcase C
Package
Source

ascii-strings.lisp (file)

Function: ub-char/= C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-char< C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-char<= C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-char= C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-char> C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-char>= C1 C2
Package
Source

ascii-strings.lisp (file)

Function: ub-code-char ()
Package
Source

ascii-strings.lisp (file)

Function: ub-digit-char ()
Package
Source

ascii-strings.lisp (file)

Function: ub-digit-char-p C &optional R

Tests whether char is a digit in the specified radix (i.e., with a weight less than radix). If it is a digit in that radix, its weight is returned as an integer; otherwise nil is returned.

Package
Source

ascii-strings.lisp (file)

Function: ub-graphic-char-p ()
Package
Source

ascii-strings.lisp (file)

Mimics a standard close function: closes the underlying stream and resets the reader to its initial state.

Package
Source

ascii-strings.lisp (file)

Returns the current position within the stream according to the amount of information really read.

When the buffer caches more information than was really read by one of UB-READ-LINE function the standard FILE-POSITION function will return position of the buffer that is larger than the position that was read by the user.

Returned number can be used by the standard FILE-POSITION function to adjust the position within a stream.

When optional argument POSITION is supplied, the file position is adjusted accordingly in the underlying stream. The buffer is flushed.

Package
Source

ascii-strings.lisp (file)

Function: ub-lower-case-p C
Package
Source

ascii-strings.lisp (file)

Function: ub-name-char ()
Package
Source

ascii-strings.lisp (file)

Function: ub-nstring-capitalize ()
Package
Source

ascii-strings.lisp (file)

Function: ub-nstring-downcase ()
Package
Source

ascii-strings.lisp (file)

Function: ub-nstring-upcase ()
Package
Source

ascii-strings.lisp (file)

Reads data into a pre-allocated buffer in the reader structure and returns an array displaced to the contents of the next line in that buffer.

Package
Source

ascii-strings.lisp (file)

Reads data into the pre-allocated buffer in the READER structure and returns two values: start and end positions of the line within the buffer that can be used to extract this line contents from the buffer.

Please note, that unlike the standard read-line or the liberal-read-line by jasonmelbye this function works with the Unix-type of lines - sequence of characters delimited by the Newline symbol.

Package
Source

ascii-strings.lisp (file)

Reads data into a pre-allocated buffer in the reader structure and returns a standard Lisp string.

Package
Source

ascii-strings.lisp (file)

Function: ub-standard-char-p ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-capitalize ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-downcase ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-equal ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-greaterp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-left-trim ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-lessp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-not-equal ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-not-greaterp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-not-lessp ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-right-trim ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-trim ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string-upcase ()
Package
Source

ascii-strings.lisp (file)

Function: ub-string/= STRING1 STRING2 &key START1 END1 START2 END2

Returns true if the supplied substrings are different; otherwise it is false.

Package
Source

ascii-strings.lisp (file)

Function: ub-string< STRING1 STRING2 &key START1 END1 START2 END2

Returns true if substring1 is less than substring2; otherwise it is false.

Package
Source

ascii-strings.lisp (file)

Function: ub-string<= STRING1 STRING2 &key START1 END1 START2 END2

Returns true if substring1 is less than or equal to substring2; otherwise it is false.

Package
Source

ascii-strings.lisp (file)

Function: ub-string= STRING1 STRING2 &key START1 END1 START2 END2

Returns true if the supplied substrings are of the same length and contain the same characters in corresponding positions; otherwise it returns false.

Package
Source

ascii-strings.lisp (file)

Function: ub-string> STRING1 STRING2 &key START1 END1 START2 END2

Returns true if substring1 is greater than substring2; otherwise it is false.

Package
Source

ascii-strings.lisp (file)

Function: ub-string>= STRING1 STRING2 &key START1 END1 START2 END2

Returns true if substring1 is greater than or equal to substring2; otherwise it is false.

Package
Source

ascii-strings.lisp (file)

Function: ub-subseq SEQUENCE START &optional END

Returns either the SEQUENCE itself, or a dispaced array to the SEQUENCE. This is meant as a memory-efficient replacement for the ordinary SUBSEQ that allocates a new sequence.

Package
Source

ascii-strings.lisp (file)

Function: ub-to-string USTR &key START END SUBST

Converts either an UB-STRING or UB-BUFFER into a standard Common Lisp string.

START, END the start and end offsets within the given USTR to translate into a standard string.

Package
Source

ascii-strings.lisp (file)

Function: ub-upper-case-p C
Package
Source

ascii-strings.lisp (file)

Function: ukk-node-suffix INSTANCE
Function: (setf ukk-node-suffix) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.5 Generic functions

Generic Function: match-groups OBJECT
Package
Methods
Method: match-groups (RE-MATCH re-match)

Source

pre.lisp (file)

Generic Function: match-pos-end OBJECT
Package
Methods
Method: match-pos-end (RE-MATCH re-match)

Source

pre.lisp (file)

Generic Function: match-pos-start OBJECT
Package
Methods
Method: match-pos-start (RE-MATCH re-match)

Source

pre.lisp (file)

Generic Function: match-string OBJECT
Package
Methods
Method: match-string (RE-MATCH re-match)

Source

pre.lisp (file)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.6 Structures

Structure: bmh8 ()
Package
Source
Direct superclasses

structure-object (structure)

Direct slots
Type

(simple-array fixnum (*))

Initform

#()

Writers

Slot: pat
Type

simple-string

Initform

""

bmh8-pat (function)

Writers

(setf bmh8-pat) (function)

Slot: pat-len
Type

fixnum

Initform

0

bmh8-pat-len (function)

Writers

(setf bmh8-pat-len) (function)

Structure: suffix-node ()

Documentation

Package
Source

suffix-tree.lisp (file)

Direct superclasses

structure-object (structure)

Direct subclasses

ukk-node (structure)

Direct methods

print-object (method)

Direct slots
Slot: start
Type

fixnum

Initform

0

suffix-node-start (function)

Writers

(setf suffix-node-start) (function)

Slot: end
Type

fixnum

Initform

0

suffix-node-end (function)

Writers

(setf suffix-node-end) (function)

Slot: parent

suffix-node-parent (function)

Writers

(setf suffix-node-parent) (function)

Slot: children

suffix-node-children (function)

Writers

(setf suffix-node-children) (function)

Slot: id
Type

fixnum

Initform

0

suffix-node-id (function)

Writers

(setf suffix-node-id) (function)

Structure: suffix-tree ()

ROOT is a SUFFIX-NODE with default values. It is intended to keep references to its children. ROOT is just a simple placeholder for the actual nodes as its children.

Package
Source

suffix-tree.lisp (file)

Direct superclasses

structure-object (structure)

Direct methods

print-object (method)

Direct slots
Slot: str
Type

simple-string

Initform

""

suffix-tree-str (function)

Writers

(setf suffix-tree-str) (function)

Slot: root

suffix-tree-root (function)

Writers

(setf suffix-tree-root) (function)

Slot: nodes-count
Type

fixnum

Initform

0

suffix-tree-nodes-count (function)

Writers

(setf suffix-tree-nodes-count) (function)

Structure: trie-node ()

Each node of a trie contains a list of child nodes, a label (the letter) and a mark (some value attributed to the matching string).

Trie root node is like all other nodes but its ‘ID‘ is used as an increment to create ids for new nodes.

Slots:

* ‘id‘ - unique node identifier, root node has the largest id.
* ‘children‘ - a hash table with labels as keys, ‘trie-node‘ as values.
* ‘mark‘ - output function, when not null marks the last character of a keyword and is returned as the search result.
* ‘fail‘ - fail transition to another node.
* ‘depth‘ - number of nodes from the root node to this node.

Package
Source

aho-corasick.lisp (file)

Direct superclasses

structure-object (structure)

Direct methods

print-object (method)

Direct slots
Slot: id
Type

fixnum

Initform

0

trie-node-id (function)

Writers

(setf trie-node-id) (function)

Slot: children

trie-node-children (function)

Writers

(setf trie-node-children) (function)

Slot: label

trie-node-label (function)

Writers

(setf trie-node-label) (function)

Slot: mark

trie-node-mark (function)

Writers

(setf trie-node-mark) (function)

Slot: fail
Type

(or null cl-string-match:trie-node)

trie-node-fail (function)

Writers

(setf trie-node-fail) (function)

Slot: depth
Type

fixnum

Initform

0

trie-node-depth (function)

Writers

(setf trie-node-depth) (function)

Structure: ukk-node ()

Ukkonen’s algorithm relies on the suffix link technique. Some other algorithms might also rely on it but the naive suffix tree algorithm does not require it.

This structure extends the standard suffix tree node structure with the suffix link slot.

Package
Source

suffix-tree.lisp (file)

Direct superclasses

suffix-node (structure)

Direct slots
Slot: suffix

ukk-node-suffix (function)

Writers

(setf ukk-node-suffix) (function)

Next: , Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.7 Classes

Class: re ()

Regular expression.

Package
Source

pre.lisp (file)

Direct superclasses

standard-object (class)

Direct methods
Direct slots
Slot: pattern
Initargs

:pattern

re-pattern (generic function)

Slot: expr
Initargs

:expression

re-expression (generic function)

Class: re-match ()

Matched pattern.

Package
Source

pre.lisp (file)

Direct superclasses

standard-object (class)

Direct methods
Direct slots
Slot: match
Initargs

:match

match-string (generic function)

Slot: groups
Initargs

:groups

match-groups (generic function)

Slot: start-pos
Initargs

:start-pos

match-pos-start (generic function)

Slot: end-pos
Initargs

:end-pos

match-pos-end (generic function)

Previous: , Up: Exported definitions   [Contents][Index]

#### 6.1.8 Types

Type: ub-buffer ()

Buffer of octets is a simple array to allow compiler optimizations.

Package
Source

ascii-strings.lisp (file)

Type: ub-char ()

An octet. A number equal to the char code.

Package
Source

ascii-strings.lisp (file)

Type: ub-string ()

Strings must be vectors to allow for fill pointers and displacement. However, SBCL does a better job on optimizations when it deals with simple arrays.

Package
Source

ascii-strings.lisp (file)

Previous: , Up: Definitions   [Contents][Index]

### 6.2 Internal definitions

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.1 Constants

Constant: +ac-alphabet+
Package
Source

aho-corasick.lisp (file)

Constant: +alph-size+
Package
Source

rabin-karp.lisp (file)

Constant: +big-prime+
Package
Source

rabin-karp.lisp (file)

Constant: +newline+
Package
Source

ascii-strings.lisp (file)

Constant: +sor-alphabet+

Size of the alphabet for the Shift-OR algorithm implementation.

Package
Source

shift-or.lisp (file)

Defines the size of the buffer used by the line reader in bytes.

Package
Source

ascii-strings.lisp (file)

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.2 Special variables

Special Variable: *default-substitution-table*

Substitution table used to convert ub-strings to the standard strings by default. Since character with code 0 is not very welcome in the world of C, we are converting it to an ordinary character.

Package
Source

ascii-strings.lisp (file)

Special Variable: *expression-parser*
Package
Source

pre.lisp (file)

Special Variable: *set-parser*
Package
Source

pre.lisp (file)

Special Variable: *standard-debug-settings*

The standard debug settings to be used in functions under development.

Package
Source

package.lisp (file)

Special Variable: *standard-optimize-settings*

The standard optimize settings used by most declaration expressions. Tuned for best performance by default, but when the SM-DEBUG-ENABLED keyword is present in the *FEATURES* list, makes debug and safety its priority at the expense of everything else.

Package
Source

package.lisp (file)

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.3 Macros

Macro: map-trie-children (PARENT CHILD) &body BODY

Perform the given BODY on the children of the given PARENT. The child node is bound to the CHILD variable.

Example:

(map-trie-children (node child)
(push child stack))

Package
Source

aho-corasick.lisp (file)

Macro: ub-string<>=*-body LESSP EQUALP

Makes a body of the string comparison functions. Borrowed from the string.lisp file of the SBCL sources.

Package
Source

ascii-strings.lisp (file)

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.4 Functions

Function: %ub-string-compare STRING1 START1 END1 STRING2 START2 END2
Package
Source

ascii-strings.lisp (file)

Adds a suffix of a string STR designated by its START and END to the suffix tree TREE.

Package
Source

suffix-tree.lisp (file)

Package
Source

boyer-moore.lisp (file)

Function: bm-good-suffix INSTANCE
Function: (setf bm-good-suffix) VALUE INSTANCE
Package
Source

boyer-moore.lisp (file)

Function: bm-p OBJECT
Package
Source

boyer-moore.lisp (file)

Function: bm-pat INSTANCE
Function: (setf bm-pat) VALUE INSTANCE
Package
Source

boyer-moore.lisp (file)

Function: bm-suffixes PAT SUFFIXES
Package
Source

boyer-moore.lisp (file)

Package
Source
Function: bmh-p OBJECT
Package
Source
Function: bmh-pat INSTANCE
Function: (setf bmh-pat) VALUE INSTANCE
Package
Source
Function: bmh-pat-len INSTANCE
Function: (setf bmh-pat-len) VALUE INSTANCE
Package
Source
Package
Source
Function: bmh8-pat-len INSTANCE
Function: (setf bmh8-pat-len) VALUE INSTANCE
Package
Source
Function: branch TREE NODE L R C

Function branch is used to test if a position is the end point and turn the implicit node to explicit node if necessary. Because sentinel node is not used, the special case is handled in the first if-clause.

Package
Source

suffix-tree.lisp (file)

Function: build-suffix-tree-mccreight STR

Build a Suffix tree for the given string STR using McCreight’s algorithm.

McCreight’s algorithm takes O(n) time to build a suffix tree where n is the given string length.

TODO

Package
Source

suffix-tree.lisp (file)

Function: canonize TREE NODE L R
Package
Source

suffix-tree.lisp (file)

Function: check-rk-lv ()

Las Vegas version: does pat[] match txt[i..i-M+1] ?

Package
Source

rabin-karp.lisp (file)

Function: check-rk-mk I

Monte Carlo version: always return true

Package
Source

rabin-karp.lisp (file)

Function: compile-set S

Create a single satisfy predicate for a character set.

Package
Source

pre.lisp (file)

Function: compute-failure-function ()

Given a trie calculate failure transitions for its nodes.

Modifies nodes.

Traverses the trie in the breadth-first-order (BFO)

Package
Source

aho-corasick.lisp (file)

Function: copy-bm INSTANCE
Package
Source

boyer-moore.lisp (file)

Function: copy-bmh INSTANCE
Package
Source
Function: copy-bmh8 INSTANCE
Package
Source
Function: copy-kmp INSTANCE
Package
Source

knuth-morris-pratt.lisp (file)

Package
Source

pre.lisp (file)

Function: copy-rk INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: copy-sor INSTANCE
Package
Source

shift-or.lisp (file)

Function: copy-suffix-node INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: copy-suffix-tree INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: copy-tabac INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: copy-trie-node INSTANCE
Package
Source

aho-corasick.lisp (file)

Package
Source

ascii-strings.lisp (file)

Function: copy-ukk-node INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: escape C

Return the test and predicate for an escaped character.

Package
Source

pre.lisp (file)

Function: format-name FORMAT &rest ARGS
Package
Source

package.lisp (file)

Function: hash-bm ORDER
Package
Source

boyer-moore.lisp (file)

Function: hex-char-p C

T if c is a hexadecimal character.

Package
Source

pre.lisp (file)

Function: horner-hash ()

Horner hashing function implementation.

Computes the hash function for an END-digit base- +ALPH-SIZE+ number represented as a char array in time proportional to END. (We pass END as an argument so that we can use the function for both the pattern and the text.)

Package
Source

rabin-karp.lisp (file)

Initialize the bad character skip for the Boyer-Moore algorithm.

Package
Source

boyer-moore.lisp (file)

Function: initialize-good-suffix IDX

Initialize the good suffix skip function table for the Boyer-Moore algorithm.

Package
Source

boyer-moore.lisp (file)

Function: kmp-p OBJECT
Package
Source

knuth-morris-pratt.lisp (file)

Function: kmp-pat INSTANCE
Function: (setf kmp-pat) VALUE INSTANCE
Package
Source

knuth-morris-pratt.lisp (file)

Function: kmp-pat-len INSTANCE
Function: (setf kmp-pat-len) VALUE INSTANCE
Package
Source

knuth-morris-pratt.lisp (file)

Function: kmp-table INSTANCE
Function: (setf kmp-table) VALUE INSTANCE
Package
Source

knuth-morris-pratt.lisp (file)

Package
Source

boyer-moore.lisp (file)

Package
Source
Package
Source
Function: make-kmp &key (PAT PAT) (PAT-LEN PAT-LEN) (TABLE TABLE)
Package
Source

knuth-morris-pratt.lisp (file)

Function: make-lambda-ac TRIE

Writes lambda function performing search over the given trie.

That function can later be compiled into native code and funcall-ed. The generated function accepts a single string and returns matching mark from the given trie.

Generated function is essentially a giant tagbody with jumps dispatched in case forms depending on the current character.

Package
Source
Function: make-re-thread PC SP GROUPS STACK
Package
Source

pre.lisp (file)

Function: make-rk &key (PAT PAT) (PAT-HASH PAT-HASH) (PAT-LEN PAT-LEN) (ALPH-SIZE ALPH-SIZE) (RM RM)
Package
Source

rabin-karp.lisp (file)

Function: make-sor &key (CIDX CIDX) (LIM LIM) (PAT-LEN PAT-LEN)
Package
Source

shift-or.lisp (file)

Function: make-tabac &key (START START) (TRANS TRANS) (FINAL FINAL) (MATCH-LEN MATCH-LEN)
Package
Source

aho-corasick.lisp (file)

Function: make-trie-node &key (ID ID) (CHILDREN CHILDREN) (LABEL LABEL) (MARK MARK) (FAIL FAIL) (DEPTH DEPTH)
Package
Source

aho-corasick.lisp (file)

Function: newline-p C

T if c is a newline character.

Package
Source

pre.lisp (file)

Function: print-markdown-footer STREAM
Package
Source

doc.lisp (file)

Function: print-suffix-tree-for-gv TREE &key STREAM LABEL

Print the given suffix TREE for the Graphviz visualization in the dot format.

Package
Source

suffix-tree.lisp (file)

Function: print-suffix-tree-for-gv-pretty TREE &key STREAM LABEL

Alternative representation of the suffix tree. Based on java implementation from: http://pastie.org/5925812

Package
Source

suffix-tree.lisp (file)

Function: print-suffix-tree-in-file TREE FNAME &optional LABEL
Package
Source

suffix-tree.lisp (file)

Function: punctuation-p C

T if c is a punctuation character.

Package
Source

pre.lisp (file)

Package
Source

pre.lisp (file)

Package
Source

pre.lisp (file)

Package
Source

pre.lisp (file)

Package
Source

pre.lisp (file)

Package
Source

pre.lisp (file)

Function: ref-length L R
Package
Source

suffix-tree.lisp (file)

Function: rk-alph-size INSTANCE
Function: (setf rk-alph-size) VALUE INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: rk-p OBJECT
Package
Source

rabin-karp.lisp (file)

Function: rk-pat INSTANCE
Function: (setf rk-pat) VALUE INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: rk-pat-hash INSTANCE
Function: (setf rk-pat-hash) VALUE INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: rk-pat-len INSTANCE
Function: (setf rk-pat-len) VALUE INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: rk-rm INSTANCE
Function: (setf rk-rm) VALUE INSTANCE
Package
Source

rabin-karp.lisp (file)

Function: run ()

Execute a regular expression program.

Package
Source

pre.lisp (file)

Function: sor-cidx INSTANCE
Function: (setf sor-cidx) VALUE INSTANCE
Package
Source

shift-or.lisp (file)

Function: sor-lim INSTANCE
Function: (setf sor-lim) VALUE INSTANCE
Package
Source

shift-or.lisp (file)

Function: sor-p OBJECT
Package
Source

shift-or.lisp (file)

Function: sor-pat-len INSTANCE
Function: (setf sor-pat-len) VALUE INSTANCE
Package
Source

shift-or.lisp (file)

Function: sor-printer OBJ STREAM DEPTH

Make dump of the SOR structure more human-readable: in the CIDX print characters and their positions, decipher LIM.

Package
Source

shift-or.lisp (file)

Function: space-p C

T if c is a whitespace character.

Package
Source

pre.lisp (file)

Function: split-node TREE OLD-NODE START END POS

Split the given NODE at position POS within the node.

Package
Source

suffix-tree.lisp (file)

Function: stbstr STR L R

The +infinity+ is defined as the largest number (standard constant MOST-POSITIVE-FIXNUM), if the right index exceed to the length of the list, the result is from left to the end of the list.

Package
Source

suffix-tree.lisp (file)

Function: str-length NODE

Length of the substring in the arc.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-get-child TREE NODE C

Given the NODE lookup a child node that starts with the given char C in the given suffix tree TREE.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-id INSTANCE
Function: (setf suffix-node-id) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-insert-child TREE NODE CHILD-NODE

Adds a child node to the given NODE.

Package
Source

suffix-tree.lisp (file)

Function: suffix-node-p OBJECT
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-parent INSTANCE
Function: (setf suffix-node-parent) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-node-printer OBJ STREAM DEPTH
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-nodes-count INSTANCE
Function: (setf suffix-tree-nodes-count) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-p OBJECT
Package
Source

suffix-tree.lisp (file)

Function: suffix-tree-printer OBJ STREAM DEPTH
Package
Source

suffix-tree.lisp (file)

Function: tab-p C

T if c is a tab character.

Package
Source

pre.lisp (file)

Function: tabac-final INSTANCE
Function: (setf tabac-final) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: tabac-match-len INSTANCE
Function: (setf tabac-match-len) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: tabac-p OBJECT
Package
Source

aho-corasick.lisp (file)

Function: tabac-start INSTANCE
Function: (setf tabac-start) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: tabac-trans INSTANCE
Function: (setf tabac-trans) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: tabac-transition IDX STATE C

Returns a new state based on the given char from the current state.

Package
Source

aho-corasick.lisp (file)

Add a child node to the given node with the given label and mark.

Constructor can be either ‘MAKE-TRIE-NODE‘ or any other structure constructor derieved from the ‘TRIE-NODE‘ struct.

Package
Source

aho-corasick.lisp (file)

Function: trie-dump-dot TRIE &key STREAM

Dumps a textual representation of the Trie useful for making a graphical plot with Graphviz to the given stream.

Package
Source

aho-corasick.lisp (file)

Function: trie-dump-dot-file TRIE FNAME

Dumps output of the TRIE-DUMP-DOT function to the file with the specified file name.

Package
Source

aho-corasick.lisp (file)

Function: trie-find-child ()

Looks for a child of the given node trie with the given label.

Package
Source

aho-corasick.lisp (file)

Function: trie-node-children INSTANCE
Function: (setf trie-node-children) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-depth INSTANCE
Function: (setf trie-node-depth) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-fail INSTANCE
Function: (setf trie-node-fail) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-id INSTANCE
Function: (setf trie-node-id) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-label INSTANCE
Function: (setf trie-node-label) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-mark INSTANCE
Function: (setf trie-node-mark) VALUE INSTANCE
Package
Source

aho-corasick.lisp (file)

Function: trie-node-p OBJECT
Package
Source

aho-corasick.lisp (file)

Function: trie-node-printer OBJ STREAM DEPTH

We have to avoid standard Lisp printer because of the FAIL links that turn our tree into a network with cycles, throwing the default printer into an infinite loop.

Package
Source

aho-corasick.lisp (file)

Package
Source

ascii-strings.lisp (file)

Package
Source

ascii-strings.lisp (file)

Package
Source

ascii-strings.lisp (file)

Package
Source

ascii-strings.lisp (file)

Package
Source

ascii-strings.lisp (file)

Package
Source

ascii-strings.lisp (file)

Function: ub-schar STRING INDEX
Package
Source

ascii-strings.lisp (file)

Function: ukk-node-children INSTANCE
Function: (setf ukk-node-children) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: ukk-node-end INSTANCE
Function: (setf ukk-node-end) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: ukk-node-id INSTANCE
Function: (setf ukk-node-id) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: ukk-node-p OBJECT
Package
Source

suffix-tree.lisp (file)

Function: ukk-node-parent INSTANCE
Function: (setf ukk-node-parent) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: ukk-node-start INSTANCE
Function: (setf ukk-node-start) VALUE INSTANCE
Package
Source

suffix-tree.lisp (file)

Function: ukkonen-update TREE NODE L I
Package
Source

suffix-tree.lisp (file)

Function: update-md ()
Package
Source

doc.lisp (file)

Function: word-char-p C

T if is alphanumeric or an underscore.

Package
Source

pre.lisp (file)

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.5 Generic functions

Generic Function: re-expression OBJECT
Package
Methods
Method: re-expression (RE re)

Source

pre.lisp (file)

Generic Function: re-pattern OBJECT
Package
Methods
Method: re-pattern (RE re)

Source

pre.lisp (file)

Next: , Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.6 Structures

Structure: bm ()
Package
Source

boyer-moore.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Type

(simple-array fixnum)

Initform

(make-array 0 :element-type (quote fixnum))

Writers

Slot: good-suffix
Type

(simple-array fixnum)

Initform

(make-array 0 :element-type (quote fixnum))

bm-good-suffix (function)

Writers

(setf bm-good-suffix) (function)

Slot: pat
Type

simple-string

Initform

""

bm-pat (function)

Writers

(setf bm-pat) (function)

Structure: bmh ()
Package
Source
Direct superclasses

structure-object (structure)

Direct slots
Type

(simple-array fixnum (*))

Initform

#()

Writers

Slot: pat
Type

simple-string

Initform

""

bmh-pat (function)

Writers

(setf bmh-pat) (function)

Slot: pat-len
Type

fixnum

Initform

0

bmh-pat-len (function)

Writers

(setf bmh-pat-len) (function)

Structure: kmp ()
Package
Source

knuth-morris-pratt.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: pat
Type

simple-string

Initform

""

kmp-pat (function)

Writers

(setf kmp-pat) (function)

Slot: pat-len
Type

fixnum

Initform

0

kmp-pat-len (function)

Writers

(setf kmp-pat-len) (function)

Slot: table
Type

(simple-array fixnum)

Initform

(make-array 0 :element-type (quote fixnum))

kmp-table (function)

Writers

(setf kmp-table) (function)

Package
Source

pre.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: pc

Writers

Slot: sp

Writers

Slot: groups

Writers

Slot: stack

Writers

Structure: rk ()
Package
Source

rabin-karp.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: pat
Type

simple-string

Initform

""

rk-pat (function)

Writers

(setf rk-pat) (function)

Slot: pat-hash
Type

cl-string-match::rk-ub32

Initform

0

rk-pat-hash (function)

Writers

(setf rk-pat-hash) (function)

Slot: pat-len
Type

cl-string-match::rk-ub32

Initform

0

rk-pat-len (function)

Writers

(setf rk-pat-len) (function)

Slot: alph-size
Type

cl-string-match::rk-ub32

Initform

cl-string-match::+alph-size+

rk-alph-size (function)

Writers

(setf rk-alph-size) (function)

Slot: rm
Type

cl-string-match::rk-ub32

Initform

1

rk-rm (function)

Writers

(setf rk-rm) (function)

Structure: sor ()

Index for the Shift-OR search operation.

Package
Source

shift-or.lisp (file)

Direct superclasses

structure-object (structure)

Direct methods

print-object (method)

Direct slots
Slot: cidx

sor-cidx (function)

Writers

(setf sor-cidx) (function)

Slot: lim
Type

cl-string-match::sor-ub32

Initform

0

sor-lim (function)

Writers

(setf sor-lim) (function)

Slot: pat-len
Type

fixnum

Initform

0

sor-pat-len (function)

Writers

(setf sor-pat-len) (function)

Structure: tabac ()

Defines an Aho-Corasick DFA based on tables.

Slots:

* ‘start‘ - initial state
* ‘trans‘ - a table of transition tables
* ‘final‘ - marks final states
* ‘match-len‘ - stores length of the keyword corresponding to the final state

Package
Source

aho-corasick.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: start
Type

fixnum

Initform

-1

tabac-start (function)

Writers

(setf tabac-start) (function)

Slot: trans
Type

(or null (simple-array (or null (simple-array fixnum))))

tabac-trans (function)

Writers

(setf tabac-trans) (function)

Slot: final
Type

(simple-array fixnum)

Initform

(make-array 0 :element-type (quote fixnum))

tabac-final (function)

Writers

(setf tabac-final) (function)

Slot: match-len
Type

(simple-array fixnum)

Initform

(make-array 0 :element-type (quote fixnum))

tabac-match-len (function)

Writers

(setf tabac-match-len) (function)

Encapsulates a buffer and the state of the line reader. Created by the make-ub-line-reader function, and the every next line is obtained by one of the ub-read-line-* functions.

You should not forget to close the underlying stream using either a standard close function or the ub-line-reader-close function.

It is anticipated that the underlying stream is an input stream with element-type set to ub-char.

Package
Source

ascii-strings.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: stream
Type

(or null stream)

Writers

Slot: buffer
Type

ascii-strings:ub-buffer

Initform

Writers

Slot: fill
Type

fixnum

Initform

0

Writers

Slot: pos
Type

fixnum

Initform

0

Writers

Slot: eof
Type

boolean

Writers

Previous: , Up: Internal definitions   [Contents][Index]

#### 6.2.7 Types

Type: rk-ub32 ()
Package
Source

rabin-karp.lisp (file)

Type: sor-ub32 ()
Package
Source

shift-or.lisp (file)

Previous: , Up: Top   [Contents][Index]

## Appendix A Indexes

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.1 Concepts

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.2 Functions

Jump to: %   (   A   B   C   D   E   F   G   H   I   K   M   N   O   P   R   S   T   U   W
Jump to: %   (   A   B   C   D   E   F   G   H   I   K   M   N   O   P   R   S   T   U   W

Next: , Previous: , Up: Indexes   [Contents][Index]

### A.3 Variables

Jump to: *   +   @   A   B   C   D   E   F   G   I   L   M   N   P   R   S   T   U
Jump to: *   +   @   A   B   C   D   E   F   G   I   L   M   N   P   R   S   T   U

Previous: , Up: Indexes   [Contents][Index]