This is the cl-string-match Reference Manual, version 2018.1.2, generated automatically by Declt version 4.0 beta 2 "William Riker" on Thu Aug 15 04:24:27 2024 GMT+0.
cl-string-match/cl-string-match.asd
ascii-strings/ascii-strings.asd
cl-string-match/src/package.lisp
cl-string-match/src/utils.lisp
cl-string-match/src/brute-force.lisp
cl-string-match/src/boyer-moore.lisp
cl-string-match/src/boyer-moore-horspool.lisp
cl-string-match/src/rabin-karp.lisp
cl-string-match/src/knuth-morris-pratt.lisp
cl-string-match/src/shift-or.lisp
cl-string-match/src/aho-corasick.lisp
cl-string-match/src/compiled-aho-corasick.lisp
cl-string-match/src/suffix-tree.lisp
cl-string-match/src/pre.lisp
cl-string-match/src/doc.lisp
ascii-strings/contrib/ascii-strings.lisp
The main system appears first, followed by any subsystem dependency.
cl-string-match
Provides implementations of the standard sub-string search (string matching) algorithms: brute-force, Boyer-Moore, Rabin-Karp, etc.
Vityok https://bitbucket.org/vityok
BSD
2018.1.2
alexandria
(system).
ascii-strings
(system).
yacc
(system).
jpl-queues
(system).
iterate
(system).
mgl-pax/document
(system).
src
(module).
ascii-strings
Operations on ASCII strings. Essentially this can be any kind of single-byte encoded strings.
Vityok https://bitbucket.org/vityok
BSD
2015.9.16
alexandria
(system).
babel
(system).
contrib
(module).
Modules are listed depth-first from the system components tree.
cl-string-match/src
cl-string-match
(system).
package.lisp
(file).
utils.lisp
(file).
brute-force.lisp
(file).
boyer-moore.lisp
(file).
boyer-moore-horspool.lisp
(file).
rabin-karp.lisp
(file).
knuth-morris-pratt.lisp
(file).
shift-or.lisp
(file).
aho-corasick.lisp
(file).
compiled-aho-corasick.lisp
(file).
suffix-tree.lisp
(file).
pre.lisp
(file).
doc.lisp
(file).
ascii-strings/contrib
ascii-strings
(system).
ascii-strings.lisp
(file).
Files are sorted by type and then listed depth-first from the systems components trees.
cl-string-match/cl-string-match.asd
ascii-strings/ascii-strings.asd
cl-string-match/src/package.lisp
cl-string-match/src/utils.lisp
cl-string-match/src/brute-force.lisp
cl-string-match/src/boyer-moore.lisp
cl-string-match/src/boyer-moore-horspool.lisp
cl-string-match/src/rabin-karp.lisp
cl-string-match/src/knuth-morris-pratt.lisp
cl-string-match/src/shift-or.lisp
cl-string-match/src/aho-corasick.lisp
cl-string-match/src/compiled-aho-corasick.lisp
cl-string-match/src/suffix-tree.lisp
cl-string-match/src/pre.lisp
cl-string-match/src/doc.lisp
ascii-strings/contrib/ascii-strings.lisp
cl-string-match/cl-string-match.asd
cl-string-match
(system).
ascii-strings/ascii-strings.asd
ascii-strings
(system).
cl-string-match/src/package.lisp
src
(module).
*standard-debug-settings*
(special variable).
*standard-optimize-settings*
(special variable).
format-name
(function).
cl-string-match/src/utils.lisp
package.lisp
(file).
src
(module).
prefixed-with
(function).
suffixed-with
(function).
cl-string-match/src/brute-force.lisp
utils.lisp
(file).
src
(module).
define-brute-matcher
(macro).
string-contains-brute
(function).
string-contains-brute-ub
(function).
@brute-force-section
(special variable).
cl-string-match/src/boyer-moore.lisp
brute-force.lisp
(file).
src
(module).
initialize-bm
(function).
search-bm
(function).
string-contains-bm
(function).
@boyer-moore-section
(special variable).
bm
(structure).
bm-bad-char
(reader).
(setf bm-bad-char)
(writer).
bm-good-suffix
(reader).
(setf bm-good-suffix)
(writer).
bm-p
(function).
bm-pat
(reader).
(setf bm-pat)
(writer).
bm-suffixes
(function).
copy-bm
(function).
hash-bm
(function).
initialize-bad-char
(function).
initialize-good-suffix
(function).
make-bm
(function).
cl-string-match/src/boyer-moore-horspool.lisp
boyer-moore.lisp
(file).
src
(module).
bmh8
(structure).
bmh8-p
(function).
bmh8-pat
(reader).
(setf bmh8-pat)
(writer).
define-bmh-matcher
(macro).
initialize-bmh
(function).
initialize-bmh8
(function).
search-bmh
(function).
search-bmh8
(function).
string-contains-bmh
(function).
string-contains-bmh8
(function).
@boyer-moore-horspool-section
(special variable).
bmh
(structure).
bmh-bad-char-skip
(reader).
(setf bmh-bad-char-skip)
(writer).
bmh-p
(function).
bmh-pat
(reader).
(setf bmh-pat)
(writer).
bmh-pat-len
(reader).
(setf bmh-pat-len)
(writer).
bmh8-bad-char-skip
(reader).
(setf bmh8-bad-char-skip)
(writer).
bmh8-pat-len
(reader).
(setf bmh8-pat-len)
(writer).
copy-bmh
(function).
copy-bmh8
(function).
make-bmh
(function).
make-bmh8
(function).
cl-string-match/src/rabin-karp.lisp
boyer-moore-horspool.lisp
(file).
src
(module).
initialize-rk
(function).
search-rk
(function).
string-contains-rk
(function).
+alph-size+
(constant).
+big-prime+
(constant).
@rabin-karp-section
(special variable).
check-rk-lv
(function).
check-rk-mk
(function).
copy-rk
(function).
horner-hash
(function).
make-rk
(function).
rk
(structure).
rk-alph-size
(reader).
(setf rk-alph-size)
(writer).
rk-p
(function).
rk-pat
(reader).
(setf rk-pat)
(writer).
rk-pat-hash
(reader).
(setf rk-pat-hash)
(writer).
rk-pat-len
(reader).
(setf rk-pat-len)
(writer).
rk-rm
(reader).
(setf rk-rm)
(writer).
rk-ub32
(type).
cl-string-match/src/knuth-morris-pratt.lisp
rabin-karp.lisp
(file).
src
(module).
initialize-kmp
(function).
search-kmp
(function).
string-contains-kmp
(function).
@knuth-morris-pratt-section
(special variable).
copy-kmp
(function).
kmp
(structure).
kmp-p
(function).
kmp-pat
(reader).
(setf kmp-pat)
(writer).
kmp-pat-len
(reader).
(setf kmp-pat-len)
(writer).
kmp-table
(reader).
(setf kmp-table)
(writer).
make-kmp
(function).
cl-string-match/src/shift-or.lisp
knuth-morris-pratt.lisp
(file).
src
(module).
initialize-sor
(function).
print-object
(method).
search-sor
(function).
string-contains-sor
(function).
+sor-alphabet+
(constant).
@shift-or-section
(special variable).
copy-sor
(function).
make-sor
(function).
sor
(structure).
sor-cidx
(reader).
(setf sor-cidx)
(writer).
sor-lim
(reader).
(setf sor-lim)
(writer).
sor-p
(function).
sor-pat-len
(reader).
(setf sor-pat-len)
(writer).
sor-printer
(function).
sor-ub32
(type).
cl-string-match/src/aho-corasick.lisp
shift-or.lisp
(file).
src
(module).
empty-trie
(function).
initialize-ac
(function).
initialize-tabac
(function).
print-object
(method).
search-ac
(function).
search-tabac
(function).
string-contains-ac
(function).
string-contains-tabac
(function).
trie->tabular-ac
(function).
trie-add-keyword
(function).
trie-build
(function).
trie-contains
(function).
trie-contains-prefix
(function).
trie-node
(structure).
trie-pprint
(function).
trie-traverse-bfo
(function).
trie-traverse-dfo
(function).
+ac-alphabet+
(constant).
@aho-corasick-section
(special variable).
compute-failure-function
(function).
copy-tabac
(function).
copy-trie-node
(function).
make-tabac
(function).
make-trie-node
(function).
map-trie-children
(macro).
tabac
(structure).
tabac-final
(reader).
(setf tabac-final)
(writer).
tabac-match-len
(reader).
(setf tabac-match-len)
(writer).
tabac-p
(function).
tabac-start
(reader).
(setf tabac-start)
(writer).
tabac-trans
(reader).
(setf tabac-trans)
(writer).
tabac-transition
(function).
trie-add-child
(function).
trie-dump-dot
(function).
trie-dump-dot-file
(function).
trie-find-child
(function).
trie-node-children
(reader).
(setf trie-node-children)
(writer).
trie-node-depth
(reader).
(setf trie-node-depth)
(writer).
trie-node-fail
(reader).
(setf trie-node-fail)
(writer).
trie-node-id
(reader).
(setf trie-node-id)
(writer).
trie-node-label
(reader).
(setf trie-node-label)
(writer).
trie-node-mark
(reader).
(setf trie-node-mark)
(writer).
trie-node-p
(function).
trie-node-printer
(function).
cl-string-match/src/compiled-aho-corasick.lisp
aho-corasick.lisp
(file).
src
(module).
search-compiled-ac
(function).
trie->compiled-ac
(function).
make-lambda-ac
(function).
cl-string-match/src/suffix-tree.lisp
compiled-aho-corasick.lisp
(file).
src
(module).
+infinity+
(constant).
build-suffix-tree-simple
(function).
build-suffix-tree-ukkonen
(function).
make-suffix-node
(function).
make-suffix-tree
(function).
make-ukk-node
(function).
print-object
(method).
print-object
(method).
suffix-node
(structure).
suffix-node-add-child
(function).
suffix-node-children
(reader).
(setf suffix-node-children)
(writer).
suffix-node-end
(reader).
(setf suffix-node-end)
(writer).
suffix-node-equals
(function).
suffix-node-leafp
(function).
suffix-node-map-over-children
(function).
suffix-node-start
(reader).
(setf suffix-node-start)
(writer).
suffix-node-str
(function).
suffix-tree
(structure).
suffix-tree-build-from-sexp
(function).
suffix-tree-char
(function).
suffix-tree-equals
(function).
suffix-tree-root
(reader).
(setf suffix-tree-root)
(writer).
suffix-tree-str
(reader).
(setf suffix-tree-str)
(writer).
suffix-tree-walk
(function).
ukk-node
(structure).
ukk-node-suffix
(reader).
(setf ukk-node-suffix)
(writer).
add-suffix-simple
(function).
branch
(function).
build-suffix-tree-mccreight
(function).
canonize
(function).
copy-suffix-node
(function).
copy-suffix-tree
(function).
copy-ukk-node
(function).
print-suffix-tree-for-gv
(function).
print-suffix-tree-for-gv-pretty
(function).
print-suffix-tree-in-file
(function).
ref-length
(function).
split-node
(function).
stbstr
(function).
str-length
(function).
suffix-node-get-child
(function).
suffix-node-id
(reader).
(setf suffix-node-id)
(writer).
suffix-node-insert-child
(function).
suffix-node-p
(function).
suffix-node-parent
(reader).
(setf suffix-node-parent)
(writer).
suffix-node-printer
(function).
suffix-tree-nodes-count
(reader).
(setf suffix-tree-nodes-count)
(writer).
suffix-tree-p
(function).
suffix-tree-printer
(function).
ukk-node-children
(function).
(setf ukk-node-children)
(function).
ukk-node-end
(function).
(setf ukk-node-end)
(function).
ukk-node-id
(function).
(setf ukk-node-id)
(function).
ukk-node-p
(function).
ukk-node-parent
(function).
(setf ukk-node-parent)
(function).
ukk-node-start
(function).
(setf ukk-node-start)
(function).
ukkonen-update
(function).
cl-string-match/src/pre.lisp
suffix-tree.lisp
(file).
src
(module).
compile-re
(function).
find-re
(function).
make-load-form
(method).
match-groups
(reader method).
match-pos-end
(reader method).
match-pos-start
(reader method).
match-re
(function).
match-string
(reader method).
print-object
(method).
print-object
(method).
re
(class).
re-match
(class).
replace-re
(function).
split-re
(function).
with-re
(macro).
with-re-match
(macro).
*expression-parser*
(special variable).
*set-parser*
(special variable).
@pre-basic-matching-section
(special variable).
@pre-compiling-patterns-section
(special variable).
@pre-groups-section
(special variable).
@pre-pattern-replacing-section
(special variable).
@pre-pattern-scanning-section
(special variable).
@pre-pattern-splitting-section
(special variable).
@pre-regexp-section
(special variable).
@pre-with-re-match-section
(special variable).
compile-set
(function).
copy-re-thread
(function).
escape
(function).
hex-char-p
(function).
make-re-thread
(function).
newline-p
(function).
punctuation-p
(function).
re-expression
(reader method).
re-pattern
(reader method).
re-thread
(structure).
re-thread-groups
(reader).
(setf re-thread-groups)
(writer).
re-thread-p
(function).
re-thread-pc
(reader).
(setf re-thread-pc)
(writer).
re-thread-sp
(reader).
(setf re-thread-sp)
(writer).
re-thread-stack
(reader).
(setf re-thread-stack)
(writer).
run
(function).
space-p
(function).
tab-p
(function).
word-char-p
(function).
cl-string-match/src/doc.lisp
pre.lisp
(file).
src
(module).
@cl-string-match-manual
(special variable).
@multi-pattern-search
(special variable).
@regexp-pattern-search
(special variable).
@single-pattern-search
(special variable).
print-markdown-footer
(function).
update-md
(function).
ascii-strings/contrib/ascii-strings.lisp
contrib
(module).
ascii-char-code
(function).
make-substitution-table
(function).
make-ub-buffer
(function).
make-ub-line-reader
(function).
make-ub-string
(function).
octets-to-ub
(function).
string-to-ub
(function).
ub-alpha-char-p
(function).
ub-alphanumericp
(function).
ub-both-case-p
(function).
ub-buffer
(type).
ub-char
(function).
ub-char
(type).
ub-char-code
(function).
ub-char-code-limit
(constant).
ub-char-downcase
(function).
ub-char-equal
(function).
ub-char-greaterp
(function).
ub-char-int
(function).
ub-char-lessp
(function).
ub-char-name
(function).
ub-char-not-equal
(function).
ub-char-not-greaterp
(function).
ub-char-not-lessp
(function).
ub-char-upcase
(function).
ub-char/=
(function).
ub-char<
(function).
ub-char<=
(function).
ub-char=
(function).
ub-char>
(function).
ub-char>=
(function).
ub-code-char
(function).
ub-digit-char
(function).
ub-digit-char-p
(function).
ub-graphic-char-p
(function).
ub-line-reader-close
(function).
ub-line-reader-file-position
(function).
ub-lower-case-p
(function).
ub-name-char
(function).
ub-nstring-capitalize
(function).
ub-nstring-downcase
(function).
ub-nstring-upcase
(function).
ub-read-line
(function).
ub-read-line-raw
(function).
ub-read-line-string
(function).
ub-standard-char-p
(function).
ub-string
(type).
ub-string-capitalize
(function).
ub-string-downcase
(function).
ub-string-equal
(function).
ub-string-greaterp
(function).
ub-string-left-trim
(function).
ub-string-lessp
(function).
ub-string-not-equal
(function).
ub-string-not-greaterp
(function).
ub-string-not-lessp
(function).
ub-string-right-trim
(function).
ub-string-trim
(function).
ub-string-upcase
(function).
ub-string/=
(function).
ub-string<
(function).
ub-string<=
(function).
ub-string=
(function).
ub-string>
(function).
ub-string>=
(function).
ub-subseq
(function).
ub-to-string
(function).
ub-upper-case-p
(function).
%ub-string-compare
(function).
*default-substitution-table*
(special variable).
+newline+
(constant).
+ub-line-reader-buffer-size+
(constant).
copy-ub-line-reader
(function).
ub-line-reader
(structure).
ub-line-reader-buffer
(reader).
(setf ub-line-reader-buffer)
(writer).
ub-line-reader-eof
(reader).
(setf ub-line-reader-eof)
(writer).
ub-line-reader-fill
(reader).
(setf ub-line-reader-fill)
(writer).
ub-line-reader-p
(function).
ub-line-reader-pos
(reader).
(setf ub-line-reader-pos)
(writer).
ub-line-reader-stream
(reader).
(setf ub-line-reader-stream)
(writer).
ub-schar
(function).
ub-string<>=*-body
(macro).
Packages are listed by definition order.
ascii-strings
This library implements functions and data types
similar to the standard Common Lisp functions and types but prefixed
with ub- to avoid naming conflicts.
This package aims at providing single-byte strings functionality
for Unicode-enabled Common Lisp implementations. Another aim is to
reduce memory footprint and boost performance of the
string-processing algorithms.
There are similar libraries/packages with slight differences. Check,
for instance, com.informatimago.common-lisp.cesarum.ascii.
This package also provides a faster alternative to the standard
read-line function. A line reader is created by the
make-ub-line-reader function, an ub-string is read by the
ub-read-line, and a standard line can be read by the
ub-read-line-string.
Please note, that while ASCII uses 7-bits per character, this library works with octets, using 8-bits per character.
ascii
alexandria
.
common-lisp
.
ascii-char-code
(function).
make-substitution-table
(function).
make-ub-buffer
(function).
make-ub-line-reader
(function).
make-ub-string
(function).
octets-to-ub
(function).
string-to-ub
(function).
ub-alpha-char-p
(function).
ub-alphanumericp
(function).
ub-both-case-p
(function).
ub-buffer
(type).
ub-char
(function).
ub-char
(type).
ub-char-code
(function).
ub-char-code-limit
(constant).
ub-char-downcase
(function).
ub-char-equal
(function).
ub-char-greaterp
(function).
ub-char-int
(function).
ub-char-lessp
(function).
ub-char-name
(function).
ub-char-not-equal
(function).
ub-char-not-greaterp
(function).
ub-char-not-lessp
(function).
ub-char-upcase
(function).
ub-char/=
(function).
ub-char<
(function).
ub-char<=
(function).
ub-char=
(function).
ub-char>
(function).
ub-char>=
(function).
ub-code-char
(function).
ub-digit-char
(function).
ub-digit-char-p
(function).
ub-graphic-char-p
(function).
ub-line-reader-close
(function).
ub-line-reader-file-position
(function).
ub-lower-case-p
(function).
ub-name-char
(function).
ub-nstring-capitalize
(function).
ub-nstring-downcase
(function).
ub-nstring-upcase
(function).
ub-read-line
(function).
ub-read-line-raw
(function).
ub-read-line-string
(function).
ub-standard-char-p
(function).
ub-string
(type).
ub-string-capitalize
(function).
ub-string-downcase
(function).
ub-string-equal
(function).
ub-string-greaterp
(function).
ub-string-left-trim
(function).
ub-string-lessp
(function).
ub-string-not-equal
(function).
ub-string-not-greaterp
(function).
ub-string-not-lessp
(function).
ub-string-right-trim
(function).
ub-string-trim
(function).
ub-string-upcase
(function).
ub-string/=
(function).
ub-string<
(function).
ub-string<=
(function).
ub-string=
(function).
ub-string>
(function).
ub-string>=
(function).
ub-subseq
(function).
ub-to-string
(function).
ub-upper-case-p
(function).
%ub-string-compare
(function).
*default-substitution-table*
(special variable).
+newline+
(constant).
+ub-line-reader-buffer-size+
(constant).
copy-ub-line-reader
(function).
ub-line-reader
(structure).
ub-line-reader-buffer
(reader).
(setf ub-line-reader-buffer)
(writer).
ub-line-reader-eof
(reader).
(setf ub-line-reader-eof)
(writer).
ub-line-reader-fill
(reader).
(setf ub-line-reader-fill)
(writer).
ub-line-reader-p
(function).
ub-line-reader-pos
(reader).
(setf ub-line-reader-pos)
(writer).
ub-line-reader-stream
(reader).
(setf ub-line-reader-stream)
(writer).
ub-schar
(function).
ub-string<>=*-body
(macro).
cl-string-match
String matching functions and methods.
sm
alexandria
.
ascii-strings
.
common-lisp
.
iterate
.
mgl-pax
.
+infinity+
(constant).
bmh8
(structure).
bmh8-p
(function).
bmh8-pat
(reader).
(setf bmh8-pat)
(writer).
build-suffix-tree-simple
(function).
build-suffix-tree-ukkonen
(function).
compile-re
(function).
define-bmh-matcher
(macro).
define-brute-matcher
(macro).
empty-trie
(function).
find-re
(function).
initialize-ac
(function).
initialize-bm
(function).
initialize-bmh
(function).
initialize-bmh8
(function).
initialize-kmp
(function).
initialize-rk
(function).
initialize-sor
(function).
initialize-tabac
(function).
make-suffix-node
(function).
make-suffix-tree
(function).
make-ukk-node
(function).
match-groups
(generic reader).
match-pos-end
(generic reader).
match-pos-start
(generic reader).
match-re
(function).
match-string
(generic reader).
prefixed-with
(function).
re
(class).
re-match
(class).
replace-re
(function).
search-ac
(function).
search-bm
(function).
search-bmh
(function).
search-bmh8
(function).
search-compiled-ac
(function).
search-kmp
(function).
search-rk
(function).
search-sor
(function).
search-tabac
(function).
split-re
(function).
string-contains-ac
(function).
string-contains-bm
(function).
string-contains-bmh
(function).
string-contains-bmh8
(function).
string-contains-brute
(function).
string-contains-brute-ub
(function).
string-contains-kmp
(function).
string-contains-rk
(function).
string-contains-sor
(function).
string-contains-tabac
(function).
suffix-node
(structure).
suffix-node-add-child
(function).
suffix-node-children
(reader).
(setf suffix-node-children)
(writer).
suffix-node-end
(reader).
(setf suffix-node-end)
(writer).
suffix-node-equals
(function).
suffix-node-leafp
(function).
suffix-node-map-over-children
(function).
suffix-node-start
(reader).
(setf suffix-node-start)
(writer).
suffix-node-str
(function).
suffix-tree
(structure).
suffix-tree-build-from-sexp
(function).
suffix-tree-char
(function).
suffix-tree-equals
(function).
suffix-tree-root
(reader).
(setf suffix-tree-root)
(writer).
suffix-tree-str
(reader).
(setf suffix-tree-str)
(writer).
suffix-tree-walk
(function).
suffixed-with
(function).
trie->compiled-ac
(function).
trie->tabular-ac
(function).
trie-add-keyword
(function).
trie-build
(function).
trie-contains
(function).
trie-contains-prefix
(function).
trie-node
(structure).
trie-pprint
(function).
trie-traverse-bfo
(function).
trie-traverse-dfo
(function).
ukk-node
(structure).
ukk-node-suffix
(reader).
(setf ukk-node-suffix)
(writer).
with-re
(macro).
with-re-match
(macro).
*expression-parser*
(special variable).
*set-parser*
(special variable).
*standard-debug-settings*
(special variable).
*standard-optimize-settings*
(special variable).
+ac-alphabet+
(constant).
+alph-size+
(constant).
+big-prime+
(constant).
+sor-alphabet+
(constant).
@aho-corasick-section
(special variable).
@boyer-moore-horspool-section
(special variable).
@boyer-moore-section
(special variable).
@brute-force-section
(special variable).
@cl-string-match-manual
(special variable).
@knuth-morris-pratt-section
(special variable).
@multi-pattern-search
(special variable).
@pre-basic-matching-section
(special variable).
@pre-compiling-patterns-section
(special variable).
@pre-groups-section
(special variable).
@pre-pattern-replacing-section
(special variable).
@pre-pattern-scanning-section
(special variable).
@pre-pattern-splitting-section
(special variable).
@pre-regexp-section
(special variable).
@pre-with-re-match-section
(special variable).
@rabin-karp-section
(special variable).
@regexp-pattern-search
(special variable).
@shift-or-section
(special variable).
@single-pattern-search
(special variable).
add-suffix-simple
(function).
bm
(structure).
bm-bad-char
(reader).
(setf bm-bad-char)
(writer).
bm-good-suffix
(reader).
(setf bm-good-suffix)
(writer).
bm-p
(function).
bm-pat
(reader).
(setf bm-pat)
(writer).
bm-suffixes
(function).
bmh
(structure).
bmh-bad-char-skip
(reader).
(setf bmh-bad-char-skip)
(writer).
bmh-p
(function).
bmh-pat
(reader).
(setf bmh-pat)
(writer).
bmh-pat-len
(reader).
(setf bmh-pat-len)
(writer).
bmh8-bad-char-skip
(reader).
(setf bmh8-bad-char-skip)
(writer).
bmh8-pat-len
(reader).
(setf bmh8-pat-len)
(writer).
branch
(function).
build-suffix-tree-mccreight
(function).
canonize
(function).
check-rk-lv
(function).
check-rk-mk
(function).
compile-set
(function).
compute-failure-function
(function).
copy-bm
(function).
copy-bmh
(function).
copy-bmh8
(function).
copy-kmp
(function).
copy-re-thread
(function).
copy-rk
(function).
copy-sor
(function).
copy-suffix-node
(function).
copy-suffix-tree
(function).
copy-tabac
(function).
copy-trie-node
(function).
copy-ukk-node
(function).
escape
(function).
format-name
(function).
hash-bm
(function).
hex-char-p
(function).
horner-hash
(function).
initialize-bad-char
(function).
initialize-good-suffix
(function).
kmp
(structure).
kmp-p
(function).
kmp-pat
(reader).
(setf kmp-pat)
(writer).
kmp-pat-len
(reader).
(setf kmp-pat-len)
(writer).
kmp-table
(reader).
(setf kmp-table)
(writer).
make-bm
(function).
make-bmh
(function).
make-bmh8
(function).
make-kmp
(function).
make-lambda-ac
(function).
make-re-thread
(function).
make-rk
(function).
make-sor
(function).
make-tabac
(function).
make-trie-node
(function).
map-trie-children
(macro).
newline-p
(function).
print-markdown-footer
(function).
print-suffix-tree-for-gv
(function).
print-suffix-tree-for-gv-pretty
(function).
print-suffix-tree-in-file
(function).
punctuation-p
(function).
re-expression
(generic reader).
re-pattern
(generic reader).
re-thread
(structure).
re-thread-groups
(reader).
(setf re-thread-groups)
(writer).
re-thread-p
(function).
re-thread-pc
(reader).
(setf re-thread-pc)
(writer).
re-thread-sp
(reader).
(setf re-thread-sp)
(writer).
re-thread-stack
(reader).
(setf re-thread-stack)
(writer).
ref-length
(function).
rk
(structure).
rk-alph-size
(reader).
(setf rk-alph-size)
(writer).
rk-p
(function).
rk-pat
(reader).
(setf rk-pat)
(writer).
rk-pat-hash
(reader).
(setf rk-pat-hash)
(writer).
rk-pat-len
(reader).
(setf rk-pat-len)
(writer).
rk-rm
(reader).
(setf rk-rm)
(writer).
rk-ub32
(type).
run
(function).
sor
(structure).
sor-cidx
(reader).
(setf sor-cidx)
(writer).
sor-lim
(reader).
(setf sor-lim)
(writer).
sor-p
(function).
sor-pat-len
(reader).
(setf sor-pat-len)
(writer).
sor-printer
(function).
sor-ub32
(type).
space-p
(function).
split-node
(function).
stbstr
(function).
str-length
(function).
suffix-node-get-child
(function).
suffix-node-id
(reader).
(setf suffix-node-id)
(writer).
suffix-node-insert-child
(function).
suffix-node-p
(function).
suffix-node-parent
(reader).
(setf suffix-node-parent)
(writer).
suffix-node-printer
(function).
suffix-tree-nodes-count
(reader).
(setf suffix-tree-nodes-count)
(writer).
suffix-tree-p
(function).
suffix-tree-printer
(function).
tab-p
(function).
tabac
(structure).
tabac-final
(reader).
(setf tabac-final)
(writer).
tabac-match-len
(reader).
(setf tabac-match-len)
(writer).
tabac-p
(function).
tabac-start
(reader).
(setf tabac-start)
(writer).
tabac-trans
(reader).
(setf tabac-trans)
(writer).
tabac-transition
(function).
trie-add-child
(function).
trie-dump-dot
(function).
trie-dump-dot-file
(function).
trie-find-child
(function).
trie-node-children
(reader).
(setf trie-node-children)
(writer).
trie-node-depth
(reader).
(setf trie-node-depth)
(writer).
trie-node-fail
(reader).
(setf trie-node-fail)
(writer).
trie-node-id
(reader).
(setf trie-node-id)
(writer).
trie-node-label
(reader).
(setf trie-node-label)
(writer).
trie-node-mark
(reader).
(setf trie-node-mark)
(writer).
trie-node-p
(function).
trie-node-printer
(function).
ukk-node-children
(function).
(setf ukk-node-children)
(function).
ukk-node-end
(function).
(setf ukk-node-end)
(function).
ukk-node-id
(function).
(setf ukk-node-id)
(function).
ukk-node-p
(function).
ukk-node-parent
(function).
(setf ukk-node-parent)
(function).
ukk-node-start
(function).
(setf ukk-node-start)
(function).
ukkonen-update
(function).
update-md
(function).
word-char-p
(function).
Definitions are sorted by export status, category, package, and then by lexicographic order.
Maximum number of UB-CHARs is limited by a single byte. That is: just 256 characters are possible.
Compile pattern if it’s not a RE object and execute body.
Intern match symbols to execute a body.
This is like the standard CHAR-CODE but returns an octet.
Build a Suffix tree for the given string STR using naive (brute-force) algorithm.
Naive algorithm takes O(m²) time to build a suffix tree where m is the
given string length.
Essentially it operates as follows:
while suffices remain:
add next shortest suffix to the tree
Algorithm first ads a suffix Str[1..m] (entire string) to the
tree. Then it proceeds adding suffices Str[i..m] (i=2..m) to the tree.
Build a Suffix tree for the given string STR using Ukkonen’s
algorithm.
Ukkonen’s algorithm takes O(m) time to build a suffix tree where m is
the given string length.
Ukkonen’s algorithm is an on-line algorithm and can operate on a stream of characters, adding one character at a time.
Create a regular expression from a pattern string.
Creates a new instance and returns an empty trie.
Find a regexp pattern match somewhere in a string. Run from an offset.
Returns a Trie that is used to look for the given patterns in the text. It can deal either with a single pattern or a list of patterns.
Preprocess the needle.
Initialize the table to default value.
Preprocess the needle.
Initialize the table to default value.
Returns a tabular FDA that is used to look for the given patterns in the text. It can deal either with a single pattern or a list of patterns.
Allocate an ub-buffer of the given size.
Test a pattern re against a string.
Convert a simple vector into an octets vector (ub-string).
Returns T if the given string ‘TXT‘ is prefixed (starts with) the given prefix ‘PREF‘.
Replace patterns found within a string with a new value.
Looks for patterns that are indexed in the given trie and returns two values: start position of the first matching pattern and its index.
Search for pattern bm in txt.
Search for pattern defined in the IDX in TXT.
Search for pattern defined in the IDX in TXT.
Implementation of the Rabin-Karp substring search algorithm.
Split a string into one or more strings by pattern match.
Looks for the given pattern in the text and returns index of the first occurence.
A Brute-force substring search implementation.
Brute-force substring search requires O(N x M) character compares to
search for a pattern of length M in a text of length N, in the worst
case.
Algorithm described in: Chapter 5, p. 760 in
’Algorithms’, Robert Sedgewick and Kevin Wayne. 4th
A Brute-force substring search implementation.
Brute-force substring search requires O(N x M) character compares to
search for a pattern of length M in a text of length N, in the worst
case.
Algorithm described in: Chapter 5, p. 760 in
’Algorithms’, Robert Sedgewick and Kevin Wayne. 4th
Looks for the given pattern in the text and returns index of the first occurence.
Convert a standard Lisp String into an octets vector.
Adds a child node to the given NODE. Depending on the type of the given node creates either Ukkonens suffix tree or simple suffix tree node.
end
.
Retruns T is the given node is a leaf (has no children), return NIL otherwise.
Applies the given function to every child of the given node.
Returns the string that corresponds to the edge pointing to this node.
Build the suffix tree from the given sexp representation. Sexp is in form (begin end (children)).
Checks all the nodes in the given trees and returns T if trees are equal or NIL otherwise.
root
.
str
.
Applies the given FUNCTION to every node in the given TREE.
Returns T if the given string ‘TXT‘ is suffixed (ends with) the given suffix ‘SUFF‘.
Returns a compiled function for the given Aho-Corasick trie.
Given an Aho-Corasick trie and transforms it into a table-based DFA for the Aho-Corasick automata.
Starting at the root, follow the path labeled by chars of Pi
If the path ends before Pi, continue it by adding new edges and nodes
for the remaining characters of Pi.
Store identifier i of Pi at the terminal node of the path.
Builds a Trie based on the given list of patterns.
Returns T if the given TRIE contains the given string S.
Checks if the given TRIE contains some prefix of the given string S.
Returns the length of the matched prefix.
Traverse the given trie and pretty-print it to the given stream.
ROOT is the root node (optional).
Traverse the trie in the Breadth-First-Order and call the given handler function on each node..
Traverse the trie in the Depth-First-Order and call the given handler function on each node.
Returns true if character is an alphabetic character or a numeric character; otherwise, returns false.
Returns a code of the ub-char. Since ub-char is a number (an octet), it is essentially an identity function.
Tests whether char is a digit in the specified radix (i.e., with a weight less than radix). If it is a digit in that radix, its weight is returned as an integer; otherwise nil is returned.
Mimics a standard close function: closes the underlying stream and resets the reader to its initial state.
Returns the current position within the stream according to the
amount of information really read.
When the buffer caches more information than was really read by one of
UB-READ-LINE function the standard FILE-POSITION function will return
position of the buffer that is larger than the position that was read
by the user.
Returned number can be used by the standard FILE-POSITION function to
adjust the position within a stream.
When optional argument POSITION is supplied, the file position is adjusted accordingly in the underlying stream. The buffer is flushed.
Reads data into a pre-allocated buffer in the reader structure and returns an array displaced to the contents of the next line in that buffer.
Reads data into the pre-allocated buffer in the READER structure
and returns two values: start and end positions of the line within the
buffer that can be used to extract this line contents from the
buffer.
Please note, that unlike the standard read-line or the liberal-read-line by jasonmelbye this function works with the Unix-type of lines - sequence of characters delimited by the Newline symbol.
Reads data into a pre-allocated buffer in the reader structure and returns a standard Lisp string.
Returns true if the supplied substrings are different; otherwise it is false.
Returns true if substring1 is less than substring2; otherwise it is false.
Returns true if substring1 is less than or equal to substring2; otherwise it is false.
Returns true if the supplied substrings are of the same length and contain the same characters in corresponding positions; otherwise it returns false.
Returns true if substring1 is greater than substring2; otherwise it is false.
Returns true if substring1 is greater than or equal to substring2; otherwise it is false.
Returns either the SEQUENCE itself, or a dispaced array to the SEQUENCE. This is meant as a memory-efficient replacement for the ordinary SUBSEQ that allocates a new sequence.
Converts either an UB-STRING or UB-BUFFER into a standard Common
Lisp string.
START, END the start and end offsets within the given USTR to translate into a standard string.
re
) &optional env) ¶Tell the system how to save and load a regular expression to a FASL.
suffix-tree
) stream) ¶suffix-node
) stream) ¶Documentation
structure-object
.
fixnum
0
fixnum
0
fixnum
0
ROOT is a SUFFIX-NODE with default values. It is intended to keep references to its children. ROOT is just a simple placeholder for the actual nodes as its children.
Each node of a trie contains a list of child nodes, a label (the
letter) and a mark (some value attributed to the matching string).
Trie root node is like all other nodes but its ‘ID‘ is used as an
increment to create ids for new nodes.
Slots:
* ‘id‘ - unique node identifier, root node has the largest id.
* ‘children‘ - a hash table with labels as keys, ‘trie-node‘ as
values.
* ‘mark‘ - output function, when not null marks the last character of a
keyword and is returned as the search result.
* ‘fail‘ - fail transition to another node.
* ‘depth‘ - number of nodes from the root node to this node.
structure-object
.
fixnum
0
(or null cl-string-match:trie-node)
fixnum
0
Ukkonen’s algorithm relies on the suffix link technique. Some other
algorithms might also rely on it but the naive suffix tree algorithm
does not require it.
This structure extends the standard suffix tree node structure with the suffix link slot.
Regular expression.
Matched pattern.
:match
This slot is read-only.
:groups
This slot is read-only.
:start-pos
This slot is read-only.
:end-pos
This slot is read-only.
Buffer of octets is a simple array to allow compiler optimizations.
An octet. A number equal to the char code.
Strings must be vectors to allow for fill pointers and displacement. However, SBCL does a better job on optimizations when it deals with simple arrays.
Size of the alphabet for the Shift-OR algorithm implementation.
Defines the size of the buffer used by the line reader in bytes.
Substitution table used to convert ub-strings to the standard strings by default. Since character with code 0 is not very welcome in the world of C, we are converting it to an ordinary character.
The standard debug settings to be used in functions under development.
The standard optimize settings used by most declaration expressions. Tuned for best performance by default, but when the SM-DEBUG-ENABLED keyword is present in the *FEATURES* list, makes debug and safety its priority at the expense of everything else.
Perform the given BODY on the children of the given PARENT. The
child node is bound to the CHILD variable.
Example:
(map-trie-children (node child)
(push child stack))
Makes a body of the string comparison functions. Borrowed from the string.lisp file of the SBCL sources.
Adds a suffix of a string STR designated by its START and END to the suffix tree TREE.
Function branch is used to test if a position is the end point and turn the implicit node to explicit node if necessary. Because sentinel node is not used, the special case is handled in the first if-clause.
Build a Suffix tree for the given string STR using McCreight’s
algorithm.
McCreight’s algorithm takes O(n) time to build a suffix tree where n
is the given string length.
TODO
Las Vegas version: does pat[] match txt[i..i-M+1] ?
Monte Carlo version: always return true
Create a single satisfy predicate for a character set.
Given a trie calculate failure transitions for its nodes.
Modifies nodes.
Traverses the trie in the breadth-first-order (BFO)
Return the test and predicate for an escaped character.
T if c is a hexadecimal character.
Horner hashing function implementation.
Computes the hash function for an END-digit base- +ALPH-SIZE+ number represented as a char array in time proportional to END. (We pass END as an argument so that we can use the function for both the pattern and the text.)
Initialize the bad character skip for the Boyer-Moore algorithm.
Initialize the good suffix skip function table for the Boyer-Moore algorithm.
Writes lambda function performing search over the given trie.
That function can later be compiled into native code and
funcall-ed. The generated function accepts a single string and returns
matching mark from the given trie.
Generated function is essentially a giant tagbody with jumps dispatched in case forms depending on the current character.
T if c is a newline character.
Print the given suffix TREE for the Graphviz visualization in the dot format.
Alternative representation of the suffix tree. Based on java implementation from: http://pastie.org/5925812
T if c is a punctuation character.
pc
.
sp
.
Execute a regular expression program.
Make dump of the SOR structure more human-readable: in the CIDX print characters and their positions, decipher LIM.
T if c is a whitespace character.
Split the given NODE at position POS within the node.
The +infinity+ is defined as the largest number (standard constant MOST-POSITIVE-FIXNUM), if the right index exceed to the length of the list, the result is from left to the end of the list.
Length of the substring in the arc.
Given the NODE lookup a child node that starts with the given char C in the given suffix tree TREE.
id
.
Adds a child node to the given NODE.
T if c is a tab character.
Returns a new state based on the given char from the current state.
Add a child node to the given node with the given label and mark.
Constructor can be either ‘MAKE-TRIE-NODE‘ or any other structure constructor derieved from the ‘TRIE-NODE‘ struct.
Dumps a textual representation of the Trie useful for making a graphical plot with Graphviz to the given stream.
Dumps output of the TRIE-DUMP-DOT function to the file with the specified file name.
Looks for a child of the given node trie with the given label.
fail
.
id
.
mark
.
We have to avoid standard Lisp printer because of the FAIL links that turn our tree into a network with cycles, throwing the default printer into an infinite loop.
eof
.
fill
.
pos
.
T if is alphanumeric or an underscore.
structure-object
.
(simple-array fixnum)
(make-array 0 :element-type (quote fixnum))
(simple-array fixnum)
(make-array 0 :element-type (quote fixnum))
simple-string
""
structure-object
.
simple-string
""
cl-string-match::rk-ub32
0
cl-string-match::rk-ub32
0
cl-string-match::rk-ub32
cl-string-match::+alph-size+
cl-string-match::rk-ub32
1
Index for the Shift-OR search operation.
Defines an Aho-Corasick DFA based on tables.
Slots:
* ‘start‘ - initial state
* ‘trans‘ - a table of transition tables
* ‘final‘ - marks final states
* ‘match-len‘ - stores length of the keyword corresponding to the final state
structure-object
.
fixnum
-1
(or null (simple-array (or null (simple-array fixnum))))
(simple-array fixnum)
(make-array 0 :element-type (quote fixnum))
(simple-array fixnum)
(make-array 0 :element-type (quote fixnum))
Encapsulates a buffer and the state of the line reader. Created by
the make-ub-line-reader function, and the every next line is obtained
by one of the ub-read-line-* functions.
You should not forget to close the underlying stream using either a
standard close function or the ub-line-reader-close function.
It is anticipated that the underlying stream is an input stream with element-type set to ub-char.
structure-object
.
common-lisp
.
(or null stream)
ascii-strings:ub-buffer
(make-array ascii-strings::+ub-line-reader-buffer-size+ :element-type (quote ascii-strings:ub-char))
common-lisp
.
fixnum
0
fixnum
0
boolean
Jump to: | %
(
A B C D E F G H I K M N O P R S T U W |
---|
Jump to: | %
(
A B C D E F G H I K M N O P R S T U W |
---|
Jump to: | *
+
@
A B C D E F G I L M N P R S T U |
---|
Jump to: | *
+
@
A B C D E F G I L M N P R S T U |
---|
Jump to: | A B C D F K M P R S T U |
---|
Jump to: | A B C D F K M P R S T U |
---|