Next: Introduction, Previous: (dir), Up: (dir) [Contents][Index]
This is the one-more-re-nightmare Reference Manual, generated automatically by Declt version 3.0 "Montgomery Scott" on Sun May 15 05:42:42 2022 GMT+0.
• Introduction | What one-more-re-nightmare is all about | |
• Systems | The systems documentation | |
• Modules | The modules documentation | |
• Files | The files documentation | |
• Packages | The packages documentation | |
• Definitions | The symbols documentation | |
• Indexes | Concepts, functions, variables and data types |
one-more-re-nightmare is a regular expression engine that uses the technique presented in Regular-expression derivatives revisited to interpret and compile regular expressions. We use a few tricks to make matching quite fast:
one-more-re-nightmare-simd
system on SBCL 2.1.10 or
newer with AVX2, we even use vectorised scanning of constant
prefixes of regular expressions.Thanks to Gilbert Baumann for suggesting I use derivatives to compile regular expressions, and then for informing me of how to handle submatching properly, and my discrete mathematics teachers for formally introducing me to finite state machines.
Please see the reference book for how to use one-more-re-nightmare, or an article on the history and theory involved.
While the syntax is admittedly wonky (but somewhat more like how
regular expressions are presented in papers), one-more-re-nightmare
makes its best effort to implement POSIX semantics for matching (as
described in the specification for how regcomp
works
and regular expression
definitions). Any
behaviour contrary to POSIX is a bug.
CL-USER> (let ((s (make-string 1000000 :initial-element #\a)))
(setf (aref s 333333) #\b)
(setf (aref s 555555) #\c)
(the-cost-of-nothing:bench
(all-string-matches "ab|ac" s)))
CL-USER> (let ((s (make-string 1000000 :initial-element #\a)))
(setf (aref s 333333) #\b)
(setf (aref s 555555) #\c)
(the-cost-of-nothing:bench
(cl-ppcre:all-matches-as-strings "ab|ac" s)))
Note that, by nature of calling the Common Lisp compiler, one-more-re-nightmare will take longer to compile a regular expression, so it is better suited for many matching operations with few expressions. It does cache compiled expressions when using the high-level interface, so the initial cost may amortize well over many calls; and constant regular expression strings are compiled at compile-time, with no runtime overhead whatsoever.
| engine | SBCL | Clozure CL | SBCL with AVX2 | ditto, SIMPLE-BASE-STRING | |------------------|-----------|------------|----------------|---------------------------| | o-m-r-n | 0.57ms | 2.93ms | 0.18ms | 55µs | | compilation time | 4.65ms | 3.76ms | 6.82ms | 6.43ms | | cl-ppcre | 22.8ms | 45.3ms | 22.8ms | 21.6ms | | break even after | 209kchars | 88.7kchars | 301kchars | 305kchars |
Next: Modules, Previous: Introduction, Up: Top [Contents][Index]
The main system appears first, followed by any subsystem dependency.
• The one-more-re-nightmare system |
Hayley Patton
BSD 2-clause
A regular expression compiler
one-more-re-nightmare.asd (file)
Modules are listed depth-first from the system components tree.
• The one-more-re-nightmare/dfa-construction module | ||
• The one-more-re-nightmare/compiler module | ||
• The one-more-re-nightmare/interface module |
Next: The one-more-re-nightmare/compiler module, Previous: Modules, Up: Modules [Contents][Index]
package.lisp (file)
one-more-re-nightmare (system)
DFA-construction/
Next: The one-more-re-nightmare/interface module, Previous: The one-more-re-nightmare/dfa-construction module, Up: Modules [Contents][Index]
dfa-construction (module)
one-more-re-nightmare (system)
Compiler/
Previous: The one-more-re-nightmare/compiler module, Up: Modules [Contents][Index]
compiler (module)
one-more-re-nightmare (system)
Interface/
Files are sorted by type and then listed depth-first from the systems components trees.
• Lisp files |
Next: The one-more-re-nightmare/package․lisp file, Previous: Lisp files, Up: Lisp files [Contents][Index]
one-more-re-nightmare.asd
one-more-re-nightmare (system)
Next: The one-more-re-nightmare/dfa-construction/type․lisp file, Previous: The one-more-re-nightmare․asd file, Up: Lisp files [Contents][Index]
one-more-re-nightmare (system)
package.lisp
Next: The one-more-re-nightmare/dfa-construction/sets․lisp file, Previous: The one-more-re-nightmare/package․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/type.lisp
Next: The one-more-re-nightmare/dfa-construction/re-types․lisp file, Previous: The one-more-re-nightmare/dfa-construction/type․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/sets.lisp
Next: The one-more-re-nightmare/dfa-construction/nullable․lisp file, Previous: The one-more-re-nightmare/dfa-construction/sets․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/re-types.lisp
Next: The one-more-re-nightmare/dfa-construction/derivative․lisp file, Previous: The one-more-re-nightmare/dfa-construction/re-types․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/nullable.lisp
Next: The one-more-re-nightmare/dfa-construction/derivative-classes․lisp file, Previous: The one-more-re-nightmare/dfa-construction/nullable․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/derivative.lisp
Next: The one-more-re-nightmare/dfa-construction/empty․lisp file, Previous: The one-more-re-nightmare/dfa-construction/derivative․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/derivative-classes.lisp
Next: The one-more-re-nightmare/dfa-construction/effects․lisp file, Previous: The one-more-re-nightmare/dfa-construction/derivative-classes․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/empty.lisp
re-empty-p (function)
Next: The one-more-re-nightmare/dfa-construction/similar․lisp file, Previous: The one-more-re-nightmare/dfa-construction/empty․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/effects.lisp
Next: The one-more-re-nightmare/dfa-construction/tag-sets․lisp file, Previous: The one-more-re-nightmare/dfa-construction/effects․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/similar.lisp
Next: The one-more-re-nightmare/dfa-construction/make-dfa․lisp file, Previous: The one-more-re-nightmare/dfa-construction/similar․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/tag-sets.lisp
Next: The one-more-re-nightmare/compiler/layout․lisp file, Previous: The one-more-re-nightmare/dfa-construction/tag-sets․lisp file, Up: Lisp files [Contents][Index]
dfa-construction (module)
DFA-construction/make-dfa.lisp
Next: The one-more-re-nightmare/compiler/compilation-strategy․lisp file, Previous: The one-more-re-nightmare/dfa-construction/make-dfa․lisp file, Up: Lisp files [Contents][Index]
compiler (module)
Compiler/layout.lisp
Next: The one-more-re-nightmare/compiler/length-inference․lisp file, Previous: The one-more-re-nightmare/compiler/layout․lisp file, Up: Lisp files [Contents][Index]
compiler (module)
Compiler/compilation-strategy.lisp
Next: The one-more-re-nightmare/compiler/optimize-settings․lisp file, Previous: The one-more-re-nightmare/compiler/compilation-strategy․lisp file, Up: Lisp files [Contents][Index]
compiler (module)
Compiler/length-inference.lisp
Next: The one-more-re-nightmare/compiler/code-generation․lisp file, Previous: The one-more-re-nightmare/compiler/length-inference․lisp file, Up: Lisp files [Contents][Index]
compiler (module)
Compiler/optimize-settings.lisp
Next: The one-more-re-nightmare/interface/syntax․lisp file, Previous: The one-more-re-nightmare/compiler/optimize-settings․lisp file, Up: Lisp files [Contents][Index]
compiler (module)
Compiler/code-generation.lisp
compile-regular-expression (function)
Next: The one-more-re-nightmare/interface/convert-to-bytes․lisp file, Previous: The one-more-re-nightmare/compiler/code-generation․lisp file, Up: Lisp files [Contents][Index]
interface (module)
Interface/syntax.lisp
Next: The one-more-re-nightmare/interface/code-cache․lisp file, Previous: The one-more-re-nightmare/interface/syntax․lisp file, Up: Lisp files [Contents][Index]
interface (module)
Interface/convert-to-bytes.lisp
Next: The one-more-re-nightmare/interface/lint․lisp file, Previous: The one-more-re-nightmare/interface/convert-to-bytes․lisp file, Up: Lisp files [Contents][Index]
interface (module)
Interface/code-cache.lisp
Next: The one-more-re-nightmare/interface/interface․lisp file, Previous: The one-more-re-nightmare/interface/code-cache․lisp file, Up: Lisp files [Contents][Index]
interface (module)
Interface/lint.lisp
Previous: The one-more-re-nightmare/interface/lint․lisp file, Up: Lisp files [Contents][Index]
interface (module)
Interface/interface.lisp
Next: Definitions, Previous: Files, Up: Top [Contents][Index]
Packages are listed by definition order.
• The one-more-re-nightmare package |
package.lisp (file)
common-lisp
Definitions are sorted by export status, category, package, and then by lexicographic order.
• Exported definitions | ||
• Internal definitions |
Next: Internal definitions, Previous: Definitions, Up: Definitions [Contents][Index]
• Exported macros | ||
• Exported compiler macros | ||
• Exported functions |
Next: Exported compiler macros, Previous: Exported definitions, Up: Exported definitions [Contents][Index]
interface.lisp (file)
Next: Exported functions, Previous: Exported macros, Up: Exported definitions [Contents][Index]
interface.lisp (file)
interface.lisp (file)
interface.lisp (file)
interface.lisp (file)
Previous: Exported compiler macros, Up: Exported definitions [Contents][Index]
Returns a list of match vectors of every match
interface.lisp (file)
interface.lisp (file)
code-generation.lisp (file)
Returns the start, end positions and submatches of the first match, or NIL, NIL and NIL
interface.lisp (file)
Returns the first match or NIL
interface.lisp (file)
Previous: Exported definitions, Up: Definitions [Contents][Index]
• Internal constants | ||
• Internal special variables | ||
• Internal macros | ||
• Internal functions | ||
• Internal generic functions | ||
• Internal conditions | ||
• Internal structures | ||
• Internal classes |
Next: Internal special variables, Previous: Internal definitions, Up: Internal definitions [Contents][Index]
sets.lisp (file)
type.lisp (file)
sets.lisp (file)
Next: Internal macros, Previous: Internal constants, Up: Internal definitions [Contents][Index]
tag-sets.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
code-cache.lisp (file)
code-cache.lisp (file)
code-generation.lisp (file)
compilation-strategy.lisp (file)
layout.lisp (file)
derivative.lisp (file)
derivative-classes.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
similar.lisp (file)
nullable.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
compilation-strategy.lisp (file)
re-types.lisp (file)
syntax.lisp (file)
code-generation.lisp (file)
optimize-settings.lisp (file)
make-dfa.lisp (file)
code-cache.lisp (file)
re-types.lisp (file)
type.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
Next: Internal functions, Previous: Internal special variables, Up: Internal definitions [Contents][Index]
Set up global tables for testing.
type.lisp (file)
interface.lisp (file)
type.lisp (file)
type.lisp (file)
type.lisp (file)
sets.lisp (file)
Combine the sets A and B by the Boolean operator op, which should be a
valid argument to the BOOLE function. An integer x is member of the
resulting set iff
(logbitp 0 (boole op (if (isum-member x A) 1 0) (if (isum-member x B) 1 0)))
is non-NIL. That way e.g. boole-ior denotes the union.
sets.lisp (file)
interface.lisp (file)
interface.lisp (file)
type.lisp (file)
type.lisp (file)
optimize-settings.lisp (file)
type.lisp (file)
Next: Internal generic functions, Previous: Internal macros, Up: Internal definitions [Contents][Index]
interface.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
code-generation.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
interface.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
similar.lisp (file)
re-types.lisp (file)
compilation-strategy.lisp (file)
make-dfa.lisp (file)
re-types.lisp (file)
similar.lisp (file)
Ensure that we don’t unify POSITION and a variable, or two different variables.
similar.lisp (file)
re-types.lisp (file)
nullable.lisp (file)
syntax.lisp (file)
length-inference.lisp (file)
length-inference.lisp (file)
layout.lisp (file)
make-dfa.lisp (file)
Compute the derivative of a regular expression with regards to the set (i.e. the regular expression should be matched after a character in the set is matched).
derivative.lisp (file)
derivative.lisp (file)
Produce a list of the ’classes’ (sets) of characters that compiling the regular expression would have to dispatch on.
derivative-classes.lisp (file)
effects.lisp (file)
re-types.lisp (file)
syntax.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
code-cache.lisp (file)
code-generation.lisp (file)
Find another state which we can re-use with some transformation, returning that state and the required transformation.
make-dfa.lisp (file)
code-generation.lisp (file)
code-generation.lisp (file)
Manually constant fold out (OR A NIL) to A. The compiler can do this, but generated code looks nicer with folding.
sets.lisp (file)
Replicate any assignments, turning T_n <- s for all s into T^r_n <- T_n for some arbitrary r
tag-sets.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
effects.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
re-types.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
layout.lisp (file)
lint.lisp (file)
re-types.lisp (file)
code-generation.lisp (file)
compilation-strategy.lisp (file)
make-dfa.lisp (file)
make-dfa.lisp (file)
layout.lisp (file)
compilation-strategy.lisp (file)
sets.lisp (file)
make-dfa.lisp (file)
tag-sets.lisp (file)
code-cache.lisp (file)
Produce a list of every subset of sets1 and sets2.
derivative-classes.lisp (file)
tag-sets.lisp (file)
tag-sets.lisp (file)
syntax.lisp (file)
(language-of (nullable RE)) = (language-of (both RE (empty-string)))
nullable.lisp (file)
syntax.lisp (file)
make-dfa.lisp (file)
sets.lisp (file)
Is a regular expression basically an empty string?
This is different to NULLABLE, yes, NULLABLE would accept e.g. a* or anything that is a superset of { "" }, but this accepts only the empty string (± tags).
empty.lisp (file)
interface.lisp (file)
make-dfa.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
sets.lisp (file)
sets.lisp (file)
sets.lisp (file)
sets.lisp (file)
code-generation.lisp (file)
similar.lisp (file)
Returns the ISUM, that contains only /x/.
sets.lisp (file)
length-inference.lisp (file)
code-cache.lisp (file)
re-types.lisp (file)
interface.lisp (file)
Returns the ISUM, that contains every code point that is in [from, below)
sets.lisp (file)
sets.lisp (file)
sets.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
make-dfa.lisp (file)
make-dfa.lisp (file)
make-dfa.lisp (file)
make-dfa.lisp (file)
Make assignments unique, turning T_n <- s for all s into T^r_n <- s
tag-sets.lisp (file)
tag-sets.lisp (file)
re-types.lisp (file)
tag-sets.lisp (file)
code-generation.lisp (file)
code-generation.lisp (file)
Next: Internal conditions, Previous: Internal functions, Up: Internal definitions [Contents][Index]
automatically generated reader method
type.lisp (file)
automatically generated writer method
type.lisp (file)
compilation-strategy.lisp (file)
Compute a list of states to start compiling from.
compilation-strategy.lisp (file)
The lambda list of the function to generate.
compilation-strategy.lisp (file)
A list of macros (at least using including WIN and RESTART) to use for compilation.
compilation-strategy.lisp (file)
append (short method combination)
Options: :most-specific-first
code-generation.lisp (file)
automatically generated reader method
length-inference.lisp (file)
automatically generated writer method
length-inference.lisp (file)
automatically generated reader method
code-generation.lisp (file)
automatically generated writer method
code-generation.lisp (file)
automatically generated reader method
length-inference.lisp (file)
automatically generated writer method
length-inference.lisp (file)
Part of a TAGBODY body used to start running a DFA.
compilation-strategy.lisp (file)
code-generation.lisp (file)
code-generation.lisp (file)
automatically generated reader method
make-dfa.lisp (file)
automatically generated writer method
make-dfa.lisp (file)
automatically generated reader method
make-dfa.lisp (file)
automatically generated writer method
make-dfa.lisp (file)
automatically generated reader method
make-dfa.lisp (file)
automatically generated writer method
make-dfa.lisp (file)
automatically generated reader method
code-generation.lisp (file)
automatically generated reader method
make-dfa.lisp (file)
automatically generated writer method
make-dfa.lisp (file)
compilation-strategy.lisp (file)
code-generation.lisp (file)
automatically generated reader method
code-generation.lisp (file)
automatically generated reader method
code-generation.lisp (file)
lint.lisp (file)
Next: Internal structures, Previous: Internal generic functions, Up: Internal definitions [Contents][Index]
lint.lisp (file)
style-warning (condition)
lint.lisp (file)
style-warning (condition)
warning-group-number (method)
:n
warning-group-number (generic function)
similar.lisp (file)
condition (condition)
Next: Internal classes, Previous: Internal conditions, Up: Internal definitions [Contents][Index]
A structure representing the type and accessors for a vector of some sort.
layout.lisp (file)
structure-object (structure)
(quote (simple-array character 1))
layout-array-type (function)
(setf layout-array-type) (function)
(quote aref)
layout-ref (function)
(setf layout-ref) (function)
(quote code-char)
layout-from-number (function)
(setf layout-from-number) (function)
(quote char-code)
layout-to-number (function)
(setf layout-to-number) (function)
(quote <)
layout-less (function)
(setf layout-less) (function)
(quote <=)
layout-less-or-equal (function)
(setf layout-less-or-equal) (function)
(quote =)
layout-equal (function)
(setf layout-equal) (function)
make-dfa.lisp (file)
structure-object (structure)
transition-class (function)
(setf transition-class) (function)
transition-next-state (function)
(setf transition-next-state) (function)
transition-tags-to-set (function)
(setf transition-tags-to-set) (function)
Previous: Internal structures, Up: Internal definitions [Contents][Index]
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
A compilation strategy which calls a continuation when a match is found.
compilation-strategy.lisp (file)
strategy (class)
code-generation.lisp (file)
standard-object (class)
(make-hash-table :test (quote equal))
variable-names (generic function)
(make-hash-table :test (quote equal))
state-names (generic function)
0
next-state-name (generic function)
(setf next-state-name) (generic function)
:variable-map
variable-map (generic function)
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
re-types.lisp (file)
regular-expression (class)
print-object (method)
length-inference.lisp (file)
standard-object (class)
state (class)
predecessors (generic function)
(setf predecessors) (generic function)
minimum-length (generic function)
(setf minimum-length) (generic function)
re-types.lisp (file)
regular-expression (class)
print-object (method)
type.lisp (file)
standard-object (class)
one-more-re-nightmare::+uncomputed+
cached-nullable (generic function)
(setf cached-nullable) (generic function)
one-more-re-nightmare::+uncomputed+
cached-nullable-no-gensym (generic function)
(setf cached-nullable-no-gensym) (generic function)
one-more-re-nightmare::+uncomputed+
cached-used-tags (generic function)
(setf cached-used-tags) (generic function)
one-more-re-nightmare::+uncomputed+
cached-tags (generic function)
(setf cached-tags) (generic function)
one-more-re-nightmare::+uncomputed+
cached-removed-tags (generic function)
(setf cached-removed-tags) (generic function)
one-more-re-nightmare::+uncomputed+
cached-has-tags-p (generic function)
(setf cached-has-tags-p) (generic function)
A compilation strategy which runs a regular expression vector over every position.
compilation-strategy.lisp (file)
strategy (class)
make-dfa.lisp (file)
:exit-map
state-exit-map (generic function)
(setf state-exit-map) (generic function)
:exit-effects
state-exit-effects (generic function)
(setf state-exit-effects) (generic function)
:expression
state-expression (generic function)
(setf state-expression) (generic function)
(quote nil)
state-transitions (generic function)
(setf state-transitions) (generic function)
A compilation strategy describes how potential matches should be searched for.
compilation-strategy.lisp (file)
standard-object (class)
re-types.lisp (file)
regular-expression (class)
print-object (method)
Previous: Definitions, Up: Top [Contents][Index]
• Concept index | ||
• Function index | ||
• Variable index | ||
• Data type index |
Next: Function index, Previous: Indexes, Up: Indexes [Contents][Index]
Jump to: | F L M O |
---|
Jump to: | F L M O |
---|
Next: Variable index, Previous: Concept index, Up: Indexes [Contents][Index]
Jump to: | %
(
A B C D E F G H I J K L M N P R S T U V W |
---|
Jump to: | %
(
A B C D E F G H I J K L M N P R S T U V W |
---|
Next: Data type index, Previous: Function index, Up: Indexes [Contents][Index]
Jump to: | %
*
+
A C E F H L M N P R S T V |
---|
Jump to: | %
*
+
A C E F H L M N P R S T V |
---|
Previous: Variable index, Up: Indexes [Contents][Index]
Jump to: | A B C E G I J K L N O P R S T |
---|
Jump to: | A B C E G I J K L N O P R S T |
---|