This is the cl-ppcre Reference Manual, version 2.1.1, generated automatically by Declt version 4.0 beta 2 "William Riker" on Sun Sep 15 04:21:19 2024 GMT+0.
The main system appears first, followed by any subsystem dependency.
cl-ppcre
Perl-compatible regular expression library
Dr. Edi Weitz
BSD
2.1.1
packages.lisp
(file).
specials.lisp
(file).
util.lisp
(file).
errors.lisp
(file).
charset.lisp
(file).
charmap.lisp
(file).
chartest.lisp
(file).
lexer.lisp
(file).
parser.lisp
(file).
regex-class.lisp
(file).
regex-class-util.lisp
(file).
convert.lisp
(file).
optimize.lisp
(file).
closures.lisp
(file).
repetition-closures.lisp
(file).
scanner.lisp
(file).
api.lisp
(file).
Files are sorted by type and then listed depth-first from the systems' component trees.
cl-ppcre/cl-ppcre.asd
cl-ppcre/packages.lisp
cl-ppcre/specials.lisp
cl-ppcre/util.lisp
cl-ppcre/errors.lisp
cl-ppcre/charset.lisp
cl-ppcre/charmap.lisp
cl-ppcre/chartest.lisp
cl-ppcre/lexer.lisp
cl-ppcre/parser.lisp
cl-ppcre/regex-class.lisp
cl-ppcre/regex-class-util.lisp
cl-ppcre/convert.lisp
cl-ppcre/optimize.lisp
cl-ppcre/closures.lisp
cl-ppcre/repetition-closures.lisp
cl-ppcre/scanner.lisp
cl-ppcre/api.lisp
cl-ppcre/specials.lisp
packages.lisp
(file).
cl-ppcre
(system).
*allow-named-registers*
(special variable).
*allow-quoting*
(special variable).
*optimize-char-classes*
(special variable).
*property-resolver*
(special variable).
*regex-char-code-limit*
(special variable).
*use-bmh-matchers*
(special variable).
*end-pos*
(special variable).
*end-string-pos*
(special variable).
*extended-mode-p*
(special variable).
*hyperdoc-base-uri*
(special variable).
*last-pos-stores*
(special variable).
*real-start-pos*
(special variable).
*reg-ends*
(special variable).
*reg-starts*
(special variable).
*regs-maybe-start*
(special variable).
*rep-num*
(special variable).
*repeat-counters*
(special variable).
*special-optimize-settings*
(special variable).
*standard-optimize-settings*
(special variable).
*start-pos*
(special variable).
*string*
(special variable).
*zero-length-num*
(special variable).
hyperdoc-lookup
(function).
cl-ppcre/util.lisp
specials.lisp
(file).
cl-ppcre
(system).
+whitespace-char-string+
(constant).
complement*
(function).
defconstant
(macro).
digit-char-p
(function).
maybe-coerce-to-simple-string
(macro).
normalize-var-list
(function).
nsubseq
(function).
string-list-to-simple-string
(function).
whitespacep
(function).
with-rebinding
(macro).
with-unique-names
(macro).
word-char-p
(function).
cl-ppcre/errors.lisp
util.lisp
(file).
cl-ppcre
(system).
ppcre-error
(condition).
ppcre-invocation-error
(condition).
ppcre-syntax-error
(condition).
ppcre-syntax-error-pos
(reader method).
ppcre-syntax-error-string
(reader method).
*syntax-error-string*
(special variable).
signal-invocation-error
(macro).
signal-syntax-error
(macro).
signal-syntax-error*
(macro).
cl-ppcre/charset.lisp
errors.lisp
(file).
cl-ppcre
(system).
make-load-form
(method).
%add-to-charset
(function).
%add-to-charset/expand
(function).
+probe-depth+
(constant).
add-to-charset
(function).
charset
(structure).
charset-count
(reader).
(setf charset-count)
(writer).
charset-depth
(reader).
(setf charset-depth)
(writer).
charset-p
(function).
charset-vector
(reader).
(setf charset-vector)
(writer).
compute-index
(function).
copy-charset
(function).
create-charset-from-test-function
(function).
in-charset-p
(function).
make-char-vector
(function).
make-charset
(function).
map-charset
(function).
mix
(function).
cl-ppcre/charmap.lisp
charset.lisp
(file).
cl-ppcre
(system).
make-load-form
(method).
charmap
(structure).
charmap-complementp
(reader).
(setf charmap-complementp)
(writer).
charmap-contents
(function).
charmap-count
(reader).
(setf charmap-count)
(writer).
charmap-end
(reader).
(setf charmap-end)
(writer).
charmap-p
(function).
charmap-start
(reader).
(setf charmap-start)
(writer).
charmap-vector
(reader).
(setf charmap-vector)
(writer).
copy-charmap
(function).
create-charmap-from-test-function
(function).
in-charmap-p
(function).
make-charmap
(function).
make-charmap%
(function).
cl-ppcre/chartest.lisp
charmap.lisp
(file).
cl-ppcre
(system).
create-optimized-test-function
(function).
create-hash-table-from-test-function
(function).
cl-ppcre/lexer.lisp
(:not :use-acl-regexp2-engine)
chartest.lisp
(file).
cl-ppcre
(system).
collect-char-class
(function).
copy-lexer
(function).
end-of-string-p
(function).
fail
(function).
get-number
(function).
get-quantifier
(function).
get-token
(function).
lexer
(structure).
lexer-last-pos
(reader).
(setf lexer-last-pos)
(writer).
lexer-len
(reader).
lexer-p
(function).
lexer-pos
(reader).
(setf lexer-pos)
(writer).
lexer-reg
(reader).
(setf lexer-reg)
(writer).
lexer-str
(reader).
looking-at-p
(function).
make-char-from-code
(function).
make-lexer
(function).
make-lexer-internal
(function).
map-char-to-special-char-class
(function).
maybe-parse-flags
(function).
next-char
(function).
next-char-non-extended
(function).
parse-register-name-aux
(function).
read-char-property
(function).
start-of-subexpr-p
(function).
try-number
(function).
unescape-char
(function).
unget-token
(function).
cl-ppcre/parser.lisp
(:not :use-acl-regexp2-engine)
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
parse-string
(function).
greedy-quant
(function).
group
(function).
quant
(function).
reg-expr
(function).
seq
(function).
cl-ppcre/regex-class.lisp
(:not :use-acl-regexp2-engine)
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
initialize-instance
(method).
print-object
(method).
print-object
(method).
print-object
(method).
alternation
(class).
anchor
(class).
back-reference
(class).
branch
(class).
case-insensitive-p
(reader method).
case-insensitive-p
(reader method).
char-class
(class).
choices
(reader method).
(setf choices)
(writer method).
contains-register-p
(reader method).
elements
(reader method).
(setf elements)
(writer method).
else-regex
(reader method).
(setf else-regex)
(writer method).
everything
(class).
filter
(class).
fn
(reader method).
(setf fn)
(writer method).
greedyp
(reader method).
len
(reader method).
len
(reader method).
len
(reader method).
len
(reader method).
(setf len)
(writer method).
(setf len)
(writer method).
lookahead
(class).
lookbehind
(class).
maximum
(reader method).
(setf maximum)
(writer method).
min-len
(reader method).
min-rest
(reader method).
(setf min-rest)
(writer method).
minimum
(reader method).
(setf minimum)
(writer method).
multi-line-p
(reader method).
name
(reader method).
name
(reader method).
(setf name)
(writer method).
negatedp
(reader method).
no-newline-p
(reader method).
num
(reader method).
num
(reader method).
(setf num)
(writer method).
offset
(reader method).
(setf offset)
(writer method).
positivep
(reader method).
positivep
(reader method).
regex
(reader method).
regex
(reader method).
regex
(reader method).
regex
(reader method).
regex
(reader method).
(setf regex)
(writer method).
(setf regex)
(writer method).
(setf regex)
(writer method).
(setf regex)
(writer method).
(setf regex)
(writer method).
regex
(class).
register
(class).
repetition
(class).
seq
(class).
single-line-p
(reader method).
skip
(reader method).
(setf skip)
(writer method).
standalone
(class).
start-of-end-string-p
(reader method).
(setf start-of-end-string-p)
(writer method).
startp
(reader method).
str
(reader method).
(setf str)
(writer method).
str
(class).
test
(reader method).
(setf test)
(writer method).
test-function
(reader method).
then-regex
(reader method).
(setf then-regex)
(writer method).
void
(class).
word-boundary
(class).
cl-ppcre/regex-class-util.lisp
(:not :use-acl-regexp2-engine)
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
case-mode
(generic function).
compute-offsets
(generic function).
copy-regex
(generic function).
everythingp
(generic function).
len
(method).
regex-length
(generic function).
regex-min-length
(generic function).
remove-registers
(generic function).
skip
(method).
start-of-end-string-p
(method).
str
(method).
cl-ppcre/convert.lisp
(:not :use-acl-regexp2-engine)
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
case-insensitive-mode-p
(macro).
convert
(function).
convert-aux
(function).
convert-char-class-to-test-function
(function).
convert-compound-parse-tree
(generic function).
convert-simple-parse-tree
(generic function).
maybe-accumulate
(function).
maybe-split-repetition
(function).
multi-line-mode-p
(macro).
resolve-property
(generic function).
set-flag
(function).
single-line-mode-p
(macro).
cl-ppcre/optimize.lisp
(:not :use-acl-regexp2-engine)
convert.lisp
(file).
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
compute-min-rest
(generic function).
end-string
(function).
end-string-aux
(generic function).
flatten
(generic function).
gather-strings
(generic function).
start-anchored-p
(generic function).
cl-ppcre/closures.lisp
(:not :use-acl-regexp2-engine)
optimize.lisp
(file).
convert.lisp
(file).
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
*string*-equal
(function).
*string*=
(function).
create-matcher-aux
(generic function).
insert-char-class-tester
(macro).
word-boundary-p
(function).
cl-ppcre/repetition-closures.lisp
(:not :use-acl-regexp2-engine)
closures.lisp
(file).
optimize.lisp
(file).
convert.lisp
(file).
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
constant-repetition-constant-length-closure
(macro).
create-constant-repetition-constant-length-matcher
(generic function).
create-constant-repetition-matcher
(generic function).
create-greedy-constant-length-matcher
(generic function).
create-greedy-everything-matcher
(function).
create-greedy-matcher
(generic function).
create-greedy-no-zero-matcher
(generic function).
create-matcher-aux
(method).
create-non-greedy-constant-length-matcher
(generic function).
create-non-greedy-matcher
(generic function).
create-non-greedy-no-zero-matcher
(generic function).
greedy-constant-length-closure
(macro).
incf-after
(macro).
non-greedy-constant-length-closure
(macro).
cl-ppcre/scanner.lisp
(:not :use-acl-regexp2-engine)
repetition-closures.lisp
(file).
closures.lisp
(file).
optimize.lisp
(file).
convert.lisp
(file).
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
bmh-matcher-aux
(macro).
char-searcher-aux
(macro).
create-bmh-matcher
(function).
create-char-searcher
(function).
create-scanner-aux
(function).
insert-advance-fn
(macro).
newline-skipper
(function).
cl-ppcre/api.lisp
scanner.lisp
(file).
repetition-closures.lisp
(file).
closures.lisp
(file).
optimize.lisp
(file).
convert.lisp
(file).
regex-class-util.lisp
(file).
regex-class.lisp
(file).
parser.lisp
(file).
lexer.lisp
(file).
chartest.lisp
(file).
cl-ppcre
(system).
*look-ahead-for-suffix*
(special variable).
all-matches
(compiler macro).
all-matches
(function).
all-matches-as-strings
(compiler macro).
all-matches-as-strings
(function).
count-matches
(compiler macro).
count-matches
(function).
create-scanner
(generic function).
define-parse-tree-synonym
(macro).
do-matches
(macro).
do-matches-as-strings
(macro).
do-register-groups
(macro).
do-scans
(macro).
parse-tree-synonym
(function).
(setf parse-tree-synonym)
(function).
quote-meta-chars
(function).
regex-apropos
(function).
regex-apropos-list
(function).
regex-replace
(compiler macro).
regex-replace
(function).
regex-replace-all
(compiler macro).
regex-replace-all
(function).
register-groups-bind
(macro).
scan
(compiler macro).
scan
(generic function).
scan-to-strings
(compiler macro).
scan-to-strings
(function).
split
(compiler macro).
split
(function).
build-replacement
(function).
build-replacement-template
(generic function).
clean-comments
(function).
print-symbol-info
(function).
quote-sections
(function).
regex-apropos-aux
(macro).
replace-aux
(function).
string-case-modifier
(function).
Packages are listed by definition order.
cl-ppcre
ppcre
common-lisp
.
*allow-named-registers*
(special variable).
*allow-quoting*
(special variable).
*look-ahead-for-suffix*
(special variable).
*optimize-char-classes*
(special variable).
*property-resolver*
(special variable).
*regex-char-code-limit*
(special variable).
*use-bmh-matchers*
(special variable).
all-matches
(compiler macro).
all-matches
(function).
all-matches-as-strings
(compiler macro).
all-matches-as-strings
(function).
count-matches
(compiler macro).
count-matches
(function).
create-optimized-test-function
(function).
create-scanner
(generic function).
define-parse-tree-synonym
(macro).
do-matches
(macro).
do-matches-as-strings
(macro).
do-register-groups
(macro).
do-scans
(macro).
parse-string
(function).
parse-tree-synonym
(function).
(setf parse-tree-synonym)
(function).
ppcre-error
(condition).
ppcre-invocation-error
(condition).
ppcre-syntax-error
(condition).
ppcre-syntax-error-pos
(generic reader).
ppcre-syntax-error-string
(generic reader).
quote-meta-chars
(function).
regex-apropos
(function).
regex-apropos-list
(function).
regex-replace
(compiler macro).
regex-replace
(function).
regex-replace-all
(compiler macro).
regex-replace-all
(function).
register-groups-bind
(macro).
scan
(compiler macro).
scan
(generic function).
scan-to-strings
(compiler macro).
scan-to-strings
(function).
split
(compiler macro).
split
(function).
%add-to-charset
(function).
%add-to-charset/expand
(function).
*end-pos*
(special variable).
*end-string-pos*
(special variable).
*extended-mode-p*
(special variable).
*hyperdoc-base-uri*
(special variable).
*last-pos-stores*
(special variable).
*real-start-pos*
(special variable).
*reg-ends*
(special variable).
*reg-starts*
(special variable).
*regs-maybe-start*
(special variable).
*rep-num*
(special variable).
*repeat-counters*
(special variable).
*special-optimize-settings*
(special variable).
*standard-optimize-settings*
(special variable).
*start-pos*
(special variable).
*string*
(special variable).
*string*-equal
(function).
*string*=
(function).
*syntax-error-string*
(special variable).
*zero-length-num*
(special variable).
+probe-depth+
(constant).
+whitespace-char-string+
(constant).
add-to-charset
(function).
alternation
(class).
anchor
(class).
back-reference
(class).
bmh-matcher-aux
(macro).
branch
(class).
build-replacement
(function).
build-replacement-template
(generic function).
case-insensitive-mode-p
(macro).
case-insensitive-p
(generic reader).
case-mode
(generic function).
char-class
(class).
char-searcher-aux
(macro).
charmap
(structure).
charmap-complementp
(reader).
(setf charmap-complementp)
(writer).
charmap-contents
(function).
charmap-count
(reader).
(setf charmap-count)
(writer).
charmap-end
(reader).
(setf charmap-end)
(writer).
charmap-p
(function).
charmap-start
(reader).
(setf charmap-start)
(writer).
charmap-vector
(reader).
(setf charmap-vector)
(writer).
charset
(structure).
charset-count
(reader).
(setf charset-count)
(writer).
charset-depth
(reader).
(setf charset-depth)
(writer).
charset-p
(function).
charset-vector
(reader).
(setf charset-vector)
(writer).
choices
(generic reader).
(setf choices)
(generic writer).
clean-comments
(function).
collect-char-class
(function).
complement*
(function).
compute-index
(function).
compute-min-rest
(generic function).
compute-offsets
(generic function).
constant-repetition-constant-length-closure
(macro).
contains-register-p
(generic reader).
convert
(function).
convert-aux
(function).
convert-char-class-to-test-function
(function).
convert-compound-parse-tree
(generic function).
convert-simple-parse-tree
(generic function).
copy-charmap
(function).
copy-charset
(function).
copy-lexer
(function).
copy-regex
(generic function).
create-bmh-matcher
(function).
create-char-searcher
(function).
create-charmap-from-test-function
(function).
create-charset-from-test-function
(function).
create-constant-repetition-constant-length-matcher
(generic function).
create-constant-repetition-matcher
(generic function).
create-greedy-constant-length-matcher
(generic function).
create-greedy-everything-matcher
(function).
create-greedy-matcher
(generic function).
create-greedy-no-zero-matcher
(generic function).
create-hash-table-from-test-function
(function).
create-matcher-aux
(generic function).
create-non-greedy-constant-length-matcher
(generic function).
create-non-greedy-matcher
(generic function).
create-non-greedy-no-zero-matcher
(generic function).
create-scanner-aux
(function).
defconstant
(macro).
digit-char-p
(function).
elements
(generic reader).
(setf elements)
(generic writer).
else-regex
(generic reader).
(setf else-regex)
(generic writer).
end-of-string-p
(function).
end-string
(function).
end-string-aux
(generic function).
everything
(class).
everythingp
(generic function).
fail
(function).
filter
(class).
flatten
(generic function).
fn
(generic reader).
(setf fn)
(generic writer).
gather-strings
(generic function).
get-number
(function).
get-quantifier
(function).
get-token
(function).
greedy-constant-length-closure
(macro).
greedy-quant
(function).
greedyp
(generic reader).
group
(function).
hyperdoc-lookup
(function).
in-charmap-p
(function).
in-charset-p
(function).
incf-after
(macro).
insert-advance-fn
(macro).
insert-char-class-tester
(macro).
len
(generic function).
(setf len)
(generic writer).
lexer
(structure).
lexer-last-pos
(reader).
(setf lexer-last-pos)
(writer).
lexer-len
(reader).
lexer-p
(function).
lexer-pos
(reader).
(setf lexer-pos)
(writer).
lexer-reg
(reader).
(setf lexer-reg)
(writer).
lexer-str
(reader).
lookahead
(class).
lookbehind
(class).
looking-at-p
(function).
make-char-from-code
(function).
make-char-vector
(function).
make-charmap
(function).
make-charmap%
(function).
make-charset
(function).
make-lexer
(function).
make-lexer-internal
(function).
map-char-to-special-char-class
(function).
map-charset
(function).
maximum
(generic reader).
(setf maximum)
(generic writer).
maybe-accumulate
(function).
maybe-coerce-to-simple-string
(macro).
maybe-parse-flags
(function).
maybe-split-repetition
(function).
min-len
(generic reader).
min-rest
(generic reader).
(setf min-rest)
(generic writer).
minimum
(generic reader).
(setf minimum)
(generic writer).
mix
(function).
multi-line-mode-p
(macro).
multi-line-p
(generic reader).
name
(generic reader).
(setf name)
(generic writer).
negatedp
(generic reader).
newline-skipper
(function).
next-char
(function).
next-char-non-extended
(function).
no-newline-p
(generic reader).
non-greedy-constant-length-closure
(macro).
normalize-var-list
(function).
nsubseq
(function).
num
(generic reader).
(setf num)
(generic writer).
offset
(generic reader).
(setf offset)
(generic writer).
parse-register-name-aux
(function).
positivep
(generic reader).
print-symbol-info
(function).
quant
(function).
quote-sections
(function).
read-char-property
(function).
reg-expr
(function).
regex
(generic reader).
(setf regex)
(generic writer).
regex
(class).
regex-apropos-aux
(macro).
regex-length
(generic function).
regex-min-length
(generic function).
register
(class).
remove-registers
(generic function).
repetition
(class).
replace-aux
(function).
resolve-property
(generic function).
seq
(function).
seq
(class).
set-flag
(function).
signal-invocation-error
(macro).
signal-syntax-error
(macro).
signal-syntax-error*
(macro).
single-line-mode-p
(macro).
single-line-p
(generic reader).
skip
(generic function).
(setf skip)
(generic writer).
standalone
(class).
start-anchored-p
(generic function).
start-of-end-string-p
(generic function).
(setf start-of-end-string-p)
(generic writer).
start-of-subexpr-p
(function).
startp
(generic reader).
str
(generic function).
(setf str)
(generic writer).
str
(class).
string-case-modifier
(function).
string-list-to-simple-string
(function).
test
(generic reader).
(setf test)
(generic writer).
test-function
(generic reader).
then-regex
(generic reader).
(setf then-regex)
(generic writer).
try-number
(function).
unescape-char
(function).
unget-token
(function).
void
(class).
whitespacep
(function).
with-rebinding
(macro).
with-unique-names
(macro).
word-boundary
(class).
word-boundary-p
(function).
word-char-p
(function).
Definitions are sorted by export status, category, package, and then by lexicographic order.
Whether the parser should support AllegroCL’s named registers (?<name>"<regex>") and back-reference \k<name> syntax.
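A minimal sketch of the named-register switch, assuming ASDF can locate the cl-ppcre system. The function name NAMED-REGISTER-DEMO and the sample regex are hypothetical. The variable is bound around an explicit CREATE-SCANNER call because that is where the regex string gets parsed.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun named-register-demo ()
  "Return the list of register names and the first match of a date-like regex."
  ;; The switch must be true while the regex string is parsed.
  (let ((ppcre:*allow-named-registers* t))
    (multiple-value-bind (scanner names)
        (ppcre:create-scanner "(?<year>\\d{4})-(?<month>\\d{2})")
      ;; Only the first value of SCAN-TO-STRINGS (the whole match) is kept.
      (values names
              (ppcre:scan-to-strings scanner "built on 2024-09")))))
```

Calling (named-register-demo) should yield the name list ("year" "month") as its first value and the matched substring as its second.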
Whether the parser should support Perl’s \Q and \E.
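A short sketch of \Q...\E quoting, assuming ASDF can locate the cl-ppcre system. QUOTING-DEMO is a hypothetical name. As above, the scanner is created explicitly inside the binding so that the string is parsed while the switch is on.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun quoting-demo ()
  "Match the literal text a+b; return the match starts for two test strings."
  (let* ((ppcre:*allow-quoting* t)
         ;; Between \Q and \E the + loses its meaning as a quantifier.
         (scanner (ppcre:create-scanner "^\\Qa+b\\E$")))
    (values (ppcre:scan scanner "a+b")     ; literal text, should match at 0
            (ppcre:scan scanner "aab"))))  ; would only match if + were a quantifier
```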
Controls whether scanners will optimistically look ahead for a constant suffix of a regular expression, if there is one.
Whether character classes should be compiled into look-ups into O(1) data structures. This is usually fast but will be costly in terms of scanner creation time and might be costly in terms of size if *REGEX-CHAR-CODE-LIMIT* is high. This value will be used as the :KIND keyword argument to CREATE-OPTIMIZED-TEST-FUNCTION - see there for the possible non-NIL values.
Should be NIL or a designator for a function which accepts strings and returns unary character test functions or NIL. This ‘resolver’ is intended to handle ‘character properties’ like \p{IsAlpha}. If *PROPERTY-RESOLVER* is NIL, then the parser will simply treat \p and \P as #\p and #\P as in older versions of CL-PPCRE.
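A sketch of a custom property resolver, assuming ASDF can locate the cl-ppcre system. VOWEL-RESOLVER, FIRST-VOWEL-RUN and the property name "IsVowel" are made up for illustration and are not part of CL-PPCRE.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun vowel-resolver (name)
  "Hypothetical resolver: knows only the invented property \"IsVowel\"."
  (when (string= name "IsVowel")
    ;; A unary character test; FIND returns the character or NIL.
    (lambda (char) (find char "aeiouAEIOU"))))

(defun first-vowel-run (string)
  "Return the first run of vowels in STRING, or NIL."
  (let ((ppcre:*property-resolver* #'vowel-resolver))
    ;; The resolver is consulted while the scanner is being created.
    (values (ppcre:scan-to-strings (ppcre:create-scanner "\\p{IsVowel}+")
                                   string))))
```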
The upper exclusive bound on the char-codes of characters which can occur in character classes. Change this value BEFORE creating scanners if you don’t need the (full) Unicode support of implementations like AllegroCL, CLISP, LispWorks, or SBCL.
Whether the scanners created by CREATE-SCANNER should use the (fast but large) Boyer-Moore-Horspool matchers.
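The tuning variables above only affect scanners created while they are bound. The following sketch assumes ASDF can locate the cl-ppcre system; MAKE-TUNED-SCANNER is a hypothetical name and the particular values are only an illustration, not a recommendation.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun make-tuned-scanner (regex-string)
  "Create a scanner with illustrative, non-default tuning settings."
  (let ((ppcre:*regex-char-code-limit* 256)     ; only 8-bit characters expected
        (ppcre:*optimize-char-classes* :charmap) ; bit-vector character classes
        (ppcre:*use-bmh-matchers* t))            ; Boyer-Moore-Horspool matchers
    (ppcre:create-scanner regex-string)))
```

A scanner built this way is used like any other, e.g. (ppcre:scan-to-strings (make-tuned-scanner "[a-z]+foo") "xx abcfoo").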
Defines the symbol NAME to be a synonym for the parse tree PARSE-TREE. Both arguments are quoted.
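A small sketch of a parse tree synonym, assuming ASDF can locate the cl-ppcre system. The synonym ASCII-LETTER and the function FIRST-LETTER-RUN are hypothetical names chosen for this example.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

;; Neither argument is evaluated.
(ppcre:define-parse-tree-synonym ascii-letter
  (:char-class (:range #\a #\z) (:range #\A #\Z)))

(defun first-letter-run (string)
  "Return the first run of ASCII letters in STRING, or NIL."
  ;; The synonym can be used wherever a parse tree is expected.
  (values (ppcre:scan-to-strings '(:greedy-repetition 1 nil ascii-letter)
                                 string)))
```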
Iterates over TARGET-STRING and tries to match REGEX as often as possible evaluating BODY with MATCH-START and MATCH-END bound to the start/end positions of each match in turn. After the last match, returns RESULT-FORM if provided or NIL otherwise. An implicit block named NIL surrounds DO-MATCHES; RETURN may be used to terminate the loop immediately. If REGEX matches an empty string the scan is continued one position behind this match. BODY may start with declarations.
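A minimal sketch of DO-MATCHES, assuming ASDF can locate the cl-ppcre system. MATCH-SPANS is a hypothetical helper that collects each match as a (start . end) pair and uses the RESULT-FORM to return the collected list.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun match-spans (regex string)
  "Return a list of (start . end) pairs, one per match of REGEX in STRING."
  (let ((spans '()))
    ;; The fifth element of the binding list is the RESULT-FORM.
    (ppcre:do-matches (start end regex string (nreverse spans))
      (push (cons start end) spans))))

;; (match-spans "\\d+" "a1 bb22 ccc333")  ; → ((1 . 2) (5 . 7) (11 . 14))
```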
Iterates over TARGET-STRING and tries to match REGEX as often as possible evaluating BODY with MATCH-VAR bound to the substring of TARGET-STRING corresponding to each match in turn. After the last match, returns RESULT-FORM if provided or NIL otherwise. An implicit block named NIL surrounds DO-MATCHES-AS-STRINGS; RETURN may be used to terminate the loop immediately. If REGEX matches an empty string the scan is continued one position behind this match. If SHAREDP is true, the substrings may share structure with TARGET-STRING. BODY may start with declarations.
Iterates over TARGET-STRING and tries to match REGEX as often as possible evaluating BODY with the variables in VAR-LIST bound to the corresponding register groups for each match in turn, i.e. each variable is either bound to a string or to NIL. For each element of VAR-LIST which is NIL there’s no binding to the corresponding register group. The number of variables in VAR-LIST must not be greater than the number of register groups. After the last match, returns RESULT-FORM if provided or NIL otherwise. An implicit block named NIL surrounds DO-REGISTER-GROUPS; RETURN may be used to terminate the loop immediately. If REGEX matches an empty string the scan is continued one position behind this match. If SHAREDP is true, the substrings may share structure with TARGET-STRING. BODY may start with declarations.
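A sketch of DO-REGISTER-GROUPS, assuming ASDF can locate the cl-ppcre system. PARSE-PAIRS and the key=value input format are invented for illustration.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun parse-pairs (string)
  "Collect (key . number) pairs from text such as \"width=80, height=24\"."
  (let ((pairs '()))
    ;; KEY and VALUE are bound to the first and second register of each match.
    (ppcre:do-register-groups (key value)
        ("(\\w+)=(\\d+)" string (nreverse pairs))
      (push (cons key (parse-integer value)) pairs))))

;; (parse-pairs "width=80, height=24")  ; → (("width" . 80) ("height" . 24))
```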
Iterates over TARGET-STRING and tries to match REGEX as often as possible evaluating BODY with MATCH-START, MATCH-END, REG-STARTS, and REG-ENDS bound to the four return values of each match in turn. After the last match, returns RESULT-FORM if provided or NIL otherwise. An implicit block named NIL surrounds DO-SCANS; RETURN may be used to terminate the loop immediately. If REGEX matches an empty string the scan is continued one position behind this match. BODY may start with declarations.
Executes BODY with the variables in VAR-LIST bound to the corresponding register groups after TARGET-STRING has been matched against REGEX, i.e. each variable is either bound to a string or to NIL. If there is no match, BODY is _not_ executed. For each element of VAR-LIST which is NIL there’s no binding to the corresponding register group. The number of variables in VAR-LIST must not be greater than the number of register groups. If SHAREDP is true, the substrings may share structure with TARGET-STRING.
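A sketch of REGISTER-GROUPS-BIND, assuming ASDF can locate the cl-ppcre system. PARSE-DATE is a hypothetical helper; because BODY is skipped when there is no match, it simply returns NIL for input without a date.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun parse-date (string)
  "Return (year month day) for the first YYYY-MM-DD in STRING, or NIL."
  (ppcre:register-groups-bind (year month day)
      ("(\\d{4})-(\\d{2})-(\\d{2})" string)
    (list (parse-integer year)
          (parse-integer month)
          (parse-integer day))))

;; (parse-date "released 2024-09-15")  ; → (2024 9 15)
;; (parse-date "no date here")         ; → NIL
```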
Make sure that constant forms are compiled into scanners at compile time. (The same documentation string applies to each of the eight compiler macros: ALL-MATCHES, ALL-MATCHES-AS-STRINGS, COUNT-MATCHES, REGEX-REPLACE, REGEX-REPLACE-ALL, SCAN, SCAN-TO-STRINGS, and SPLIT.)
Returns a list containing the start and end positions of all matches of REGEX against TARGET-STRING, i.e. if there are N matches the list contains (* 2 N) elements. If REGEX matches an empty string the scan is continued one position behind this match.
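Since ALL-MATCHES returns a flat list of positions, a caller will often want to pair them up. The sketch below assumes ASDF can locate the cl-ppcre system; MATCH-INTERVALS is a hypothetical helper.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

;; (ppcre:all-matches "a" "foo bar baz")  ; → (5 6 9 10)

(defun match-intervals (regex string)
  "Turn the flat position list of ALL-MATCHES into (start . end) pairs."
  (loop for (start end) on (ppcre:all-matches regex string) by #'cddr
        collect (cons start end)))
```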
Returns a list containing all substrings of TARGET-STRING which match REGEX. If REGEX matches an empty string the scan is continued one position behind this match. If SHAREDP is true, the substrings may share structure with TARGET-STRING.
Returns a count of all substrings of TARGET-STRING which match REGEX.
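A brief sketch of ALL-MATCHES-AS-STRINGS and COUNT-MATCHES, assuming ASDF can locate the cl-ppcre system. The second call shows how empty matches appear in the result when the regex can match the empty string.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun words (string)
  "Hypothetical helper: all maximal runs of word characters in STRING."
  (ppcre:all-matches-as-strings "\\w+" string))

;; (words "foo bar baz")                                  ; → ("foo" "bar" "baz")
;; (ppcre:all-matches-as-strings "\\w*" "foo bar baz")    ; → ("foo" "" "bar" "" "baz" "")
;; (ppcre:count-matches "a" "foo bar baz")                ; → 2
```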
Given a unary test function which is applicable to characters, returns a function which yields the same boolean results for all characters with character codes from START to (excluding) END. If KIND is NIL, TEST-FUNCTION will simply be returned. Otherwise, KIND should be one of:
* :HASH-TABLE - builds a hash table representing all characters which satisfy the test and returns a closure which checks if a character is in that hash table
* :CHARSET - instead of a hash table uses a "charset" which is a data structure using non-linear hashing and optimized to represent (sparse) sets of characters in a fast and space-efficient way (contributed by Nikodemus Siivola)
* :CHARMAP - instead of a hash table uses a bit vector to represent the set of characters
You can also use :HASH-TABLE* or :CHARSET* which are like :HASH-TABLE and :CHARSET but use the complement of the set if the set contains more than half of all characters between START and END. This saves space but needs an additional pass across all characters to create the data structure. There is no corresponding :CHARMAP* kind as the bit vectors are already created to cover the smallest possible interval which contains either the set or its complement.
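A sketch of CREATE-OPTIMIZED-TEST-FUNCTION with the :CHARMAP kind, assuming ASDF can locate the cl-ppcre system. MAKE-FAST-DIGIT-TEST is a hypothetical name, and the range 0 to 128 is chosen only to keep the example small; the returned function is only meant for characters whose codes lie in that range.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun make-fast-digit-test ()
  "Return a bit-vector-backed test equivalent to DIGIT-CHAR-P for codes below 128."
  (ppcre:create-optimized-test-function #'digit-char-p
                                        :start 0
                                        :end 128
                                        :kind :charmap))

;; (funcall (make-fast-digit-test) #\5)  ; true
;; (funcall (make-fast-digit-test) #\a)  ; false
```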
Translate the regex string STRING into a parse tree.
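A short sketch of PARSE-STRING, assuming ASDF can locate the cl-ppcre system. The resulting parse tree is ordinary list data and can be passed to SCAN in place of the regex string.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

;; (ppcre:parse-string "(ab)*")  ; → (:GREEDY-REPETITION 0 NIL (:REGISTER "ab"))

(defun scan-via-parse-tree (regex-string target)
  "Hypothetical helper: parse REGEX-STRING first, then scan with the parse tree."
  (let ((tree (ppcre:parse-string regex-string)))
    (ppcre:scan tree target)))
```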
Returns the parse tree the symbol SYMBOL is a synonym for. Returns NIL if SYMBOL wasn’t yet defined to be a synonym.
Defines SYMBOL to be a synonym for the parse tree NEW-PARSE-TREE.
Quote, i.e. prefix with #\\, all non-word characters in STRING.
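A sketch of QUOTE-META-CHARS used to search for literal text, assuming ASDF can locate the cl-ppcre system. FIND-LITERAL is a hypothetical helper.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

;; (ppcre:quote-meta-chars "[a-z]*")  ; → "\\[a\\-z\\]\\*" (as printed by the REPL)

(defun find-literal (text target)
  "Return start and end of the first literal occurrence of TEXT in TARGET."
  ;; Quoting keeps characters such as + from acting as regex operators.
  (let ((scanner (ppcre:create-scanner (ppcre:quote-meta-chars text))))
    (ppcre:scan scanner target)))
```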
Similar to the standard function APROPOS, i.e. it prints information about (rather than returning a list of) all symbols which match the regular expression REGEX; see REGEX-APROPOS-LIST for the list-returning variant. If CASE-INSENSITIVE is true and REGEX isn’t already a scanner, a case-insensitive scanner is used.
Similar to the standard function APROPOS-LIST but returns a list of all symbols which match the regular expression REGEX. If CASE-INSENSITIVE is true and REGEX isn’t already a scanner, a case-insensitive scanner is used.
Try to match TARGET-STRING between START and END against REGEX and replace the first match with REPLACEMENT. Two values are returned; the modified string, and T if REGEX matched or NIL otherwise.
REPLACEMENT can be a string which may contain the special substrings "\&" for the whole match, "\`" for the part of TARGET-STRING before the match, "\'" for the part of TARGET-STRING after the match, "\N" or "\{N}" for the Nth register where N is a positive integer.
REPLACEMENT can also be a function designator in which case the match will be replaced with the result of calling the function designated by REPLACEMENT with the arguments TARGET-STRING, START, END, MATCH-START, MATCH-END, REG-STARTS, and REG-ENDS. (REG-STARTS and REG-ENDS are arrays holding the start and end positions of matched registers or NIL - the meaning of the other arguments should be obvious.)
Finally, REPLACEMENT can be a list where each element is a string, one of the symbols :MATCH, :BEFORE-MATCH, or :AFTER-MATCH - corresponding to "\&", "\`", and "\'" above -, an integer N - representing register (1+ N) -, or a function designator.
If PRESERVE-CASE is true, the replacement will try to preserve the case (all upper case, all lower case, or capitalized) of the match. The result will always be a fresh string, even if REGEX doesn’t match.
ELEMENT-TYPE is the element type of the resulting string.
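A sketch of the string and list forms of REPLACEMENT for REGEX-REPLACE, assuming ASDF can locate the cl-ppcre system. SWAP-WORDS and ANNOTATE-BAR are hypothetical helpers.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun swap-words (string)
  "Swap the first two words; \\1 and \\2 refer to the first and second register."
  (ppcre:regex-replace "(\\w+) (\\w+)" string "\\2 \\1"))

(defun annotate-bar (string)
  "Replace the first \"bar\" using the list form of REPLACEMENT."
  (ppcre:regex-replace "bar" string
                       '("[" :match " after '" :before-match "']")))

;; (swap-words "hello world")      ; → "world hello", T
;; (annotate-bar "foo bar baz")    ; → "foo [bar after 'foo '] baz", T
;; (ppcre:regex-replace "(?i)fo+" "FOO bar" "frob" :preserve-case t)
;;                                 ; → "FROB bar", T
```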
Try to match TARGET-STRING between START and END against REGEX and replace all matches with REPLACEMENT. Two values are returned; the modified string, and T if REGEX matched or NIL otherwise.
REPLACEMENT can be a string which may contain the special substrings "\&" for the whole match, "\`" for the part of TARGET-STRING before the match, "\'" for the part of TARGET-STRING after the match, "\N" or "\{N}" for the Nth register where N is a positive integer.
REPLACEMENT can also be a function designator in which case the match will be replaced with the result of calling the function designated by REPLACEMENT with the arguments TARGET-STRING, START, END, MATCH-START, MATCH-END, REG-STARTS, and REG-ENDS. (REG-STARTS and REG-ENDS are arrays holding the start and end positions of matched registers or NIL - the meaning of the other arguments should be obvious.)
Finally, REPLACEMENT can be a list where each element is a string, one of the symbols :MATCH, :BEFORE-MATCH, or :AFTER-MATCH - corresponding to "\&", "\`", and "\'" above -, an integer N - representing register (1+ N) -, or a function designator.
If PRESERVE-CASE is true, the replacement will try to preserve the case (all upper case, all lower case, or capitalized) of the match. The result will always be a fresh string, even if REGEX doesn’t match.
ELEMENT-TYPE is the element type of the resulting string.
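A sketch of REGEX-REPLACE-ALL with a function as REPLACEMENT, assuming ASDF can locate the cl-ppcre system. CAPITALIZE-WORDS is a hypothetical helper; its lambda takes the seven arguments listed in the description above and uses only three of them.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun capitalize-words (string)
  "Upcase every lower-case letter that starts a word."
  (ppcre:regex-replace-all
   "\\b[a-z]" string
   (lambda (target start end match-start match-end reg-starts reg-ends)
     (declare (ignore start end reg-starts reg-ends))
     ;; The replacement text is the matched letter, upcased.
     (string-upcase (subseq target match-start match-end)))))

;; (capitalize-words "hello big world")  ; → "Hello Big World", T
;; (ppcre:regex-replace-all "(?i)fo+" "foo Fooo FOOOO bar" "frob" :preserve-case t)
;;                                       ; → "frob Frob FROB bar", T
```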
Like SCAN but returns substrings of TARGET-STRING instead of positions, i.e. this function returns two values on success: the whole match as a string plus an array of substrings (or NILs) corresponding to the matched registers. If SHAREDP is true, the substrings may share structure with TARGET-STRING.
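A sketch of SCAN-TO-STRINGS, assuming ASDF can locate the cl-ppcre system. MATCH-AND-REGISTERS is a hypothetical helper that converts the register array into a list.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

;; (ppcre:scan-to-strings "(([^b])*)b" "aaabd")  ; → "aaab", #("aaa" "a")

(defun match-and-registers (regex string)
  "Return the whole match and a list of register substrings, or NIL."
  (multiple-value-bind (match registers) (ppcre:scan-to-strings regex string)
    (when match
      (values match (coerce registers 'list)))))
```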
Matches REGEX against TARGET-STRING as often as possible and returns a list of the substrings between the matches. If WITH-REGISTERS-P is true, substrings corresponding to matched registers are inserted into the list as well. If OMIT-UNMATCHED-P is true, unmatched registers will simply be left out, otherwise they will show up as NIL. LIMIT limits the number of elements returned - registers aren’t counted. If LIMIT is NIL (or 0 which is equivalent), trailing empty strings are removed from the result list. If REGEX matches an empty string the scan is continued one position behind this match. If SHAREDP is true, the substrings may share structure with TARGET-STRING.
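A few calls illustrating the SPLIT options described above, assuming ASDF can locate the cl-ppcre system. SPLIT-FIELDS is a hypothetical wrapper; the commented results show the effect of WITH-REGISTERS-P, OMIT-UNMATCHED-P and LIMIT.

```lisp
;; Assumption: ASDF is installed and can locate the cl-ppcre system.
(require "asdf")
(asdf:load-system "cl-ppcre")

(defun split-fields (string)
  "Split STRING at runs of whitespace."
  (ppcre:split "\\s+" string))

;; (split-fields "foo   bar baz frob")   ; → ("foo" "bar" "baz" "frob")
;; (ppcre:split "(,)|(;)" "foo,bar;baz" :with-registers-p t)
;;   ; → ("foo" "," NIL "bar" NIL ";" "baz")
;; (ppcre:split "(,)|(;)" "foo,bar;baz" :with-registers-p t :omit-unmatched-p t)
;;   ; → ("foo" "," "bar" ";" "baz")
;; (ppcre:split ":" "a:b:c:d:e:f:g::")            ; → ("a" "b" "c" "d" "e" "f" "g")
;; (ppcre:split ":" "a:b:c:d:e:f:g::" :limit 2)   ; → ("a" "b:c:d:e:f:g::")
```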
Accepts a regular expression - either as a
parse-tree or as a string - and returns a scan closure which will scan
strings for this regular expression and a list mapping registers to
their names (NIL stands for unnamed ones). The "mode" keyword
arguments are equivalent to the imsx modifiers in Perl. If
DESTRUCTIVE is not NIL, the function is allowed to destructively
modify its first argument (but only if it’s a parse tree).
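A sketch of how CREATE-SCANNER might be used to build a scanner once and reuse it, assuming CL-PPCRE is installed where ASDF can find it. The variable names *WORD-SCANNER* and *AB-SCANNER* are hypothetical and exist only for this example.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

;; From a string, with the equivalent of Perl's "i" modifier.
(defparameter *word-scanner*
  (cl-ppcre:create-scanner "[a-z]+" :case-insensitive-mode t))

;; From a parse tree: an "a" followed by one or more "b"s.
(defparameter *ab-scanner*
  (cl-ppcre:create-scanner '(:sequence #\a (:greedy-repetition 1 nil #\b))))

;; The resulting scanners can be handed to SCAN in place of a regex.
(cl-ppcre:scan *word-scanner* "123 ABC") ; match starts at 4 and ends at 7
(cl-ppcre:scan *ab-scanner* "xabbb")     ; match starts at 1 and ends at 5
```

Creating the scanner up front avoids re-parsing the regex whenever the same pattern is applied to many strings.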
function
) &key case-insensitive-mode multi-line-mode single-line-mode extended-mode destructive) ¶string
) &key case-insensitive-mode multi-line-mode single-line-mode extended-mode destructive) ¶Returns the position within the string where the error occurred (or NIL if the error happened while trying to convert a parse tree).
ppcre-syntax-error
)) ¶pos
.
Returns the string the parser was parsing when the error was encountered (or NIL if the error happened while trying to convert a parse tree).
ppcre-syntax-error
)) ¶Searches TARGET-STRING from START to END and tries
to match REGEX. On success returns four values - the start of the
match, the end of the match, and two arrays denoting the beginnings
and ends of register matches. On failure returns NIL. REGEX can be a
string which will be parsed according to Perl syntax, a parse tree, or
a pre-compiled scanner created by CREATE-SCANNER. TARGET-STRING will
be coerced to a simple string if it isn’t one already. The
REAL-START-POS parameter should be ignored - it exists only for
internal purposes.
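The four return values of SCAN can be inspected as in the following sketch, which assumes CL-PPCRE is installed where ASDF can find it.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

;; Success: match start, match end, register starts, register ends.
;; The single register holds the last "a" that was matched.
(cl-ppcre:scan "(a)*b" "xaaabd")
;; => 1, 5, #(3), #(4)

;; START and END restrict the part of TARGET-STRING that is searched.
(cl-ppcre:scan "(a)*b" "xaaabd" :start 1)
;; => 1, 5, #(3), #(4)

;; Failure: a single NIL.
(cl-ppcre:scan "(a)*b" "xyz")
;; => NIL
```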
str
) &rest init-args) ¶Automatically computes the length of a STR after initialization.
repetition
) stream) ¶All errors signaled by CL-PPCRE are of this type.
simple-error
.
Signaled when CL-PPCRE functions are invoked with wrong arguments.
Signaled if CL-PPCRE’s parser encounters an error
when trying to parse a regex string or to convert a parse tree into
its internal representation.
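The syntax error condition and its two readers can be combined as sketched below. This assumes CL-PPCRE is installed where ASDF can find it; TRY-PARSE is a hypothetical helper and not part of the library.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

(defun try-parse (regex-string)
  "Hypothetical helper: returns a scanner for REGEX-STRING, or NIL plus
the offending string and the error position if the regex is malformed."
  (handler-case (cl-ppcre:create-scanner regex-string)
    (cl-ppcre:ppcre-syntax-error (condition)
      (values nil
              (cl-ppcre:ppcre-syntax-error-string condition)
              (cl-ppcre:ppcre-syntax-error-pos condition)))))

(try-parse "abc")   ; a scanner (a function)
(try-parse "(abc")  ; NIL as primary value, because the paren is unbalanced
```

Handling CL-PPCRE:PPCRE-ERROR instead would also catch the invocation errors mentioned above.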
Maximum number of collisions (for any element) we accept before we allocate more storage. This is now fixed, but could be made to vary depending on the size of the storage vector (e.g. in the range of 1-4). Larger probe-depths mean more collisions are tolerated before the table grows, but increase the constant factor.
A string of all characters which are considered to be whitespace. Same as Perl’s [\s].
Where to stop scanning within *STRING*.
Start of the next possible end-string candidate.
Whether the parser will start in extended mode.
An array to keep track of the last positions
where we saw repetitive patterns.
Only used for patterns which might have zero length.
The real start of *STRING*. This is for repeated scans and is only used internally.
An array which holds the end positions of the current register candidates.
An array which holds the start positions of the current register candidates.
An array which holds the next start positions of the current register candidates.
Counts the number of "complicated" repetitions while the matchers are built.
An array to keep track of how often repetitive patterns have been tested already.
Special optimize settings used only by a few declaration expressions.
The standard optimize settings used by most declaration expressions.
Where to start scanning within *STRING*.
The string which is currently scanned by SCAN. Will always be coerced to a SIMPLE-STRING.
The string which caused the syntax error.
Counts the number of repetitions the inner regexes of which may have zero-length while the matchers are built.
Auxiliary macro used by CREATE-BMH-MATCHER.
Accessor macro to extract the first flag out of a three-element flag list.
Auxiliary macro used by CREATE-CHAR-SEARCHER.
This is the template for simple constant repetitions (where simple means that the inner regex to be checked is of fixed length LEN, and that it doesn’t contain registers, i.e. there’s no need for backtracking) and where constant means that MINIMUM is equal to MAXIMUM. CHECK-CURR-POS is a form which checks whether the inner regex of the repetition matches at CURR-POS.
Make sure VALUE is evaluated only once (to appease SBCL).
This is the template for simple greedy repetitions (where simple means that the minimum number of repetitions is zero, that the inner regex to be checked is of fixed length LEN, and that it doesn’t contain registers, i.e. there’s no need for backtracking). CHECK-CURR-POS is a form which checks whether the inner regex of the repetition matches at CURR-POS.
Utility macro inspired by C’s "place++", i.e. first return the value of PLACE and afterwards increment it by DELTA.
Creates the actual closure returned by CREATE-SCANNER-AUX by replacing ’(ADVANCE-FN-DEFINITION) with a suitable definition for ADVANCE-FN. This is a utility macro used by CREATE-SCANNER-AUX.
Utility macro to replace each occurrence of ’(CHAR-CLASS-TEST) within BODY with the correct test (corresponding to CHAR-CLASS) against CHR-EXPR.
Coerces STRING to a simple STRING unless it already is one.
Accessor macro to extract the second flag out of a three-element flag list.
This is the template for simple non-greedy repetitions (where simple means that the minimum number of repetitions is zero, that the inner regex to be checked is of fixed length LEN, and that it doesn’t contain registers, i.e. there’s no need for backtracking). CHECK-CURR-POS is a form which checks whether the inner regex of the repetition matches at CURR-POS.
Auxiliary macro used by REGEX-APROPOS and REGEX-APROPOS-LIST. Loops through PACKAGES and executes BODY with SYMBOL bound to each symbol which matches REGEX. Optionally evaluates and returns RETURN-FORM at the end. If CASE-INSENSITIVE is true and REGEX isn’t already a scanner, a case-insensitive scanner is used.
Accessor macro to extract the third flag out of a three-element flag list.
WITH-REBINDING ( { var | (var prefix) }* ) form*
Evaluates a series of forms in the lexical environment that is formed by adding the binding of each VAR to a fresh, uninterned symbol, and the binding of that fresh, uninterned symbol to VAR’s original value, i.e., its value in the current lexical environment.
The uninterned symbol is created as if by a call to GENSYM with the
string denoted by PREFIX - or, if PREFIX is not supplied, the string
denoted by VAR - as argument.
The forms are evaluated in order, and the values of all but the last are discarded (that is, the body is an implicit PROGN).
Syntax: WITH-UNIQUE-NAMES ( { var | (var x) }* ) declaration* form*
Executes a series of forms with each VAR bound to a fresh,
uninterned symbol. The uninterned symbol is as if returned by a call
to GENSYM with the string denoted by X - or, if X is not supplied, the
string denoted by VAR - as argument.
The variable bindings created are lexical unless special declarations
are specified. The scopes of the name bindings and declarations do not
include the Xs.
The forms are evaluated in order, and the values of all but the last are discarded (that is, the body is an implicit PROGN).
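The idea behind WITH-UNIQUE-NAMES can be illustrated with a simplified re-implementation. The sketch below is not the library's definition: it ignores declarations, and the names SKETCH-WITH-UNIQUE-NAMES and SWAP-PLACES are hypothetical.

```lisp
;; Simplified sketch: binds each VAR to a fresh uninterned symbol whose
;; name is derived from X, or from VAR itself if X is not supplied.
(defmacro sketch-with-unique-names ((&rest bindings) &body body)
  `(let ,(mapcar (lambda (binding)
                   (destructuring-bind (var &optional (prefix var))
                       (if (consp binding) binding (list binding))
                     `(,var (gensym ,(string prefix)))))
                 bindings)
     ,@body))

;; Usage: the temporary introduced by the expansion cannot capture a
;; user variable, even one that happens to be called TMP.
(defmacro swap-places (a b)
  (sketch-with-unique-names (tmp)
    `(let ((,tmp ,a))
       (setf ,a ,b
             ,b ,tmp))))

(let ((tmp 1) (other 2))
  (swap-places tmp other)
  (list tmp other))
;; => (2 1)
```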
Tries to add the character CHAR to the charset SET without extending it. Returns NIL if this fails. Counts CHAR as new if COUNT is true and it is added to SET.
Extends the charset SET and then adds the character CHAR to it.
Like STRING-EQUAL, i.e. compares the special string *STRING* from START1 to END1 with STRING2 from START2 to END2. Note that there’s no boundary check - this has to be implemented by the caller.
Like STRING=, i.e. compares the special string *STRING* from START1 to END1 with STRING2 from START2 to END2. Note that there’s no boundary check - this has to be implemented by the caller.
Adds the character CHAR to the charset SET, extending SET if necessary. Returns CHAR.
Accepts a replacement template and the current values from the matching process in REGEX-REPLACE or REGEX-REPLACE-ALL and returns the corresponding string.
Returns a list of all characters belonging to a character map. Only works for non-complement charmaps.
end
.
Clean (?#...) comments within STRING for quoting, i.e. convert \Q to Q and \E to E. If EXTENDED-MODE is true, also clean end-of-line comments, i.e. those starting with #\# and ending with #\Newline.
Reads and consumes characters from regex string until a right bracket is seen. Assembles them into a list (which is returned) of characters, character ranges, like (:RANGE #\A #\E) for a-e, and tokens representing special character classes.
Like COMPLEMENT but optimized for unary functions.
Computes and returns the index into the vector VECTOR corresponding to the hash code HASH.
Converts the parse tree PARSE-TREE into an equivalent REGEX object and returns three values: the REGEX object, the number of registers seen and an object the regex starts with which is either a STR object or an EVERYTHING object (if the regex starts with something like ".*") or NIL.
Converts the parse tree PARSE-TREE into a REGEX object and returns
it. Will also
- split and optimize repetitions,
- accumulate strings or EVERYTHING objects into the special variable
STARTS-WITH,
- keep track of all registers seen in the special variable REG-NUM,
- keep track of all named registers seen in the special variable REG-NAMES
- keep track of the highest backreference seen in the special
variable MAX-BACK-REF,
- maintain and adhere to the currently applicable modifiers in the special
variable FLAGS, and
- maybe even wash your car...
Converts each item in LIST into a test function and returns a logical-OR combination of these functions. Items can be single characters, character ranges like (:RANGE #\A #\E), or special character classes like :DIGIT-CLASS. Does the right thing with respect to case-(in)sensitivity as specified by the special variable FLAGS.
Returns a Boyer-Moore-Horspool matcher which searches the (special) simple-string *STRING* for the first occurrence of the substring PATTERN. The search starts at the position START-POS within *STRING* and stops before *END-POS* is reached. Depending on the second argument the search is case-insensitive or not. If the special variable *USE-BMH-MATCHERS* is NIL, use the standard SEARCH function instead. (BMH matchers are faster but need much more space.)
Returns a function which searches the (special) simple-string *STRING* for the first occurrence of the character CHR. The search starts at the position START-POS within *STRING* and stops before *END-POS* is reached. Depending on the second argument the search is case-insensitive or not.
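For readers unfamiliar with the technique, here is a minimal, case-sensitive sketch of a Boyer-Moore-Horspool search in portable Common Lisp. It is not CL-PPCRE's implementation: BMH-SEARCH is a hypothetical function which takes the text as an argument instead of using *STRING*, and it keeps its skip table in a hash table for brevity.

```lisp
(defun bmh-search (pattern text &key (start 0) (end (length text)))
  "Hypothetical helper: returns the index of the first occurrence of
PATTERN in TEXT between START and END, or NIL if there is none."
  (let ((m (length pattern))
        (skip (make-hash-table)))
    (when (zerop m)
      (return-from bmh-search start))
    ;; Skip table: for every pattern character except the last one,
    ;; the distance from its last occurrence to the end of the pattern.
    (loop for i from 0 below (1- m)
          do (setf (gethash (char pattern i) skip) (- m 1 i)))
    (loop with pos = start
          while (<= (+ pos m) end)
          do (if (string= pattern text :start2 pos :end2 (+ pos m))
                 (return pos)
                 ;; Shift by the skip value of the text character aligned
                 ;; with the last pattern position (M if it is unknown).
                 (incf pos (gethash (char text (+ pos m -1)) skip m))))))

;; Should agree with the standard SEARCH function.
(bmh-search "needle" "haystack with a needle inside")
```

The skip table is what makes the search fast and also what costs space, which matches the trade-off mentioned in the description above.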
Creates and returns a charmap representing all characters with character codes between START and END which satisfy TEST-FUNCTION. Tries to find the smallest interval which is necessary to represent the character set and uses the complement representation if that helps.
Creates and returns a charset representing all characters with character codes between START and END which satisfy TEST-FUNCTION.
Creates a closure which just matches as far ahead as possible, i.e. a closure for a dot in single-line mode.
Creates and returns a hash table representing all characters with character codes between START and END which satisfy TEST-FUNCTION.
Auxiliary function to create and return a scanner (which is actually a closure). Used by CREATE-SCANNER.
Tests whether a character is a decimal digit, i.e. the same as Perl’s [\d]. Note that this function shadows the standard Common Lisp function CL:DIGIT-CHAR-P.
Tests whether we’re at the end of the regex string.
Returns the constant string (if it exists) REGEX ends with wrapped into a STR object, otherwise NIL.
Moves (LEXER-POS LEXER) back to the last position stored in (LEXER-LAST-POS LEXER) and pops the LAST-POS stack.
Read and consume the number the lexer is currently looking at and
return it. Returns NIL if no number could be identified.
RADIX is used as in PARSE-INTEGER. If MAX-LENGTH is not NIL we’ll read
at most the next MAX-LENGTH characters. If NO-WHITESPACE-P is not NIL
we don’t tolerate whitespace in front of the number.
Returns a list of two values (min max) if what the lexer is looking at can be interpreted as a quantifier. Otherwise returns NIL and resets the lexer to its old position.
Returns and consumes the next token from the regex string (or NIL).
Parses and consumes a <greedy-quant>.
The productions are: <greedy-quant> -> <group> | <group><quantifier>
where <quantifier> is parsed by the lexer function GET-QUANTIFIER.
Will return <parse-tree> or (:GREEDY-REPETITION <min> <max> <parse-tree>).
Parses and consumes a <group>.
The productions are: <group> -> "("<regex>")"
"(?:"<regex>")"
"(?>"<regex>")"
"(?<flags>:"<regex>")"
"(?="<regex>")"
"(?!"<regex>")"
"(?<="<regex>")"
"(?<!"<regex>")"
"(?("<num>")"<regex>")"
"(?("<regex>")"<regex>")"
"(?<name>"<regex>")" (when *ALLOW-NAMED-REGISTERS* is T)
<legal-token>
where <flags> is parsed by the lexer function MAYBE-PARSE-FLAGS.
Will return <parse-tree> or (<grouping-type> <parse-tree>) where
<grouping-type> is one of six keywords - see source for details.
Tests whether the character CHAR belongs to the set represented by CHARMAP.
Checks whether the character CHAR is in the charset SET.
pos
.
reg
.
Tests whether the next character the lexer would see is CHR. Does not respect extended mode.
Create character from char-code NUMBER. NUMBER can be NIL which is interpreted as 0. ERROR-POS is the position where the corresponding number started within the regex string.
Returns a vector of size SIZE to hold characters. All elements are initialized to #\Null except for the first one which is initialized to #\?.
Creates and returns a charmap representing all characters with character codes in the interval [start end) that satisfy TEST-FUNCTION. The COMPLEMENTP slot of the charmap is set to the value of the optional argument, but this argument doesn’t have an effect on how TEST-FUNCTION is used.
Maps escaped characters like "\d" to the tokens which represent their associated character classes.
Calls FUNCTION with all characters in SET. Returns NIL.
Accumulate STR into the special variable STARTS-WITH if ACCUMULATE-START-P (also special) is true and STARTS-WITH is either NIL or a STR object of the same case mode. Always returns NIL.
Reads a sequence of modifiers (including #\- to reverse their meaning) and returns a corresponding list of "flag" tokens. The "x" modifier is treated specially in that it dynamically modifies the behaviour of the lexer itself via the special variable *EXTENDED-MODE-P*.
Splits a REPETITION object into a constant and a varying part if
applicable, i.e. something like
a{3,} -> a{3}a*
The arguments to this function correspond to the REPETITION slots of
the same name.
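The transformation a{3,} -> a{3}a* can be sketched on parse trees. The function below is a hypothetical illustration only: the library works on REPETITION objects and also has to take care of registers inside the repeated regex, which this sketch ignores.

```lisp
(defun split-repetition-sketch (minimum maximum inner)
  "Hypothetical helper: returns a parse tree for INNER repeated MINIMUM
to MAXIMUM times (MAXIMUM may be NIL for unbounded), split into a
constant and a varying part where that is applicable."
  (cond ((and maximum (= minimum maximum))
         ;; Already a constant repetition, nothing to split.
         `(:greedy-repetition ,minimum ,maximum ,inner))
        ((zerop minimum)
         ;; No constant part to split off.
         `(:greedy-repetition 0 ,maximum ,inner))
        (t
         `(:sequence
           (:greedy-repetition ,minimum ,minimum ,inner)
           (:greedy-repetition 0 ,(and maximum (- maximum minimum)) ,inner)))))

(split-repetition-sketch 3 nil #\a)
;; => (:SEQUENCE (:GREEDY-REPETITION 3 3 #\a) (:GREEDY-REPETITION 0 NIL #\a))
```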
Given a character code CODE and a hash code HASH, computes and returns the "next" hash code. See comments below.
Finds the next occurrence of a character in *STRING* which is behind a #\Newline.
Returns the next character which is to be examined and updates the POS slot. Respects extended mode, i.e. whitespace, comments, and also nested comments are skipped if applicable.
Returns the next character which is to be examined and updates the POS slot. Does not respect extended mode.
Utility function for REGISTER-GROUPS-BIND and DO-REGISTER-GROUPS. Creates the long form (a list of (FUNCTION VAR) entries) out of the short form of VAR-LIST.
Returns a subsequence by pointing to a location in the original sequence.
Reads and returns the name in a named register group. It is assumed that the starting #\< character has already been read. The closing #\> will also be consumed.
Auxiliary function used by REGEX-APROPOS. Tries to print some meaningful information about a symbol.
Parses and consumes a <quant>.
The productions are: <quant> -> <greedy-quant> | <greedy-quant>"?".
Will return the <parse-tree> returned by GREEDY-QUANT and optionally
change :GREEDY-REPETITION to :NON-GREEDY-REPETITION.
Replace sections inside of STRING which are enclosed by \Q and \E with the quoted equivalent of these sections (see QUOTE-META-CHARS). Repeat this as long as there are such sections. These sections may nest.
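The quoting that is applied to such sections can be observed directly through the exported function QUOTE-META-CHARS. The sketch below assumes CL-PPCRE is installed where ASDF can find it; the inputs are illustrative.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

;; Every character with a special meaning in a regex gets a backslash.
(cl-ppcre:quote-meta-chars "[a-z]*")
;; => "\\[a\\-z\\]\\*"

;; Unquoted, "1+1" means one or more "1"s followed by a "1" ...
(cl-ppcre:scan "1+1" "1+1=2")
;; => NIL

;; ... while the quoted version matches the literal text.
(cl-ppcre:scan (cl-ppcre:quote-meta-chars "1+1") "1+1=2")
;; match starts at 0 and ends at 3
```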
Parses and consumes a <regex>, a complete regular expression.
The productions are: <regex> -> <seq> | <seq>"|"<regex>.
Will return <parse-tree> or (:ALTERNATION <parse-tree> <parse-tree>).
Auxiliary function used by REGEX-REPLACE and REGEX-REPLACE-ALL. POS-LIST contains a list with the start and end positions of all matches while REG-LIST contains a list of arrays representing the corresponding register start and end positions.
Parses and consumes a <seq>.
The productions are: <seq> -> <quant> | <quant><seq>.
Will return <parse-tree> or (:SEQUENCE <parse-tree> <parse-tree>).
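The parse trees produced by these productions can be inspected with the exported function PARSE-STRING. The following sketch assumes CL-PPCRE is installed where ASDF can find it.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

;; A quantified register group.
(cl-ppcre:parse-string "(ab)*")
;; => (:GREEDY-REPETITION 0 NIL (:REGISTER "ab"))

;; Nested groups inside a sequence.
(cl-ppcre:parse-string "(a(b))")
;; => (:REGISTER (:SEQUENCE #\a (:REGISTER #\b)))

;; An alternation inside a register.
(cl-ppcre:parse-string "(a|b)")
;; => (:REGISTER (:ALTERNATION #\a #\b))

;; A non-capturing group with a bounded quantifier.
(cl-ppcre:parse-string "(?:abc){3,5}")
;; => (:GREEDY-REPETITION 3 5 (:GROUP "abc"))
```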
Reads a flag token and sets or unsets the corresponding entry in the special FLAGS list.
Tests whether the next token can start a valid sub-expression, i.e. a stand-alone regex.
Checks whether all words in STR between FROM and TO are upcased, downcased or capitalized and returns a function which applies a corresponding case modification to strings. Returns #’IDENTITY otherwise, especially if words in the target area extend beyond FROM or TO. STR is supposed to be bounded by START and END. It is assumed that (<= START FROM TO END).
Concatenates a list of strings to one simple-string.
Like GET-NUMBER but won’t consume anything if no number is seen.
Convert the character(s) following a backslash into a token which is returned. This function is to be called when the backslash has already been consumed. Special character classes like \W are handled elsewhere.
Moves the lexer back to the last position stored in the LAST-POS stack.
Tests whether a character is whitespace, i.e. whether it would match [\s] in Perl.
Check whether START-POS is a word-boundary within *STRING*.
Tests whether a character is a "word" character. In the ASCII charset this is equivalent to a-z, A-Z, 0-9, or _, i.e. the same as Perl’s [\w].
Converts a replacement string for REGEX-REPLACE or REGEX-REPLACE-ALL into a replacement template which is an S-expression.
back-reference
)) ¶Whether we check case-insensitively.
Utility function used by the optimizer (see GATHER-STRINGS). Returns a keyword denoting the case-(in)sensitivity of a STR or its second argument if the STR has length 0. Returns NIL for REGEX objects which are not of type STR.
alternation
)) ¶alternation
)) ¶A list of REGEX objects
Returns the minimal length of REGEX plus
CURRENT-MIN-REST. This is similar to REGEX-MIN-LENGTH except that it
recurses down into REGEX and sets the MIN-REST slots of REPETITION
objects.
lookbehind
) current-min-rest) ¶standalone
) current-min-rest) ¶repetition
) current-min-rest) ¶alternation
) current-min-rest) ¶Returns the offset the following regex would have
relative to START-POS or NIL if we can’t compute it. Sets the OFFSET
slot of REGEX to START-POS if REGEX is a STR. May also affect OFFSET
slots of STR objects further down the tree.
back-reference
) start-pos) ¶everything
) start-pos) ¶char-class
) start-pos) ¶standalone
) start-pos) ¶repetition
) start-pos) ¶alternation
) start-pos) ¶repetition
)) ¶Whether the regex contains a register.
Helper function for CONVERT-AUX which converts
parse trees which are conses and dispatches on TOKEN which is the
first element of the parse tree.
(eql :flags)
) parse-tree &key) ¶The case for (:FLAGS {<flag>}*) where flag is a modifier symbol like :CASE-INSENSITIVE-P.
(eql :inverted-property)
) parse-tree &key) ¶The case for (:INVERTED-PROPERTY <name>) where <name> is a string.
(eql :property)
) parse-tree &key) ¶The case for (:PROPERTY <name>) where <name> is a string.
(eql :inverted-char-class)
) parse-tree &key) ¶The case for (:INVERTED-CHAR-CLASS {<item>}*).
(eql :char-class)
) parse-tree &key invertedp) ¶The case for (:CHAR-CLASS {<item>}*) where item is one of
- a character,
- a character range: (:RANGE <char1> <char2>), or
- a special char class symbol like :DIGIT-CLASS.
Also used for inverted char classes when INVERTEDP is true.
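A parse tree using the three kinds of items might look like the one in the following sketch, which assumes CL-PPCRE is installed where ASDF can find it. The target string is made up for the example.

```lisp
(require :asdf)
(asdf:load-system :cl-ppcre)

;; One or more characters that are either in the range a-f or a digit,
;; roughly what "[a-f\\d]+" would mean as a regex string.
(cl-ppcre:scan '(:greedy-repetition 1 nil
                 (:char-class (:range #\a #\f) :digit-class))
               "xyz beef42!")
;; match starts at 4 and ends at 10, i.e. it covers "beef42"

;; A single character can be listed directly.
(cl-ppcre:scan '(:char-class #\!) "xyz beef42!")
;; match starts at 10 and ends at 11
```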
(eql :regex)
) parse-tree &key) ¶The case for (:REGEX <string>).
(eql :back-reference)
) parse-tree &key) ¶The case for (:BACK-REFERENCE <number>|<name>).
(eql :standalone)
) parse-tree &key) ¶The case for (:STANDALONE <regex>).
(eql :filter)
) parse-tree &key) ¶The case for (:FILTER <function> &optional <length>).
(eql :named-register)
) parse-tree &key) ¶The case for (:NAMED-REGISTER <regex>).
(eql :register)
) parse-tree &key name) ¶The case for (:REGISTER <regex>). Also used for named registers when NAME is not NIL.
(eql :non-greedy-repetition)
) parse-tree &key) ¶The case for (:NON-GREEDY-REPETITION <min> <max> <regex>).
(eql :greedy-repetition)
) parse-tree &key greedyp) ¶The case for (:GREEDY-REPETITION|:NON-GREEDY-REPETITION <min> <max> <regex>).
This function is also used for the non-greedy case in which case it is called with GREEDYP set to NIL as you would expect.
(eql :negative-lookbehind)
) parse-tree &key) ¶The case for (:NEGATIVE-LOOKBEHIND <regex>).
(eql :positive-lookbehind)
) parse-tree &key) ¶The case for (:POSITIVE-LOOKBEHIND <regex>).
(eql :negative-lookahead)
) parse-tree &key) ¶The case for (:NEGATIVE-LOOKAHEAD <regex>).
(eql :positive-lookahead)
) parse-tree &key) ¶The case for (:POSITIVE-LOOKAHEAD <regex>).
(eql :branch)
) parse-tree &key) ¶The case for (:BRANCH <test> <regex>).
Here, <test> must be look-ahead, look-behind or number; if <regex> is an alternation it must have one or two choices.
(eql :alternation)
) parse-tree &key) ¶The case for (:ALTERNATION {<regex>}*).
(eql :group)
) parse-tree &key) ¶The case for parse trees like (:GROUP {<regex>}*).
This is a syntactical construct equivalent to :SEQUENCE intended to keep the effect of modifiers local.
(eql :sequence)
) parse-tree &key) ¶The case for parse trees like (:SEQUENCE {<regex>}*).
Helper function for CONVERT-AUX which converts parse trees which are atoms.
The default method - check if there’s a translation.
character
)) ¶string
)) ¶(eql :void)
)) ¶(eql :word-boundary)
)) ¶(eql :non-word-boundary)
)) ¶(eql :everything)
)) ¶(eql :digit-class)
)) ¶(eql :word-char-class)
)) ¶(eql :whitespace-char-class)
)) ¶(eql :non-digit-class)
)) ¶(eql :non-word-char-class)
)) ¶(eql :non-whitespace-char-class)
)) ¶(eql :start-anchor)
)) ¶(eql :end-anchor)
)) ¶(eql :modeless-start-anchor)
)) ¶(eql :modeless-end-anchor)
)) ¶(eql :modeless-end-anchor-no-newline)
)) ¶(eql :case-insensitive-p)
)) ¶(eql :case-sensitive-p)
)) ¶(eql :multi-line-mode-p)
)) ¶(eql :not-multi-line-mode-p)
)) ¶(eql :single-line-mode-p)
)) ¶(eql :not-single-line-mode-p)
)) ¶Implements a deep copy of a REGEX object.
char-class
)) ¶back-reference
)) ¶standalone
)) ¶repetition
)) ¶lookbehind
)) ¶alternation
)) ¶word-boundary
)) ¶everything
)) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION has a constant number of repetitions.
It is furthermore assumed that the inner regex of REPETITION is of
fixed length and doesn’t contain registers.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION has a constant number of repetitions.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is greedy and the minimal number of
repetitions is zero. It is furthermore assumed that the inner regex
of REPETITION is of fixed length and doesn’t contain registers.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is greedy and the minimal number of
repetitions is zero.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is greedy and the minimal number of
repetitions is zero. It is furthermore assumed that the inner regex
of REPETITION can never match a zero-length string (or instead the
maximal number of repetitions is 1).
repetition
) next-fn) ¶Creates a closure which takes one parameter,
START-POS, and tests whether REGEX can match *STRING* at START-POS
such that the call to NEXT-FN after the match would succeed.
repetition
) next-fn) ¶standalone
) next-fn) ¶back-reference
) next-fn) ¶everything
) next-fn) ¶word-boundary
) next-fn) ¶char-class
) next-fn) ¶lookbehind
) next-fn) ¶alternation
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is non-greedy and the minimal number of
repetitions is zero. It is furthermore assumed that the inner regex
of REPETITION is of fixed length and doesn’t contain registers.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is non-greedy and the minimal number of
repetitions is zero.
repetition
) next-fn) ¶Creates a closure which tries to match REPETITION.
It is assumed that REPETITION is non-greedy and the minimal number of
repetitions is zero. It is furthermore assumed that the inner regex
of REPETITION can never match a zero-length string (or instead the
maximal number of repetitions is 1).
repetition
) next-fn) ¶Returns the constant string (if it exists) REGEX
ends with wrapped into a STR object, otherwise NIL.
OLD-CASE-INSENSITIVE-P is the CASE-INSENSITIVE-P slot of the last STR
collected or :VOID if no STR has been collected yet. (This is a helper
function called by END-STRING.)
standalone
) &optional old-case-insensitive-p) ¶Returns an EVERYTHING object if REGEX is equivalent
to this object, otherwise NIL. So, "(.){1}" would return true
(i.e. the object corresponding to "."), for example.
everything
)) ¶standalone
)) ¶repetition
)) ¶alternation
)) ¶Merges adjacent sequences and alternations, i.e. it transforms #<SEQ #<STR "a"> #<SEQ #<STR "b"> #<STR "c">>> to #<SEQ #<STR "a"> #<STR "b"> #<STR "c">>. This is a destructive operation on REGEX.
alternation
)) ¶Collects adjacent strings or characters into one
string provided they have the same case mode. This is a destructive
operation on REGEX.
alternation
)) ¶repetition
)) ¶Whether the repetition is greedy.
filter
)) ¶The fixed length of this filter or NIL.
len
.
repetition
)) ¶The length of the enclosed regex. NIL if unknown.
len
.
lookbehind
)) ¶The (fixed) length of the enclosed regex.
len
.
repetition
)) ¶repetition
)) ¶The maximal number of repetitions. Can be NIL for unbounded.
repetition
)) ¶The minimal length of the enclosed regex.
repetition
)) ¶repetition
)) ¶The minimal number of characters which must appear after this repetition.
repetition
)) ¶repetition
)) ¶The minimal number of repetitions.
back-reference
)) ¶back-reference
)) ¶The name of the register this reference refers to or NIL.
name
.
word-boundary
)) ¶Whether we mean the opposite, i.e. no word-boundary.
back-reference
)) ¶back-reference
)) ¶The number of the register this reference refers to.
num
.
lookbehind
)) ¶Whether this assertion is positive.
standalone
)) ¶standalone
)) ¶The inner regex.
register
)) ¶register
)) ¶The inner regex.
repetition
)) ¶repetition
)) ¶The REGEX that’s repeated.
lookbehind
)) ¶lookbehind
)) ¶The REGEX object we’re checking.
Return the length of REGEX if it is fixed, NIL otherwise.
everything
)) ¶char-class
)) ¶back-reference
)) ¶standalone
)) ¶repetition
)) ¶alternation
)) ¶Returns the minimal length of REGEX.
everything
)) ¶char-class
)) ¶standalone
)) ¶repetition
)) ¶alternation
)) ¶Returns a deep copy of a REGEX (see COPY-REGEX) and
optionally removes embedded REGISTER objects if possible and if the
special variable REMOVE-REGISTERS-P is true.
alternation
)) ¶lookbehind
)) ¶standalone
)) ¶repetition
)) ¶Resolves PROPERTY to a unary character test
function. PROPERTY can either be a function designator or it can be a
string which is resolved using *PROPERTY-RESOLVER*.
everything
)) ¶Whether we’re in single-line mode, i.e. whether we also match #\Newline.
Returns T if REGEX starts with a "real" start
anchor, i.e. one that’s not in multi-line mode, NIL otherwise. If
IN-SEQ-P is true the function will return :ZERO-LENGTH if REGEX is a
zero-length assertion.
standalone
) &optional in-seq-p) ¶repetition
) &optional in-seq-p) ¶alternation
) &optional in-seq-p) ¶char-class
)) ¶A unary function (accepting a
character) which stands in for the character class and does the work
of checking whether a character belongs to the class.
structure-object
.
common-lisp
.
simple-bit-vector
#*0
fixnum
0
fixnum
0
common-lisp
.
(or fixnum null)
boolean
structure-object
.
fixnum
cl-ppcre::+probe-depth+
common-lisp
.
fixnum
0
common-lisp
.
(simple-array character (*))
(cl-ppcre::make-char-vector 12)
LEXER structures are used to hold the regex string which is currently lexed and to keep track of the lexer’s state.
ALTERNATION objects represent alternations of regexes. (Like "a|b" is the alternation of "a" or "b".)
A list of REGEX objects
cons
:choices
ANCHOR objects represent anchors like "^" or "$".
Whether this is a "start anchor".
:startp
This slot is read-only.
Whether we’re in multi-line mode,
i.e. whether each #\Newline is surrounded by anchors.
:multi-line-p
This slot is read-only.
Whether we ignore #\Newline at the end.
:no-newline-p
This slot is read-only.
BACK-REFERENCE objects represent backreferences.
The number of the register this reference refers to.
fixnum
:num
num
.
The name of the register this reference refers to or NIL.
:name
name
.
Whether we check case-insensitively.
:case-insensitive-p
This slot is read-only.
BRANCH objects represent Perl’s conditional regular expressions.
The test of this branch, one of LOOKAHEAD, LOOKBEHIND, or a number.
:test
test
.
The regex that’s to be matched if the test succeeds.
:then-regex
The regex that’s to be matched if the test fails.
(make-instance (quote cl-ppcre::void))
:else-regex
CHAR-CLASS objects represent character classes.
A unary function (accepting a
character) which stands in for the character class and does the work
of checking whether a character belongs to the class.
(or function symbol nil)
:test-function
This slot is read-only.
EVERYTHING objects represent regexes matching "everything", i.e. dots.
Whether we’re in single-line mode, i.e. whether we also match #\Newline.
:single-line-p
This slot is read-only.
FILTER objects represent arbitrary functions defined by the user.
The user-defined function.
LOOKAHEAD objects represent look-ahead assertions.
The REGEX object we’re checking.
:regex
LOOKBEHIND objects represent look-behind assertions.
The REGEX object we’re checking.
:regex
Whether this assertion is positive.
:positivep
This slot is read-only.
The REGEX base class. All other classes inherit from this one.
REGISTER objects represent register groups.
The inner regex.
:regex
REPETITION objects represent repetitions of regexes.
compute-min-rest
.
compute-offsets
.
contains-register-p
.
copy-regex
.
create-constant-repetition-constant-length-matcher
.
create-constant-repetition-matcher
.
create-greedy-constant-length-matcher
.
create-greedy-matcher
.
create-greedy-no-zero-matcher
.
create-matcher-aux
.
create-non-greedy-constant-length-matcher
.
create-non-greedy-matcher
.
create-non-greedy-no-zero-matcher
.
everythingp
.
greedyp
.
len
.
(setf maximum)
.
maximum
.
min-len
.
(setf min-rest)
.
min-rest
.
(setf minimum)
.
minimum
.
print-object
.
(setf regex)
.
regex
.
regex-length
.
regex-min-length
.
remove-registers
.
start-anchored-p
.
The REGEX that’s repeated.
:regex
Whether the repetition is greedy.
:greedyp
This slot is read-only.
The minimal number of repetitions.
fixnum
:minimum
The maximal number of repetitions. Can be NIL for unbounded.
:maximum
The minimal length of the enclosed regex.
:min-len
This slot is read-only.
The length of the enclosed regex. NIL if unknown.
:len
len
.
This slot is read-only.
The minimal number of characters which must appear after this repetition.
fixnum
0
Whether the regex contains a register.
:contains-register-p
This slot is read-only.
SEQ objects represent sequences of regexes. (Like "ab" is the sequence of "a" and "b".)
A list of REGEX objects.
cons
:elements
A standalone regular expression.
The inner regex.
:regex
STR objects represent strings.
case-insensitive-p
.
case-mode
.
compute-min-rest
.
compute-offsets
.
copy-regex
.
create-matcher-aux
.
end-string-aux
.
initialize-instance
.
(setf len)
.
len
.
(setf offset)
.
offset
.
print-object
.
regex-length
.
regex-min-length
.
(setf skip)
.
skip
.
(setf start-of-end-string-p)
.
start-of-end-string-p
.
(setf str)
.
str
.
If we match case-insensitively.
:case-insensitive-p
This slot is read-only.
Offset from the left of the whole
parse tree. The first regex has offset 0. NIL if unknown, i.e. behind
a variable-length regex.
If we can avoid testing for this
string because the SCAN function has done this already.
:skip
skip
.
If this is the unique
STR which starts END-STRING (a slot of MATCHER).
VOID objects represent empty regular expressions.
WORD-BOUNDARY objects represent word-boundary assertions.