The re Reference Manual

This is the re Reference Manual, version 1.0, generated automatically by Declt version 2.4 patchlevel 1 "Will Decker" on Mon Apr 08 14:54:14 2019 GMT+0.


1 Introduction

The RE Package

The re package is a small, portable, lightweight, and quick regular expression library for Common Lisp. It is a non-recursive, backtracking VM. The syntax is similar to Lua-style pattern matching, with added support for further regex features (see Additional Features below). It is certainly not the fastest, but it is very easy to understand and extend.

It makes heavy use of the parse library, a monadic parser combinator library, for parsing the regular expressions. If you'd like to understand how regular expressions are parsed and compiled, I recommend reading up on that library as well.

Compiling Patterns

To create a re object, you can either use the compile-re function or the #r dispatch macro.

CL-USER > (compile-re "%d+")
#<RE "%d+">

CL-USER > #r/%d+/
#<RE "%d+">

Both work equally well, but the dispatch macro will compile the pattern at read-time. The re class has a load form and so can be saved to a FASL file.

HINT: when using the read macro, use a backslash to escape the / and other characters that might mess with syntax coloring.

Finally, the with-re macro lets you use either strings or re objects in a body of code. If a string is passed as the pattern, then it will be compiled before the body is evaluated.

CL-USER > (with-re (re "%d+") re)
#<RE "%d+">

NOTE: All pattern matching functions use the with-re macro, and so the pattern argument can be either a string or a pre-compiled re object.
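
The same flexibility is available to user code. The following is a minimal sketch, not part of the package: count-matches is a hypothetical helper, and the two loading forms assume that ASDF can locate the re system. Evaluate the forms one at a time at the listener or with load, so that the re package exists before the later forms are read.

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; COUNT-MATCHES is a hypothetical helper, not part of the re package.
;; WITH-RE compiles PATTERN only when it is a string; an re object is
;; passed through unchanged.
(defun count-matches (pattern string)
  (re:with-re (compiled pattern)
    (length (re:find-re compiled string :all t))))

;; A string pattern and a pre-compiled pattern are both accepted.
(count-matches "%d+" "10 apples, 20 pears")
(count-matches (re:compile-re "%d+") "10 apples, 20 pears")
```

Both calls should yield the same count, since with-re only compiles the pattern when it is given a string.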

Basic Pattern Matching

The heart of all pattern matching is the match-re function.

(match-re pattern string &key start end exact)

It will match string against pattern and return a re-match object on success or nil on failure. The start and end arguments limit the scope of the match and default to the entire string. If exact is t then the pattern has to consume the entire string (from start to end).

CL-USER > (match-re "%d+" "abc 123")
NIL

CL-USER > (match-re "%a+" "abc 123")
#<RE-MATCH "abc">
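
The keyword arguments described above can be tried in the same way. This is a small sketch based only on the description of start and exact given here; it assumes ASDF can locate the re system, and the forms are meant to be evaluated one at a time.

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; Without :exact, the pattern only has to match at the start position.
(re:match-re "%a+" "abc 123")

;; With :exact t the pattern must consume everything up to the end,
;; so the trailing " 123" makes this match fail.
(re:match-re "%a+" "abc 123" :exact t)

;; :start moves the point where matching begins, skipping "abc " here,
;; so the digits are now at the start of the matched region.
(re:match-re "%d+" "abc 123" :start 4)
```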

Once you have successfully matched and have a re-match object, you can use the following reader functions to inspect it: match-string, match-groups, match-pos-start, and match-pos-end.

Try peeking into a match...

CL-USER > (inspect (match-re "(a(b(c)))" "abc 123"))
MATCH          "abc"
GROUPS         ("abc" "bc" "c")
START-POS      0
END-POS        3
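
The four slots shown by the inspector can also be read programmatically through the exported readers match-string, match-groups, match-pos-start, and match-pos-end. A short sketch, assuming ASDF can locate the re system; the values in the comments are the ones from the inspector output above.

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; Collect every reader's value for one match into a list.
(let ((m (re:match-re "(a(b(c)))" "abc 123")))
  (list (re:match-string m)      ; "abc"
        (re:match-groups m)      ; ("abc" "bc" "c")
        (re:match-pos-start m)   ; 0
        (re:match-pos-end m)))   ; 3
```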

Pattern Scanning

To find a pattern match anywhere in a string use the find-re function.

(find-re pattern string &key start end all)

It will scan string looking for matches to pattern. If all is non-nil then a list of all matches found is returned, otherwise it will simply be the first match.

CL-USER > (find-re "%d+" "abc 123")
#<RE-MATCH "123">

CL-USER > (find-re "[^%s]+" "abc 123" :all t)
(#<RE-MATCH "abc">
 #<RE-MATCH "123">)

Splitting by Pattern

Once patterns have been matched, splitting a string from the matches is trivial.

(split-re pattern string &key start end all coalesce-seps)

If all is true, then a list of all sub-sequences in string (delimited by pattern) is returned; otherwise only the first sub-sequence and the rest of the string are returned.

If coalesce-seps is true, empty sub-sequences are excluded from the results. This argument is ignored if all is nil.

CL-USER > (split-re "," "1,2,3")
"1"
"2,3"

CL-USER > (split-re "," "1,2,,,abc,3,," :all t :coalesce-seps t)
("1" "2" "abc" "3")
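
Because the result with all set is an ordinary list of strings, it composes with standard list functions. The sketch below is illustrative: parse-int-list is a hypothetical helper, not part of the package, and the loading forms assume ASDF can locate the re system.

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; PARSE-INT-LIST is a hypothetical helper, not part of the re package.
;; :coalesce-seps t drops the empty pieces produced by repeated commas,
;; so PARSE-INTEGER never sees an empty string.
(defun parse-int-list (string)
  (mapcar #'parse-integer
          (re:split-re "," string :all t :coalesce-seps t)))

(parse-int-list "1,2,,,3")
```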

Replacing by Pattern

The replace-re function scans the string looking for matching sub-sequences that will be replaced with another string.

(replace-re pattern with string &key start end all)

If with is a function, then the function is called with the re-match object, replacing the pattern with the return value. Otherwise the value is used as-is. As with find-re and split-re, if all is true, then the pattern is globally replaced.

CL-USER > (replace-re "%d+" #\* "1 2 3")
"* 2 3"

CL-USER > (replace-re "%a+" #'(lambda (m) (length (match-string m))) "a bc def" :all t)
"1 2 3"

NOTE: The string returned by replace-re is a completely new string. This is true even if pattern isn't found in the string.
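
A replacement function receives the whole re-match object, so it can use any of the readers on it. As a further sketch in the same spirit as the examples above (assuming ASDF can locate the re system), the following upcases every word:

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; Each alphabetic run is replaced by its upper-case form; the text
;; between matches is copied through unchanged.
(re:replace-re "%a+"
               #'(lambda (m) (string-upcase (re:match-string m)))
               "a bc def"
               :all t)
```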

Groups

Using parentheses in a pattern will cause the matching text to be captured as groups in the returned re-match object. The match-groups function will return a list of all the captured strings in the match.

CL-USER > (match-groups (match-re #r/(%d+)(%a+)/ "123abc"))
("123" "abc")

Captures can be nested, but are always returned in the order they are opened.

CL-USER > (match-groups (match-re #r/(a(b(c)))(d)/ "abcd"))
("abc" "bc" "c" "d")

HINT: you can always use the match-string function to get at the full text that was matched and there's no need to capture the entire pattern.

The with-re-match Macro

The with-re-match macro can be used to assist in extracting the matched patterns and groups.

(with-re-match (var match-expr &key no-match) &body body)

If the result of match-expr is nil, then no-match is returned and body is not executed.

While in the body of the macro, $$ will be bound to the match-string and the groups will be bound to $1, $2, ..., $9. Any groups beyond the first 9 are bound in a list to $_. The symbol $* is bound to all the match groups.
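
The no-match argument is convenient for supplying a default. The sketch below is illustrative only: first-number is a hypothetical helper, not part of the package, and the loading forms assume ASDF can locate the re system. It also assumes the code is read and evaluated in a single package, since the macro interns $$ and the other match symbols.

```lisp
;; Assumption: the re system (and its parse dependency) is visible to ASDF.
(require "asdf")
(asdf:load-system "re")

;; FIRST-NUMBER is a hypothetical helper, not part of the re package.
;; When FIND-RE returns NIL the body is skipped and :NONE is returned;
;; otherwise $$ holds the matched text.
(defun first-number (string)
  (re:with-re-match (m (re:find-re "%d+" string) :no-match :none)
    (parse-integer $$)))

(first-number "abc 123")   ; a match is found, so the body runs
(first-number "abc")       ; no digits, so the :no-match value is used
```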

CL-USER > (with-re-match (m (match-re "(%a+)(%s+)(%d+)" "abc 123"))
            (concatenate 'string $3 $2 $1))
"123 abc"

CL-USER > (flet ((initial (m)
                   (with-re-match (v m)
                     (format nil "~@(~a~)." $1))))
            (replace-re #r/(%a)%a+%s*/ #'initial "lisp in small pieces" :all t))
"L.I.S.P."

Additional Features

In addition to supporting all of what Lua pattern matching has to offer, it also supports branching with | and uncaptured groups: (?..). For example...

CL-USER > (match-re "(?a|b)+" "abbaaabbccc")
#<RE-MATCH "abbaaabb">

Finally, the re package has one special feature: user-defined character set predicates! Using %:, you can provide a predicate function for the regexp VM to test characters against.

CL-USER > (match-re #r"%:digit-char-p:+" "103")
#<RE-MATCH "103">

The predicate must take a single character and return non-nil if the character matches. Note: this is especially handy when parsing Unicode strings!

Thank You!

If you get some good use out of this package, please let me know; it's nice to know your work is valued by others.

I'm always improving it; it's the foundation for many of the other packages I've created for JSON parsing, XML parsing, HTTP header parsing, etc.

Should you find/fix a bug or add a nice feature, please feel free to send a pull request or let me know at massung@gmail.com.


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 re

Author

Jeffrey Massung

License

Apache 2.0

Description

Lua-style string pattern matching.

Version

1.0

Dependency

parse

Source

re.asd (file)

Component

re.lisp (file)


3 Files

Files are sorted by type and then listed depth-first from the systems components trees.


3.1 Lisp


3.1.1 re.asd

Location

re.asd

Systems

re (system)

Packages

re-asd


3.1.2 re/re.lisp

Parent

re (system)

Location

re.lisp

Packages

re

4 Packages

Packages are listed by definition order.


4.1 re-asd

Source

re.asd

4.2 re

Source

re.lisp (file)

5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


5.1 Exported definitions


5.1.1 Macros

Macro: with-re (RE PATTERN) &body BODY

Compile pattern if it’s not a RE object and execute body.

Package

re

Source

re.lisp (file)

Macro: with-re-match (MATCH MATCH-EXPR &key NO-MATCH) &body BODY

Intern match symbols to execute a body.

Package

re

Source

re.lisp (file)


5.1.2 Functions

Function: compile-re PATTERN

Create a regular expression from a pattern string.

Package

re

Source

re.lisp (file)

Function: find-re PATTERN S &key ALL START END

Find a regexp pattern match somewhere in a string.

Package

re

Source

re.lisp (file)

Function: match-re PATTERN S &key EXACT START END

Test a pattern re against a string.

Package

re

Source

re.lisp (file)

Function: replace-re PATTERN WITH S &key ALL START END

Replace patterns found within a string with a new value.

Package

re

Source

re.lisp (file)

Function: split-re PATTERN S &key ALL COALESCE-SEPS START END

Split a string into one or more strings by pattern match.

Package

re

Source

re.lisp (file)


5.1.3 Generic functions

Generic Function: match-groups OBJECT
Package

re

Methods
Method: match-groups (RE-MATCH re-match)

automatically generated reader method

Source

re.lisp (file)

Generic Function: match-pos-end OBJECT
Package

re

Methods
Method: match-pos-end (RE-MATCH re-match)

automatically generated reader method

Source

re.lisp (file)

Generic Function: match-pos-start OBJECT
Package

re

Methods
Method: match-pos-start (RE-MATCH re-match)

automatically generated reader method

Source

re.lisp (file)

Generic Function: match-string OBJECT
Package

re

Methods
Method: match-string (RE-MATCH re-match)

automatically generated reader method

Source

re.lisp (file)


5.2 Internal definitions


5.2.1 Functions

Function: copy-re-thread INSTANCE
Package

re

Source

re.lisp (file)

Function: escape STREAM

Return the test and predicate for an escaped character.

Package

re

Source

re.lisp (file)

Function: hex-char-p C

T if c is a hexadecimal character.

Package

re

Source

re.lisp (file)

Function: is-not PRED

Create a predicate that tests the inverse.

Package

re

Source

re.lisp (file)

Function: make-re-thread PC SP GROUPS STACK
Package

re

Source

re.lisp (file)

Function: match S THREAD START OFFSET

Create a re-match from a thread that matched.

Package

re

Source

re.lisp (file)

Function: newline-p C

T if c is a newline character.

Package

re

Source

re.lisp (file)

Function: parse-re PATTERN

Parse a regular expression pattern.

Package

re

Source

re.lisp (file)

Function: punctuation-p C

T if c is a punctuation character.

Package

re

Source

re.lisp (file)

Function: re-boundary G0

The start or end of a string.

Package

re

Source

re.lisp (file)

Function: re-bounds G0

Lua-style %b bounds.

Package

re

Source

re.lisp (file)

Function: re-char G0

Match any character, exact character, or predicate function.

Package

re

Source

re.lisp (file)

Function: re-expr G0

A single character, set, or loop of expressions.

Package

re

Source

re.lisp (file)

Function: re-group G0

Match an optionally captured group.

Package

re

Source

re.lisp (file)

Function: re-parser G0

A regular expression is one or more expressions.

Package

re

Source

re.lisp (file)

Function: re-set G0

Match from a set of characters.

Package

re

Source

re.lisp (file)

Function: re-set-char G0

Valid characters in a character set.

Package

re

Source

re.lisp (file)

Function: re-set-chars G0

Characters, character ranges, and named character sets.

Package

re

Source

re.lisp (file)

Function: re-thread-groups INSTANCE
Function: (setf re-thread-groups) VALUE INSTANCE
Package

re

Source

re.lisp (file)

Function: re-thread-p OBJECT
Package

re

Source

re.lisp (file)

Function: re-thread-pc INSTANCE
Function: (setf re-thread-pc) VALUE INSTANCE
Package

re

Source

re.lisp (file)

Function: re-thread-sp INSTANCE
Function: (setf re-thread-sp) VALUE INSTANCE
Package

re

Source

re.lisp (file)

Function: re-thread-stack INSTANCE
Function: (setf re-thread-stack) VALUE INSTANCE
Package

re

Source

re.lisp (file)

Function: run ()

Execute a regular expression program.

Package

re

Source

re.lisp (file)

Function: space-p C

T if c is a whitespace character.

Package

re

Source

re.lisp (file)

Function: tab-p C

T if c is a tab character.

Package

re

Source

re.lisp (file)

Function: word-char-p C

T if c is alphanumeric or an underscore.

Package

re

Source

re.lisp (file)


5.2.2 Generic functions

Generic Function: re-expression OBJECT
Package

re

Methods
Method: re-expression (RE re)

automatically generated reader method

Source

re.lisp (file)

Generic Function: re-pattern OBJECT
Package

re

Methods
Method: re-pattern (RE re)

automatically generated reader method

Source

re.lisp (file)


5.2.3 Structures

Structure: re-thread ()
Package

re

Source

re.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: pc
Readers

re-thread-pc (function)

Writers

(setf re-thread-pc) (function)

Slot: sp
Readers

re-thread-sp (function)

Writers

(setf re-thread-sp) (function)

Slot: groups
Readers

re-thread-groups (function)

Writers

(setf re-thread-groups) (function)

Slot: stack
Readers

re-thread-stack (function)

Writers

(setf re-thread-stack) (function)


5.2.4 Classes

Class: re ()

Regular expression.

Package

re

Source

re.lisp (file)

Direct superclasses

standard-object (class)

Direct slots
Slot: pattern
Initargs

:pattern

Readers

re-pattern (generic function)

Slot: expr
Initargs

:expression

Readers

re-expression (generic function)

Class: re-match ()

Matched pattern.

Package

re

Source

re.lisp (file)

Direct superclasses

standard-object (class)

Direct slots
Slot: match
Initargs

:match

Readers

match-string (generic function)

Slot: groups
Initargs

:groups

Readers

match-groups (generic function)

Slot: start-pos
Initargs

:start-pos

Readers

match-pos-start (generic function)

Slot: end-pos
Initargs

:end-pos

Readers

match-pos-end (generic function)


Appendix A Indexes


A.1 Concepts

Index Entry  Section

F
File, Lisp, re.asd: The re<dot>asd file
File, Lisp, re/re.lisp: The re/re<dot>lisp file

L
Lisp File, re.asd: The re<dot>asd file
Lisp File, re/re.lisp: The re/re<dot>lisp file

R
re.asd: The re<dot>asd file
re/re.lisp: The re/re<dot>lisp file

A.2 Functions

Index Entry  Section

(
(setf re-thread-groups): Internal functions
(setf re-thread-pc): Internal functions
(setf re-thread-sp): Internal functions
(setf re-thread-stack): Internal functions

C
compile-re: Exported functions
copy-re-thread: Internal functions

E
escape: Internal functions

F
find-re: Exported functions
Function, (setf re-thread-groups): Internal functions
Function, (setf re-thread-pc): Internal functions
Function, (setf re-thread-sp): Internal functions
Function, (setf re-thread-stack): Internal functions
Function, compile-re: Exported functions
Function, copy-re-thread: Internal functions
Function, escape: Internal functions
Function, find-re: Exported functions
Function, hex-char-p: Internal functions
Function, is-not: Internal functions
Function, make-re-thread: Internal functions
Function, match: Internal functions
Function, match-re: Exported functions
Function, newline-p: Internal functions
Function, parse-re: Internal functions
Function, punctuation-p: Internal functions
Function, re-boundary: Internal functions
Function, re-bounds: Internal functions
Function, re-char: Internal functions
Function, re-expr: Internal functions
Function, re-group: Internal functions
Function, re-parser: Internal functions
Function, re-set: Internal functions
Function, re-set-char: Internal functions
Function, re-set-chars: Internal functions
Function, re-thread-groups: Internal functions
Function, re-thread-p: Internal functions
Function, re-thread-pc: Internal functions
Function, re-thread-sp: Internal functions
Function, re-thread-stack: Internal functions
Function, replace-re: Exported functions
Function, run: Internal functions
Function, space-p: Internal functions
Function, split-re: Exported functions
Function, tab-p: Internal functions
Function, word-char-p: Internal functions

G
Generic Function, match-groups: Exported generic functions
Generic Function, match-pos-end: Exported generic functions
Generic Function, match-pos-start: Exported generic functions
Generic Function, match-string: Exported generic functions
Generic Function, re-expression: Internal generic functions
Generic Function, re-pattern: Internal generic functions

H
hex-char-p: Internal functions

I
is-not: Internal functions

M
Macro, with-re: Exported macros
Macro, with-re-match: Exported macros
make-re-thread: Internal functions
match: Internal functions
match-groups: Exported generic functions
match-pos-end: Exported generic functions
match-pos-start: Exported generic functions
match-re: Exported functions
match-string: Exported generic functions
Method, match-groups: Exported generic functions
Method, match-pos-end: Exported generic functions
Method, match-pos-start: Exported generic functions
Method, match-string: Exported generic functions
Method, re-expression: Internal generic functions
Method, re-pattern: Internal generic functions

N
newline-p: Internal functions

P
parse-re: Internal functions
punctuation-p: Internal functions

R
re-boundary: Internal functions
re-bounds: Internal functions
re-char: Internal functions
re-expr: Internal functions
re-expression: Internal generic functions
re-group: Internal functions
re-parser: Internal functions
re-pattern: Internal generic functions
re-set: Internal functions
re-set-char: Internal functions
re-set-chars: Internal functions
re-thread-groups: Internal functions
re-thread-p: Internal functions
re-thread-pc: Internal functions
re-thread-sp: Internal functions
re-thread-stack: Internal functions
replace-re: Exported functions
run: Internal functions

S
space-p: Internal functions
split-re: Exported functions

T
tab-p: Internal functions

W
with-re: Exported macros
with-re-match: Exported macros
word-char-p: Internal functions

A.3 Variables

Index Entry  Section

E
end-pos: Internal classes
expr: Internal classes

G
groups: Internal structures
groups: Internal classes

M
match: Internal classes

P
pattern: Internal classes
pc: Internal structures

S
Slot, end-pos: Internal classes
Slot, expr: Internal classes
Slot, groups: Internal structures
Slot, groups: Internal classes
Slot, match: Internal classes
Slot, pattern: Internal classes
Slot, pc: Internal structures
Slot, sp: Internal structures
Slot, stack: Internal structures
Slot, start-pos: Internal classes
sp: Internal structures
stack: Internal structures
start-pos: Internal classes

A.4 Data types

Index Entry  Section

C
Class, re: Internal classes
Class, re-match: Internal classes

P
Package, re: The re package
Package, re-asd: The re-asd package

R
re: The re system
re: The re package
re: Internal classes
re-asd: The re-asd package
re-match: Internal classes
re-thread: Internal structures

S
Structure, re-thread: Internal structures
System, re: The re system
