The clj-re Reference Manual

This is the clj-re Reference Manual, version 0.1.0, generated automatically by Declt version 3.0 "Montgomery Scott" on Wed Oct 13 10:27:11 2021 GMT+0.



1 Introduction

clj-re - clojure style regular expression functions

This package wraps cl-ppcre's regexp handling, which is nearly identical to java.util.regex.Pattern which in turn is used by Clojure, in a series of regexp supporting functions that attempt to behave like their Clojure namesakes.

It provides the following functions:

#:re-find
#:re-groups
#:re-matcher
#:re-matches
#:re-pattern
#:re-quote-replacement           ;clojure.string/re-quote-replacement
#:re-replace                     ;clojure.string/replace, distinct from clojure.core/replace
#:re-replace-first               ;clojure.string/replace-first
#:re-seq
#:re-split                       ;clojure.string/split

Successfully tested on sbcl and clisp.

Differences from Clojure

clojure.string namespace functions

Clojure has several functions which normally live in the clojure.string namespace. Here they do not reside in a separate Common Lisp package (everything is in the :clj-re package). The clojure.string functions and the names used for them here are:

clojure.string/re-quote-replacement   re-quote-replacement
clojure.string/replace                re-replace
clojure.string/replace-first          re-replace-first
clojure.string/split                  re-split

We could have left replace-first alone, but with every other exported symbol in this package prefixed with 're', it seemed like the consistent thing to do.

No provision for #"pattern" regular expression literals.

Clojure supports a 'regular expression literal' syntax of the form #"pattern" - note the sharpsign. We could have added a #"" readtable syntax here, but it would conflict with the syntax used for C-style literals in the trivial-escapes package. Also, #"" isn't technically a standards-compliant dispatch sequence.

Similarly, the #p readtable entry is customarily used for pathnames ('p' might have been nice for 'patterns'), and #r is used for lisp radix specifications ('r' might have been nice for 'regex'). These are also not spec-compliant for user-defined readtable entries.

With a regular expression literal syntax you do not need to double-escape regular expression constructs such as \d. Without one, you're stuck having to double-escape such constructs, i.e. \\d.

It is really useful to have the additional notation because double-escaping regex constructs gets old really fast. Just remember princ is your friend for debugging escape-related pitfalls.

For example, which regexp string matches the two-backslash string "\\\\" using {n} notation?

`"\\{2}"`
or
`"\\\\{2}"`

Princ makes it clearer. The answer is the second, but once you get a big old string full of \\\\ sequences readability goes into the toilet. Clojure's regular expression literal goes a long way to making it more readable. Perhaps someday we'll have something.
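To check the puzzle above at the REPL, print both candidates with princ, which shows the characters the regex engine actually receives. This is a minimal sketch using only standard Common Lisp; nothing in it is specific to clj-re.

```lisp
;; PRINC prints a string without reader escapes, i.e. the text cl-ppcre parses.
(princ "\\{2}")     ; prints \{2}   (an escaped brace, so no repetition)
(terpri)
(princ "\\\\{2}")   ; prints \\{2}  (an escaped backslash, repeated twice)
(terpri)
```

Only the second string leaves an escaped backslash in front of the {2} quantifier, which is why it is the one that matches two backslashes.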

Named capturing groups (a.k.a. registers) are not supported.

Clojure/Java has them and cl-ppcre has them; this was purely laziness on my part since I never use them.

Usage

(ql:quickload :clj-re)
(use-package :clj-re)
(re-find "a*b" "aaab") => "aaab"

or to test

(ql:quickload :clj-re-test)
(clj-re-test:run-tests)

See unit tests for more examples.

Adjusting for the previously mentioned caveats:

  1. Renaming of functions from the clojure.string namespace.
  2. Replacing regexp literal syntax (#"") with doubly-escaped string regexps.
  3. Replacement of vector results with list results.
  4. Missing support for named capture groups.

You will hopefully find this sufficient for casual Clojureish regexp needs. If you're going to do performance/memory critical stuff, I suggest you learn to use cl-ppcre directly because issues like string sharing and pattern compilation may be important for your app.
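For the performance-sensitive case mentioned above, here is a minimal sketch of calling cl-ppcre directly. It compiles the pattern once with cl-ppcre:create-scanner and reuses the scanner across calls. The names *digits* and first-number are hypothetical, chosen only for illustration, and the sketch assumes Quicklisp is available.

```lisp
(ql:quickload :cl-ppcre)

;; Compile the pattern once; the scanner can be reused for every input.
(defparameter *digits* (cl-ppcre:create-scanner "\\d+"))

(defun first-number (string)
  "Return the first run of digits in STRING as a fresh string, or NIL."
  ;; Passing :sharedp t would instead allow the result to share
  ;; structure with STRING, which avoids a copy.
  (cl-ppcre:scan-to-strings *digits* string))

(first-number "order 66 shipped")   ; => "66"
(first-number "no numbers here")    ; => NIL
```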



2 Systems

The main system appears first, followed by any subsystem dependency.



2.1 clj-re

Author

Dave Tenny

License

MIT

Description

Implements Clojure-styled regexp operations such as ‘re-matches‘ and ‘re-find‘.

Version

0.1.0

Dependency

cl-ppcre

Source

clj-re.asd (file)

Components


3 Files

Files are sorted by type and then listed depth-first from the systems components trees.



3.1 Lisp



3.1.1 clj-re.asd

Location

clj-re.asd

Systems

clj-re (system)

Packages

clj-re-asd



3.1.2 clj-re/package.lisp

Parent

clj-re (system)

Location

package.lisp

Packages

clj-re



3.1.3 clj-re/clj-re.lisp

Parent

clj-re (system)

Location

clj-re.lisp

Exported Definitions
Internal Definitions


4 Packages

Packages are listed by definition order.



4.1 clj-re-asd

Source

clj-re.asd

Use List


4.2 clj-re

Functions that implement Clojure style regexp operations.

Source

package.lisp (file)

Use List
Exported Definitions
Internal Definitions


5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.



5.1 Exported definitions



5.1.1 Functions

Function: re-find MATCHER-OR-REGEXP &optional STRING

Attempts to find the next subsequence of the input sequence that matches the pattern, as per java.util.regex.Matcher.find(). Uses re-groups to return the groups.

Returns:
* If there is no match, nil.
* If there is a match, but no groups, returns the match. E.g.
(re-find "a*b" "ab") => "ab"
* If there is a match and groups are involved, returns a list whose car is the full match, and whose remaining elements are the matched groups.

Call as ‘(re-find matcher)‘ or ‘(re-find regexp string)‘.

Note that repeated calls to a matcher act like an iterator, while repeated calls with regexp and string arguments do not.
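The iterator-like behaviour of a matcher can be sketched as below. The helper all-matches is a hypothetical name, not part of clj-re; it relies only on the documented behaviour that re-find on a matcher returns nil once there are no more matches. Quicklisp is assumed, as in the Usage section.

```lisp
(ql:quickload :clj-re)

(defun all-matches (regexp string)
  "Collect successive RE-FIND results from a single matcher over STRING."
  (let ((m (clj-re:re-matcher regexp string)))
    ;; Each call on the same matcher advances to the next match;
    ;; the loop stops at the first NIL.
    (loop for match = (clj-re:re-find m)
          while match
          collect match)))

(all-matches "\\d+" "a1b22c333")
```

Calling (clj-re:re-find "\\d+" "a1b22c333") repeatedly with regexp and string arguments would, by contrast, start from the beginning of the string each time.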

Package

clj-re

Source

clj-re.lisp (file)

Function: re-groups MATCHER

Returns the groups from the most recent call to ‘re-find‘ or ‘re-matches‘. If there are no nested groups, returns a string of the entire match. If there are
nested groups, returns a list of the groups, the first element
being the entire match. Returns nil if there is no match indicated.
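As an illustration of the description above, this sketch runs re-find on a matcher and then asks the same matcher for its groups. The helper groups-of is a hypothetical name used only for this example.

```lisp
(ql:quickload :clj-re)

(defun groups-of (regexp string)
  "Return RE-GROUPS for the first match of REGEXP in STRING, or NIL."
  (let ((m (clj-re:re-matcher regexp string)))
    ;; RE-GROUPS reports on the most recent RE-FIND, so find first.
    (when (clj-re:re-find m)
      (clj-re:re-groups m))))

;; Two capturing groups: per the description, the result should be a list
;; whose first element is the whole match.
(groups-of "(\\w+)@(\\w+)" "mail bob@example now")
```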

Package

clj-re

Source

clj-re.lisp (file)

Function: re-matcher REGEXP STRING

Returns a matcher for use in operations such as ‘re-groups‘ or ‘re-find‘.

Package

clj-re

Source

clj-re.lisp (file)

Function: re-matches REGEXP STRING

Attempts to match the _entire region_ of string against the pattern as per java.util.regex.Matcher.matches(). Uses re-groups to return the groups. Returns nil if the pattern doesn’t match the entire string.
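The difference between re-matches and re-find can be seen with a pattern that matches only a prefix of the input. This is a small sketch based on the descriptions in this manual, assuming Quicklisp is available.

```lisp
(ql:quickload :clj-re)

;; The whole string is matched by the pattern.
(clj-re:re-matches "a*b" "aaab")

;; The trailing "c" is not covered by the pattern, so this is NIL.
(clj-re:re-matches "a*b" "aaabc")

;; RE-FIND only needs a matching subsequence, so it still succeeds.
(clj-re:re-find "a*b" "aaabc")
```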

Package

clj-re

Source

clj-re.lisp (file)

Function: re-pattern S

In Clojure: returns an instance of java.util.regex.Pattern, for use, e.g., in re-matcher. In Common Lisp: just returns the input string, as we don’t currently have a compiled pattern API. If we did, we’d return a cl-ppcre scanner. Meanwhile, ’s’ can be passed to re-matcher.
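Code ported from Clojure can keep its re-pattern calls unchanged. A minimal sketch, assuming Quicklisp is available:

```lisp
(ql:quickload :clj-re)

;; RE-PATTERN is documented to hand back its argument, so P is a string
;; that can be given to RE-MATCHER (or any other function taking a regexp).
(let* ((p (clj-re:re-pattern "\\d+"))
       (m (clj-re:re-matcher p "x 42 y")))
  (clj-re:re-find m))
```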

Package

clj-re

Source

clj-re.lisp (file)

Function: re-quote-replacement REPLACEMENT-STRING

In clojure this would be ‘clojure.string/re-quote-replacement‘.

Given a replacement string that you wish to be a literal
replacement for a pattern match in ‘re-replace‘ or ‘re-replace-first‘, do the necessary escaping of special characters in the replacement.
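A typical case is a replacement that contains a dollar sign, which would otherwise be read as a group reference. The following is a sketch under the semantics described in this manual; the input strings are made up for illustration.

```lisp
(ql:quickload :clj-re)

;; Without quoting, "$1" in the replacement would refer to capture group 1.
;; Quoting asks for the text "$1.50" to be inserted literally.
(clj-re:re-replace "cost: N" "N"
                   (clj-re:re-quote-replacement "$1.50"))
```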

Package

clj-re

Source

clj-re.lisp (file)

Function: re-replace STRING MATCH REPLACEMENT

In clojure this would be ‘clojure.string/replace‘.

Replaces all instances of ’match’ with ’replacement’ in ’string’.
If there are no matches, the input ’string’ value is returned.

Note that if you want more power, use ‘cl-ppcre:regex-replace[-all]‘ instead, but note that
it has different replacement directives, which are disabled here for Clojure compatibility.

‘match‘/‘replacement‘ can be:

string / string
char / char
pattern / (string or function of match).

See also ‘re-replace-first‘.

Note that, at least at present, this function doesn’t know whether ‘match‘ is a pattern or a string that isn’t meant to be a pattern, since we presently use strings
for both. So ‘match‘ will always be treated as a pattern. If you want it not to be interpreted as a pattern, use ‘(cl-ppcre:quote-meta-chars match)‘. *FINISH* may need something different like re-quote-match??

If replacement is a function it should take one argument (the match to be replaced)
and return the replacement value. Note that the argument could be a list if the pattern contains capturing groups (as per ‘re-groups‘), i.e. ‘(match, register1, register2, ...)‘.

The ‘replacement‘ is literal (i.e. none of its characters are treated
specially) for all cases above except pattern / string.

For pattern / string, $1, $2, etc. in the replacement string are
substituted with the string that matched the corresponding
parenthesized group in the pattern. If you wish your replacement
string ’r’ to be used literally, use ‘(re-quote-replacement r)‘ as the
replacement argument. See also documentation for
java.util.regex.Matcher’s appendReplacement method.

Example:
(re-replace "Almost Pig Latin" "\\b(\\w)(\\w+)\\b" "$2$1ay")
-> "lmostAay igPay atinLay"
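The function form of the replacement argument can be sketched as follows. The pattern here has no capturing groups, so, per the description above, the function should receive the matched string itself. The lambda and the input are illustrative only.

```lisp
(ql:quickload :clj-re)

;; Wrap every run of digits in angle brackets. The function receives the
;; match and returns the text to substitute for it.
(clj-re:re-replace "a1b22" "\\d+"
                   (lambda (match)
                     (format nil "<~a>" match)))
```

If the pattern contained capturing groups, the argument would instead be a list of the form (match register1 register2 ...), so the function would need to destructure it.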

Package

clj-re

Source

clj-re.lisp (file)

Function: re-replace-first STRING MATCH REPLACEMENT

In clojure this would be ‘clojure.string/replace-first‘.

Replaces the _first_ instance of ’match’ with ’replacement’ in ’string’. See ‘re-replace‘ for argument syntax and semantics.
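A short side-by-side sketch of the two functions on the same input, assuming Quicklisp is available:

```lisp
(ql:quickload :clj-re)

;; Only the first "-" is replaced.
(clj-re:re-replace-first "a-b-c" "-" "+")

;; Every "-" is replaced.
(clj-re:re-replace "a-b-c" "-" "+")
```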

Package

clj-re

Source

clj-re.lisp (file)

Function: re-seq REGEXP STRING

Returns a list of successive matches of pattern in string as by
using java.util.regex.Matcher.find(), each such match processed with re-groups. Note that the clojure version would return a lazy sequence, but we don’t have those.
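Since each match is processed with re-groups, the shape of the elements depends on whether the pattern has capturing groups. A sketch with made-up inputs:

```lisp
(ql:quickload :clj-re)

;; No groups: each element should be the matched string.
(clj-re:re-seq "\\d+" "a1b22c333")

;; With groups: each element should be a list of the full match
;; followed by its groups.
(clj-re:re-seq "(\\w)=(\\d)" "a=1 b=2")
```

Because the result is an ordinary list, every match is computed before re-seq returns, which matters for very large inputs.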

Package

clj-re

Source

clj-re.lisp (file)

Function: re-split STRING REGEXP &optional LIMIT

Splits string on a regular expression. Optional argument limit is
the maximum number of splits. Returns list of the splits.

Resulting strings do not share structure with the input.

Note that capture groups (cl-ppcre registers) have no effect on the operation except perhaps to make it perform more slowly.
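Note the argument order: the string comes first and the regexp second, as in clojure.string/split. A minimal sketch, with illustrative inputs:

```lisp
(ql:quickload :clj-re)

;; Split on a comma followed by optional whitespace.
(clj-re:re-split "a, b,c" ",\\s*")

;; The optional third argument bounds the splitting.
(clj-re:re-split "a, b,c" ",\\s*" 2)
```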

Package

clj-re

Source

clj-re.lisp (file)



5.2 Internal definitions



5.2.1 Functions

Function: clojure-replacement-translation REPLACEMENT-STRING

Translate clojure-style replacement operations, e.g. $1, into cl-ppcre replacement operations, e.g. \1. Do NOT do the translations if the replacement string has been quoted by the user via re-quote-replacement. Presently assumes replacement-string has NOT been quoted with cl-ppcre:quote-meta-chars, but more selectively quoted just to ’literalize’ cl-ppcre replacement directives like \N.
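The core of such a translation can be sketched with cl-ppcre itself. This is not the library's implementation: dollar-to-backslash is a hypothetical name, and the sketch ignores the quoting rules described above (it rewrites every $N it sees).

```lisp
(ql:quickload :cl-ppcre)

(defun dollar-to-backslash (replacement)
  "Rewrite Clojure-style $N group references in REPLACEMENT as \\N."
  (cl-ppcre:regex-replace-all
   "\\$(\\d+)" replacement
   ;; With :SIMPLE-CALLS, the function receives the match and each
   ;; register as strings.
   (lambda (match digits)
     (declare (ignore match))
     (concatenate 'string "\\" digits))
   :simple-calls t))

;; PRINC shows the result without reader escapes: \2\1ay
(princ (dollar-to-backslash "$2$1ay"))
```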

Package

clj-re

Source

clj-re.lisp (file)

Function: copy-matcher INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: make-matcher &key (SCANNER SCANNER) (STRING STRING) (DONE? DONE?) (MATCH-START MATCH-START) (MATCH-END MATCH-END) (REG-STARTS REG-STARTS) (REG-ENDS REG-ENDS)
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-done? INSTANCE
Function: (setf matcher-done?) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-match-end INSTANCE
Function: (setf matcher-match-end) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-match-start INSTANCE
Function: (setf matcher-match-start) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-p OBJECT
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-reg-ends INSTANCE
Function: (setf matcher-reg-ends) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-reg-starts INSTANCE
Function: (setf matcher-reg-starts) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-scanner INSTANCE
Function: (setf matcher-scanner) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: matcher-string INSTANCE
Function: (setf matcher-string) VALUE INSTANCE
Package

clj-re

Source

clj-re.lisp (file)

Function: next MATCHER

Find the next match. Returns nil if there aren’t any, or T if there are, in which case the matcher values are updated.

Package

clj-re

Source

clj-re.lisp (file)

Function: ppcre-replacement-quoter REPLACEMENT

Given a string which may contain replacement directives for cl-ppcre:regex-replace[-all], quote them, since they have no meaning under clojure replacement syntax and/or semantics. The directives we’re looking for are: \N or \{N} (for some digit N), \&, \` and \'.

Package

clj-re

Source

clj-re.lisp (file)

Function: re-replace-aux STRING MATCH REPLACEMENT REPLACE-FN

Does the work of re-replace and re-replace-first.
The logic is identical between the two except for which cl-ppcre function is called, as specified by the last argument.

Package

clj-re

Source

clj-re.lisp (file)



5.2.2 Structures

Structure: matcher ()

Emulate java Matcher binding to a string to be scanned.

Package

clj-re

Source

clj-re.lisp (file)

Direct superclasses

structure-object (structure)

Direct slots
Slot: scanner
Readers

matcher-scanner (function)

Writers

(setf matcher-scanner) (function)

Slot: string
Type

string

Initform

""

Readers

matcher-string (function)

Writers

(setf matcher-string) (function)

Slot: done?
Readers

matcher-done? (function)

Writers

(setf matcher-done?) (function)

Slot: match-start
Type

integer

Initform

0

Readers

matcher-match-start (function)

Writers

(setf matcher-match-start) (function)

Slot: match-end
Type

integer

Initform

0

Readers

matcher-match-end (function)

Writers

(setf matcher-match-end) (function)

Slot: reg-starts
Type

(array integer)

Initform

#()

Readers

matcher-reg-starts (function)

Writers

(setf matcher-reg-starts) (function)

Slot: reg-ends
Type

(array integer)

Initform

#()

Readers

matcher-reg-ends (function)

Writers

(setf matcher-reg-ends) (function)
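Read together, the slot listing above corresponds to a structure definition along the following lines. This is a reconstruction from the listed slot names, types and initforms only; the actual defstruct form in clj-re.lisp may differ in options, ordering or documentation.

```lisp
;; Hedged reconstruction of the MATCHER structure from the slot listing.
;; DEFSTRUCT also generates MAKE-MATCHER, COPY-MATCHER, MATCHER-P and the
;; MATCHER-<slot> accessors that appear under the internal functions.
(defstruct matcher
  scanner                                  ; no type or initform listed
  (string "" :type string)
  done?                                    ; no type or initform listed
  (match-start 0 :type integer)
  (match-end 0 :type integer)
  (reg-starts #() :type (array integer))
  (reg-ends #() :type (array integer)))

(let ((m (make-matcher :string "aaab")))
  (list (matcher-string m) (matcher-match-start m) (matcher-done? m)))
;; => ("aaab" 0 NIL)
```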



Appendix A Indexes



A.1 Concepts

Index Entry  Section

C
clj-re.asd: The clj-re․asd file
clj-re/clj-re.lisp: The clj-re/clj-re․lisp file
clj-re/package.lisp: The clj-re/package․lisp file

F
File, Lisp, clj-re.asd: The clj-re․asd file
File, Lisp, clj-re/clj-re.lisp: The clj-re/clj-re․lisp file
File, Lisp, clj-re/package.lisp: The clj-re/package․lisp file

L
Lisp File, clj-re.asd: The clj-re․asd file
Lisp File, clj-re/clj-re.lisp: The clj-re/clj-re․lisp file
Lisp File, clj-re/package.lisp: The clj-re/package․lisp file


A.2 Functions

Index Entry  Section

(
(setf matcher-done?): Internal functions
(setf matcher-match-end): Internal functions
(setf matcher-match-start): Internal functions
(setf matcher-reg-ends): Internal functions
(setf matcher-reg-starts): Internal functions
(setf matcher-scanner): Internal functions
(setf matcher-string): Internal functions

C
clojure-replacement-translation: Internal functions
copy-matcher: Internal functions

F
Function, (setf matcher-done?): Internal functions
Function, (setf matcher-match-end): Internal functions
Function, (setf matcher-match-start): Internal functions
Function, (setf matcher-reg-ends): Internal functions
Function, (setf matcher-reg-starts): Internal functions
Function, (setf matcher-scanner): Internal functions
Function, (setf matcher-string): Internal functions
Function, clojure-replacement-translation: Internal functions
Function, copy-matcher: Internal functions
Function, make-matcher: Internal functions
Function, matcher-done?: Internal functions
Function, matcher-match-end: Internal functions
Function, matcher-match-start: Internal functions
Function, matcher-p: Internal functions
Function, matcher-reg-ends: Internal functions
Function, matcher-reg-starts: Internal functions
Function, matcher-scanner: Internal functions
Function, matcher-string: Internal functions
Function, next: Internal functions
Function, ppcre-replacement-quoter: Internal functions
Function, re-find: Exported functions
Function, re-groups: Exported functions
Function, re-matcher: Exported functions
Function, re-matches: Exported functions
Function, re-pattern: Exported functions
Function, re-quote-replacement: Exported functions
Function, re-replace: Exported functions
Function, re-replace-aux: Internal functions
Function, re-replace-first: Exported functions
Function, re-seq: Exported functions
Function, re-split: Exported functions

M
make-matcher: Internal functions
matcher-done?: Internal functions
matcher-match-end: Internal functions
matcher-match-start: Internal functions
matcher-p: Internal functions
matcher-reg-ends: Internal functions
matcher-reg-starts: Internal functions
matcher-scanner: Internal functions
matcher-string: Internal functions

N
next: Internal functions

P
ppcre-replacement-quoter: Internal functions

R
re-find: Exported functions
re-groups: Exported functions
re-matcher: Exported functions
re-matches: Exported functions
re-pattern: Exported functions
re-quote-replacement: Exported functions
re-replace: Exported functions
re-replace-aux: Internal functions
re-replace-first: Exported functions
re-seq: Exported functions
re-split: Exported functions


A.3 Variables

Index Entry  Section

D
done?: Internal structures

M
match-end: Internal structures
match-start: Internal structures

R
reg-ends: Internal structures
reg-starts: Internal structures

S
scanner: Internal structures
Slot, done?: Internal structures
Slot, match-end: Internal structures
Slot, match-start: Internal structures
Slot, reg-ends: Internal structures
Slot, reg-starts: Internal structures
Slot, scanner: Internal structures
Slot, string: Internal structures
string: Internal structures


A.4 Data types

Index Entry  Section

C
clj-re: The clj-re system
clj-re: The clj-re package
clj-re-asd: The clj-re-asd package

M
matcher: Internal structures

P
Package, clj-re: The clj-re package
Package, clj-re-asd: The clj-re-asd package

S
Structure, matcher: Internal structures
System, clj-re: The clj-re system
