The lol-re Reference Manual

This is the lol-re Reference Manual, version 0.1, generated automatically by Declt version 2.3 "Robert April" on Tue Feb 20 09:02:07 2018 GMT+0.


1 Introduction

lol-re

A tiny wrapper around CL-PPCRE that makes regexp usage more Perl-like. Inspired by the #~m and #~s read macros from Let Over Lambda (http://www.letoverlambda.com).

This package introduces two car-reader-macros (see CL-READ-MACRO-TOKENS), M~ and S~, plus a third one, MR~ (see below). M~ is for matching, MR~ is for matching with autoreset, and S~ is for substitution (in direct analogy with Perl's '=~ m//' and '=~ s///' idioms).

M~

Syntax:

m~ regexp [string] => function | (or null string)

Basic example:

(with-open-file (out-file "out-file" :direction :output)
  (iter (for line in-file "in-file" using #'read-line)
        (and (m~ "(some)regexp(?<with>with)grouping(s)" line)
             ;; plenty of anaphoric bindings are available after the match
             (format out-file #?"$($1) $($2) $($with) $($3)"))))

The first argument of M~ is read with the reader of #?r" installed for the " character. Thus, you do not have to escape the backslashes that appear in front of regexp metacharacters. Unfortunately, interpolation is not supported for now. Only double quotes can be used as a delimiter, since otherwise Emacs lisp-mode would go crazy when seeing something like this: /"/

When called with one argument (the regexp), M~ expands into a closure that accepts a string. When the closure is called repeatedly with the same string, it returns subsequent matches of the regexp in that string.

LOL-RE> (defparameter matcher (m~ "[a-z]"))
LOL-RE> (funcall matcher "asdf")
"a"
LOL-RE> (funcall matcher "asdf")
"s"
LOL-RE> (funcall matcher "asdf")
"d"
LOL-RE> (funcall matcher "asdf")
"f"
LOL-RE> (funcall matcher "asdf")
NIL
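
The position-remembering behavior can be pictured with an ordinary closure. The following is only a sketch built directly on CL-PPCRE, not the actual expansion of M~: MAKE-STATEFUL-MATCHER is a hypothetical name, no anaphoric variables are set, and Quicklisp is assumed to be available for loading CL-PPCRE.

```lisp
;; Assumption: Quicklisp is installed; CL-PPCRE is the only dependency.
(ql:quickload "cl-ppcre")

;; MAKE-STATEFUL-MATCHER is a hypothetical helper, not part of lol-re.
(defun make-stateful-matcher (regex)
  "Return a closure that yields successive matches of REGEX in its argument."
  (let ((scanner (cl-ppcre:create-scanner regex))
        (pos 0))                        ; where the next scan starts
    (lambda (arg)
      (cond ((eq arg :reset)
             (setf pos 0)
             t)
            ((> pos (length arg))       ; stale position, nothing to scan
             nil)
            (t
             (multiple-value-bind (start end)
                 (cl-ppcre:scan scanner arg :start pos)
               (when start
                 ;; Step past zero-length matches so the closure always advances.
                 (setf pos (if (= start end) (1+ end) end))
                 (subseq arg start end))))))))

(defparameter *matcher* (make-stateful-matcher "[a-z]"))
(funcall *matcher* "asdf")   ; => "a"
(funcall *matcher* "asdf")   ; => "s"
(funcall *matcher* :reset)   ; => T
(funcall *matcher* "bar")    ; => "b"
```

The only state is the POS variable captured by the closure, which is why the string passed on the next call matters so much.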

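When the same matcher is called with different strings, the behavior may be surprising, because the remembered position carries over from one string to the next.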

LOL-RE> (defparameter matcher (m~ "[a-z]"))
LOL-RE> (funcall matcher "foo")
"f"
LOL-RE> (funcall matcher "bar") ; matching starts from the position where the first match finished
"a"

However, when the closure is called with the :RESET keyword, its internal position counter is reset, so it can then be called on a new string (see also MR~ below).

LOL-RE> (defparameter matcher (m~ "[a-z]"))
LOL-RE> (funcall matcher "foo")
"f"
LOL-RE> (funcall matcher :reset)
T
LOL-RE> (funcall matcher "bar")
"b"

When called with two arguments (regexp and string), M~ expands into the application of a matching closure to that string, so it returns the first match of the regexp in that string.

LOL-RE> (m~ "[0-9]+" "foo123bar456")
"123"
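
For comparison, the closest plain CL-PPCRE call is SCAN-TO-STRINGS, which returns the first match as its primary value. This is only an analogy (assuming CL-PPCRE is loaded through Quicklisp): unlike M~, it sets no anaphoric variables, and backslashes in the regexp have to be doubled.

```lisp
;; Assumption: Quicklisp is installed; CL-PPCRE is the only dependency.
(ql:quickload "cl-ppcre")

;; Primary value is the first match; the second value is a vector of
;; register substrings (empty here, since the regexp has no groups).
(cl-ppcre:scan-to-strings "[0-9]+" "foo123bar456")
;; => "123", #()

;; The same match written with \d; the backslash must be doubled
;; in an ordinary Lisp string.
(cl-ppcre:scan-to-strings "\\d+" "foo123bar456")
;; => "123", #()
```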

M~ also sets some anaphoric bindings (as seen in the basic example):

Since all those anaphoric bindings are (by default) global dynamic variables

MR~

Since the M~ macro generates a matcher closure that remembers the position from which to perform the next match, it may behave strangely in seemingly obvious situations.

LOL-RE> (dolist (elt '("1" "2" "3"))
          (format t "~a" (m~ "[0-9]" elt)))
"1"
NIL
NIL

This is because, after the first match, the remembered position is already 1, so further matches do not succeed.

The more intuitive behavior, in which every call starts matching from the beginning of its string, is provided by the MR~ macro (from "match resettingly"), which resets its position to 0 before performing each new match.

LOL-RE> (dolist (elt '("1" "2" "3"))
          (format t "~a" (mr~ "[0-9]" elt)))
"1"
"2"
"3"
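
The difference between the two macros can be sketched without any regexp machinery. The code below is an illustration only: MAKE-DIGIT-MATCHER and its :AUTO-RESET argument are hypothetical, and the "regexp" is hard-wired to a single digit found with POSITION-IF.

```lisp
;; MAKE-DIGIT-MATCHER is a hypothetical stand-in for a matcher closure;
;; it finds single digits with POSITION-IF instead of a real regexp.
(defun make-digit-matcher (&key auto-reset)
  (let ((pos 0))                        ; position of the next search
    (lambda (str)
      (when auto-reset                  ; MR~-like: forget the old position
        (setf pos 0))
      (let ((hit (and (<= pos (length str))
                      (position-if #'digit-char-p str :start pos))))
        (when hit
          (setf pos (1+ hit))
          (string (char str hit)))))))

;; M~-like: one closure, position carried over between strings.
(mapcar (make-digit-matcher) '("1" "2" "3"))
;; => ("1" NIL NIL)

;; MR~-like: position reset before each match.
(mapcar (make-digit-matcher :auto-reset t) '("1" "2" "3"))
;; => ("1" "2" "3")
```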

Of course, the most intuitive solution (and the one with the greatest performance penalty) would be for the matcher generated by the M~ macro to maintain a hash table of positions, one for each string being matched. However, the behavior in this example

LOL-RE> (dolist (elt '("a" "a" "a"))
          (format t "~a" (mr~ "[0-9]" elt)))
???

would then crucially depend on whether the strings share structure, which may in turn depend on details of the compiler. That is even more obscure than the current situation with two macros (M~ and MR~), each of which behaves in a definite way.

Iterate drivers

The system also defines two drivers for iterate: IN-MATCHES-OF and MATCHING.

(iter (for match in-matches-of "asdf" using (m~ "[a-z]([a-z])"))
      (collect `(,match ,$0 ,$1)))
(("as" "as" "s") ("df" "df" "f"))

As the example shows, IN-MATCHES-OF iterates over all matches of a given regexp in a given string. Both the string and the regexp are evaluated only once, during the initialization of the loop.
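
For readers more familiar with plain CL-PPCRE, a similar loop over all matches can be written with DO-SCANS. This is a rough analogue under stated assumptions, not how the driver is implemented: MATCHES-WITH-FIRST-GROUP is a hypothetical helper, Quicklisp is assumed for loading CL-PPCRE, and the regexp is assumed to have a first register that participates in every match.

```lisp
;; Assumption: Quicklisp is installed; CL-PPCRE is the only dependency.
(ql:quickload "cl-ppcre")

;; MATCHES-WITH-FIRST-GROUP is a hypothetical helper, not part of lol-re.
(defun matches-with-first-group (regex string)
  "Collect (whole-match first-group) pairs for every match of REGEX in STRING."
  (let ((result '()))
    (cl-ppcre:do-scans (match-start match-end reg-starts reg-ends
                        regex string (nreverse result))
      (push (list (subseq string match-start match-end)
                  (subseq string (aref reg-starts 0) (aref reg-ends 0)))
            result))))

(matches-with-first-group "[a-z]([a-z])" "asdf")
;; => (("as" "s") ("df" "f"))
```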

In contrast, the first example could be rewritten using the MATCHING driver as follows:

(with-open-file (out-file "out-file" :direction :output)
  (iter (for line in-file "in-file" using #'read-line)
        (for match matching line using (m~ "(some)regexp(?<with>with)grouping(s)"))
        (format out-file #?"$($1) $($2) $($with) $($3)")))

but there are a couple of important things that MATCHING does differently:

So, MATCHING is more-or-less analogous to

(let ((matcher (m~ "(some)regexp(?<with>with)grouping(s)")))
  (with-open-file (out-file "out-file" :direction :output)
    (iter (for line in-file "in-file" using #'read-line)
          (funcall matcher line)
          (format out-file #?"$($1) $($2) $($with) $($3)"))))

TODO:

For more usage patterns, see the tests.lisp and use-cases.lisp files. use-cases.lisp was assembled by grepping some Quicklisp-available libraries and rewriting the pieces that use CL-PPCRE with the help of M~ and S~.

S~

For now, S~ can only replace the first occurrence of a match. Still, there is no need to escape all those backslashes.

LOL-RE> (s~ "(\d{4})-(\d{2})-(\d{2})" "\3/\2/\1" "2014-04-07")
07/04/2014

When called with just two arguments, S~ generates a replacer closure.

LOL-RE> (funcall (s~ "(\d{4})-(\d{2})-(\d{2})" "\3/\2/\1") "2014-04-07")
07/04/2014
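
To see what the unescaped backslashes save, here is roughly the same substitution written against plain CL-PPCRE. It is a sketch for comparison only, assuming CL-PPCRE is loaded through Quicklisp; MAKE-REPLACER is a hypothetical helper and says nothing about how S~ is implemented.

```lisp
;; Assumption: Quicklisp is installed; CL-PPCRE is the only dependency.
(ql:quickload "cl-ppcre")

;; Replace only the first match; note the doubled backslashes.
(cl-ppcre:regex-replace "(\\d{4})-(\\d{2})-(\\d{2})" "2014-04-07" "\\3/\\2/\\1")
;; => "07/04/2014", T

;; A replacer closure in the spirit of the two-argument form of S~.
;; MAKE-REPLACER is a hypothetical helper, not part of lol-re.
(defun make-replacer (regex replacement)
  (let ((scanner (cl-ppcre:create-scanner regex)))
    (lambda (string)
      ;; VALUES keeps only the resulting string, dropping the match flag.
      (values (cl-ppcre:regex-replace scanner string replacement)))))

(funcall (make-replacer "(\\d{4})-(\\d{2})-(\\d{2})" "\\3/\\2/\\1") "2014-04-07")
;; => "07/04/2014"
```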

TODO:

re-local

The purpose of this macro is to tackle issues that arise in multithreaded code.

(re-local (all-the-variables like $1 $2 $a)
          (arising-from-use-of-m~)
          (are-declared-local-special)
          (inside-re-local))

So

;; this is not thread-safe, as some other thread may corrupt $1 before PRINC gets executed
(and (m~ "foo(bar)") (princ $1))

But

;; this is (supposedly) thread-safe, as all the relevant variables are implicitly
;; rebound as local dynamic, which are per-thread
(re-local (and (m~ "foo(bar)") (princ $1)))
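
The effect of rebinding can be illustrated in plain Common Lisp, without lol-re. In the sketch below *GROUP-1* is a hypothetical stand-in for a register variable such as $1, and FAKE-MATCH stands in for a match that assigns it. That such bindings are per-thread is a property of multithreaded implementations rather than of the ANSI standard, so the example only demonstrates the isolation of the binding itself.

```lisp
;; *GROUP-1* is a hypothetical stand-in for a register variable such as $1.
(defvar *group-1* nil)

(defun fake-match (value)
  "Pretend a match happened by assigning the 'register' variable."
  (setf *group-1* value))

(defun match-in-local-scope (value)
  ;; LET creates a fresh dynamic binding for the duration of its body.
  ;; The SETF inside FAKE-MATCH affects this binding, not the global value.
  (let ((*group-1* nil))
    (fake-match value)
    *group-1*))

(match-in-local-scope "bar")  ; => "bar"
*group-1*                     ; => NIL, the global value was never touched
```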

How it works: RE-LOCAL code-walks the body (with the help of HU.DWIM.WALKER) with M~ and S~ redefined as MACROLETs. These have the same expansion, but with the side effect of telling RE-LOCAL which variables they are going to initialize.

N.B.: If M~ were defined entirely as a read-time macro, it would not be possible to write RE-LOCAL even with code walking, since it is not possible (read: very hard and ugly) to read a form twice from a stream. So I won't define M~ as a read-time macro, at the cost of sometimes having to write FUNCALL.

Gotchas


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 lol-re

Author

Alexander Popolitov <popolit@gmail.com>

License

GPL

Description

Small set of wrappers around CL-PPCRE in spirit of Let Over Lambda.

Version

0.1

Dependencies
Source

lol-re.asd (file)

Components


3 Files

Files are sorted by type and then listed depth-first from the systems components trees.


3.1 Lisp


3.1.1 lol-re.asd

Location

lol-re.asd

Systems

lol-re (system)


3.1.2 lol-re/syntax.lisp

Parent

lol-re (system)

Location

syntax.lisp

Packages

lol-re.syntax

Internal Definitions

3.1.3 lol-re/package.lisp

Dependency

syntax.lisp (file)

Parent

lol-re (system)

Location

package.lisp

Packages

lol-re


3.1.4 lol-re/lol-re.lisp

Dependency

package.lisp (file)

Parent

lol-re (system)

Location

lol-re.lisp

Exported Definitions
Internal Definitions

4 Packages

Packages are listed by definition order.


4.1 lol-re.syntax

Source

syntax.lisp (file)

Use List

common-lisp

Internal Definitions

4.2 lol-re

Source

package.lisp (file)

Use List
Exported Definitions
Internal Definitions

5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


5.1 Exported definitions


5.1.1 Macros

Macro: mr~ REGEX-SPEC &optional ARGUMENT
Package

lol-re

Source

lol-re.lisp (file)

Macro: m~ REGEX-SPEC &optional ARGUMENT
Package

lol-re

Source

lol-re.lisp (file)

Macro: re-local &body BODY
Package

lol-re

Source

lol-re.lisp (file)

Macro: s~ REGEX-SPEC TARGET-SPEC &optional ARGUMENT
Package

lol-re

Source

lol-re.lisp (file)


5.2 Internal definitions


5.2.1 Special variables

Special Variable: *re-local-vars*
Package

lol-re

Source

lol-re.lisp (file)


5.2.2 Symbol macros

Symbol Macro: regex
Package

lol-re.syntax

Source

syntax.lisp (file)

Expansion

(sb-int:quasiquote (if (zerop (length #s(sb-impl::comma :expr lol-re.syntax::o!-mods :kind 0))) (car #s(sb-impl::comma :expr lol-re.syntax::o!-args :kind 0)) (format nil "(?~a)~a" (remove #\latin_small_letter_g #s(sb-impl::comma :expr lol-re.syntax::o!-mods :kind 0)) (car #s(sb-impl::comma :expr lol-re.syntax::o!-args :kind 0)))))


5.2.3 Macros

Macro: clause-for-in-matches-of-using-1 &key (FOR VAR) (IN-MATCHES-OF STR) (USING MATCH) GENERATE
Package

lol-re

Source

lol-re.lisp (file)

Macro: clause-for-matching-using-2 &key (FOR VAR) (MATCHING STR) (USING MATCH) GENERATE
Package

lol-re

Source

lol-re.lisp (file)

Macro: ensure-correct-regex-spec REGEX-SPEC-VAR
Package

lol-re

Source

lol-re.lisp (file)

Macro: ifmatch (TEST STR) THEN &optional ELSE

Checks for the existence of a group-capturing regex (in /for(bar)baz/, bar is captured) in the TEST and binds the $1, $2, $n vars to the captured groups. Obviously, doesn’t work with runtime regexes.

Package

lol-re.syntax

Source

syntax.lisp (file)

Macro: match-mode-ppcre-lambda-form O!-ARGS O!-MODS
Package

lol-re.syntax

Source

syntax.lisp (file)

Macro: subst-mode-ppcre-lambda-form O!-ARGS O!-MODS
Package

lol-re.syntax

Source

syntax.lisp (file)

Macro: whenmatch (TEST STR) &body FORMS

(whenmatch (#~m/"(b)(c)(d)(e)"/ "abcdef")
  (print |$`|)
  (print $2)
  (print $4))

Package

lol-re.syntax

Source

syntax.lisp (file)

Macro: with-re-reader-context &body BODY
Package

lol-re

Source

lol-re.lisp (file)

Macro: with-scanner (SCAN-VAR REG-VAR O!-REGEX-SPEC) &body BODY
Package

lol-re

Source

lol-re.lisp (file)


5.2.4 Functions

Function: #~-reader STREAM CHAR NUMARG
Package

lol-re.syntax

Source

syntax.lisp (file)

Function: bind-regs REGISTERS
Package

lol-re

Source

lol-re.lisp (file)

Function: clear-regs REGISTERS
Package

lol-re

Source

lol-re.lisp (file)

Function: define-reg-vars REGISTERS
Package

lol-re

Source

lol-re.lisp (file)

Function: list-o-syms NAME
Package

lol-re

Source

lol-re.lisp (file)

Function: lol-re-literal-string-reader STRING-READER
Package

lol-re

Source

lol-re.lisp (file)

Function: lol-re-string-reader *STREAM* CHAR
Package

lol-re

Source

lol-re.lisp (file)

Function: mk-scan-iter-code SCANNER-CODE
Package

lol-re

Source

lol-re.lisp (file)

Function: mk-scanner-create-code REGEX-SPEC
Package

lol-re

Source

lol-re.lisp (file)

Function: mods STREAM

imsxg modifiers

Package

lol-re.syntax

Source

syntax.lisp (file)

Function: segment-reader STREAM CH N

with m'' or s''' suppress string interpolation, camel192

Package

lol-re.syntax

Source

syntax.lisp (file)

Function: string-reverse-case STR
Package

lol-re

Source

lol-re.lisp (file)


Appendix A Indexes


A.1 Concepts

Index Entry  Section

F
File, Lisp, lol-re.asd: The lol-re.asd file
File, Lisp, lol-re/lol-re.lisp: The lol-re/lol-re.lisp file
File, Lisp, lol-re/package.lisp: The lol-re/package.lisp file
File, Lisp, lol-re/syntax.lisp: The lol-re/syntax.lisp file

L
Lisp File, lol-re.asd: The lol-re.asd file
Lisp File, lol-re/lol-re.lisp: The lol-re/lol-re.lisp file
Lisp File, lol-re/package.lisp: The lol-re/package.lisp file
Lisp File, lol-re/syntax.lisp: The lol-re/syntax.lisp file
lol-re.asd: The lol-re.asd file
lol-re/lol-re.lisp: The lol-re/lol-re.lisp file
lol-re/package.lisp: The lol-re/package.lisp file
lol-re/syntax.lisp: The lol-re/syntax.lisp file

A.2 Functions

Index Entry  Section

#
#~-reader: Internal functions

B
bind-regs: Internal functions

C
clause-for-in-matches-of-using-1: Internal macros
clause-for-matching-using-2: Internal macros
clear-regs: Internal functions

D
define-reg-vars: Internal functions

E
ensure-correct-regex-spec: Internal macros

F
Function, #~-reader: Internal functions
Function, bind-regs: Internal functions
Function, clear-regs: Internal functions
Function, define-reg-vars: Internal functions
Function, list-o-syms: Internal functions
Function, lol-re-literal-string-reader: Internal functions
Function, lol-re-string-reader: Internal functions
Function, mk-scan-iter-code: Internal functions
Function, mk-scanner-create-code: Internal functions
Function, mods: Internal functions
Function, segment-reader: Internal functions
Function, string-reverse-case: Internal functions

I
ifmatch: Internal macros

L
list-o-syms: Internal functions
lol-re-literal-string-reader: Internal functions
lol-re-string-reader: Internal functions

M
Macro, clause-for-in-matches-of-using-1: Internal macros
Macro, clause-for-matching-using-2: Internal macros
Macro, ensure-correct-regex-spec: Internal macros
Macro, ifmatch: Internal macros
Macro, match-mode-ppcre-lambda-form: Internal macros
Macro, mr~: Exported macros
Macro, m~: Exported macros
Macro, re-local: Exported macros
Macro, subst-mode-ppcre-lambda-form: Internal macros
Macro, s~: Exported macros
Macro, whenmatch: Internal macros
Macro, with-re-reader-context: Internal macros
Macro, with-scanner: Internal macros
match-mode-ppcre-lambda-form: Internal macros
mk-scan-iter-code: Internal functions
mk-scanner-create-code: Internal functions
mods: Internal functions
mr~: Exported macros
m~: Exported macros

R
re-local: Exported macros

S
segment-reader: Internal functions
string-reverse-case: Internal functions
subst-mode-ppcre-lambda-form: Internal macros
s~: Exported macros

W
whenmatch: Internal macros
with-re-reader-context: Internal macros
with-scanner: Internal macros

A.3 Variables

Index Entry  Section

*
*re-local-vars*: Internal special variables

R
regex: Internal symbol macros

S
Special Variable, *re-local-vars*: Internal special variables
Symbol Macro, regex: Internal symbol macros

A.4 Data types

Index Entry  Section

L
lol-re: The lol-re system
lol-re: The lol-re package
lol-re.syntax: The lol-re.syntax package

P
Package, lol-re: The lol-re package
Package, lol-re.syntax: The lol-re.syntax package

S
System, lol-re: The lol-re system
