The cl-lexer Reference Manual

This is the cl-lexer Reference Manual, version 1.4, generated automatically by Declt version 4.0 beta 2 "William Riker" on Thu Sep 15 03:54:32 2022 GMT+0.

1 Introduction

CL-LEXER package

The CL-LEXER package implements a lexical-analyzer-generator called DEFLEXER,
which is built on top of both REGEX and CLAWK. Many of the optimizations in the
recent rewrite of the regex engine went into optimizing the sorts of patterns
generated by DEFLEXER.

The default lexer doesn't implement full greediness. If you have a rule for
ints followed by a rule for floats, the int rule will match the part before
the decimal point before the float rule gets a chance to look at it. You can
fix this by specifying :flex-compatible as the first rule. This gives all
patterns a chance to examine the text and takes the one that matches the
longest string (the first pattern wins in case of a tie). The downside of this
option is that it slows down the analyser, so if you can solve the issue by
reordering your rules, that's the way to do it.
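
The difference between the two selection strategies can be sketched in plain
Common Lisp. This is only an illustration of first-match versus longest-match
rule selection, not how cl-lexer is implemented: every name below is
hypothetical, the matchers are hand-written instead of using regex, and the
float matcher ignores exponents to keep the sketch short.

  ;; Hypothetical helpers, not part of cl-lexer. Each matcher returns the
  ;; length of its match at START, or NIL when it does not match.
  (defun digits-length (string start)
    "Number of consecutive digit characters in STRING beginning at START."
    (let ((end (or (position-if-not #'digit-char-p string :start start)
                   (length string))))
      (- end start)))

  (defun match-int (string start)
    (let ((n (digits-length string start)))
      (when (plusp n) n)))

  (defun match-float (string start)
    ;; Digits, a dot, then more digits (simplified: no exponent part).
    (let* ((whole (digits-length string start))
           (dot (+ start whole)))
      (when (and (plusp whole)
                 (< dot (length string))
                 (char= (char string dot) #\.))
        (let ((fraction (digits-length string (1+ dot))))
          (when (plusp fraction)
            (+ whole 1 fraction))))))

  ;; The int rule is listed before the float rule, as in the text above.
  (defparameter *rules*
    (list (cons 'int #'match-int) (cons 'flt #'match-float)))

  (defun first-match (rules string start)
    "Default-style selection: the first rule that matches wins."
    (loop for (name . matcher) in rules
          for len = (funcall matcher string start)
          when len do (return (values name len))))

  (defun longest-match (rules string start)
    "Flex-style selection: longest match wins; the earliest rule wins ties."
    (let ((best-name nil) (best-len 0))
      (loop for (name . matcher) in rules
            for len = (funcall matcher string start)
            when (and len (> len best-len))
              do (setf best-name name best-len len))
      (when best-name (values best-name best-len))))

  (first-match *rules* "1.5" 0)    ; => INT, 1
  (longest-match *rules* "1.5" 0)  ; => FLT, 3

Longest-match selection has to try every rule at each position, which is
consistent with the slowdown mentioned above.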

I'm currently writing an AWK->CLAWK translator using this as the lexer, and
it's working fine. As far as I can tell, the DEFLEXER-generated lexing
functions should be fast enough for production use.

Currently, the LEX/FLEX/BISON feature of switching productions on and off using
state variables is not supported, but it's a pretty simple feature to add. If
you're using CL-LEXER and discover you need this feature, let me know.

It also doesn't yet support prefix and postfix context patterns. This isn't
quite so trivial to add, but it's planned for a future release of regex, so
CL-LEXER will be getting it someday.

Anyway, here's a simple DEFLEXER example:

  (deflexer test-lexer
    ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
      (return (values 'flt (num %0))))
    ("[0-9]+"
      (return (values 'int (int %0))))
    ("[:alpha:][:alnum:]*"
      (return (values 'name %0)))
    ("[:space:]+") )

  > (setq *lex* (test-lexer "1.0 12 fred 10.23e45"))
  
 
  > (funcall *lex*)
  FLT
  1.0
 
  > (funcall *lex*)
  INT
  12
 
  > (funcall *lex*)
  NAME
  "fred"

  > (funcall *lex*)
  FLT
  1.0229999999999997E46

  > (funcall *lex*)
  NIL
  NIL

You can also write this lexer using the :flex-compatible option, in which case
you can write the int and flt rules in any order.

  (deflexer test-lexer
    :flex-compatible
    ("[0-9]+"
      (return (values 'int (int %0))))
    ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
      (return (values 'flt (num %0))))
    ("[:space:]+") )



2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 cl-lexer

cl-lexer: a lexical analyzer generator

Author

Kenneth Michael Parker

License

BSD-new

Version

1.4

Dependency

regex (system).

Source

cl-lexer.asd.

Child Components
  • cl-lexer.asd (file).
  • packages.lisp (file).
  • lexer.lisp (file).

3 Files

Files are sorted by type and then listed depth-first from the systems' component trees.


3.1 Lisp


3.1.1 cl-lexer/cl-lexer.asd

Source

cl-lexer.asd.

Parent Component

cl-lexer (system).

ASDF Systems

cl-lexer.


3.1.2 cl-lexer/packages.lisp

Source

cl-lexer.asd.

Parent Component

cl-lexer (system).

Packages

cl-lexer.


3.1.3 cl-lexer/lexer.lisp

Source

cl-lexer.asd.

Parent Component

cl-lexer (system).

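Public Interface
  • deflexer (macro).
  • int (generic function).
  • num (generic function).
  • tokenize (macro).
Internals
  • combine-patterns (function).
  • expand-tokenize (function).
  • expand-tokenize-rules (function).
  • extract-patterns-and-actions (function).
  • make-lexer-actions (function).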

4 Packages

Packages are listed by definition order.


4.1 cl-lexer

Source

packages.lisp.

Use List
  • common-lisp.
  • regex.
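Public Interface
  • deflexer (macro).
  • int (generic function).
  • num (generic function).
  • tokenize (macro).
Internals
  • combine-patterns (function).
  • expand-tokenize (function).
  • expand-tokenize-rules (function).
  • extract-patterns-and-actions (function).
  • make-lexer-actions (function).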

5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


5.1 Public Interface


5.1.1 Macros

Macro: deflexer (name &rest rules)

Create a lexical analyser. This analyser function takes a string and :position, :end, :end-token and :end-value keyword parameters, and returns a function of no arguments that returns the next token and value each time it is called, or (values end-token end-value) when the input string is exhausted.
By default, position = 0, end = length of str, and end-token and end-value = nil.

Package

cl-lexer.

Source

lexer.lisp.
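
As a minimal sketch of the calling convention described above, the following
drains a lexer into a list by passing a distinct :end-token. It assumes that
cl-lexer is loaded and that DEFLEXER, INT and %0 are accessible in the current
package, as in the introduction's example. WORD-LEXER and COLLECT-TOKENS are
hypothetical names used only for illustration.

  ;; A small lexer in the style of the introduction's example.
  (deflexer word-lexer
    ("[0-9]+"
      (return (values 'int (int %0))))
    ("[:alpha:][:alnum:]*"
      (return (values 'name %0)))
    ("[:space:]+") )

  ;; COLLECT-TOKENS is a hypothetical helper, not part of cl-lexer.
  (defun collect-tokens (string)
    "Call the lexer closure until it reports :EOF, collecting (token value)."
    (let ((next (word-lexer string :end-token :eof)))
      (loop for (token value) = (multiple-value-list (funcall next))
            until (eq token :eof)
            collect (list token value))))

  (collect-tokens "12 fred")

Using a keyword such as :eof as the end token makes the stopping condition
explicit instead of relying on the default NIL.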

Macro: tokenize (str pos-var &rest rules)
Package

cl-lexer.

Source

lexer.lisp.


5.1.2 Generic functions

Generic Function: int (x)
Package

cl-lexer.

Source

lexer.lisp.

Methods
Method: int ((x string))
Method: int ((x (eql nil)))
Method: int ((x number))
Method: int ((x integer))

Generic Function: num (x)
Package

cl-lexer.

Source

lexer.lisp.

Methods
Method: num ((x string))
Method: num ((x (eql nil)))
Method: num ((x number))

5.2 Internals


5.2.1 Ordinary functions

Function: combine-patterns (pats)
Package

cl-lexer.

Source

lexer.lisp.

Function: expand-tokenize (str pos rules)
Package

cl-lexer.

Source

lexer.lisp.

Function: expand-tokenize-rules (str pos matchedp rules)
Package

cl-lexer.

Source

lexer.lisp.

Function: extract-patterns-and-actions (rules)
Package

cl-lexer.

Source

lexer.lisp.

Function: make-lexer-actions (actions)
Package

cl-lexer.

Source

lexer.lisp.


Appendix A Indexes


A.1 Concepts


A.2 Functions

Index Entry  Section

C
combine-patterns: Private ordinary functions

D
deflexer: Public macros

E
expand-tokenize: Private ordinary functions
expand-tokenize-rules: Private ordinary functions
extract-patterns-and-actions: Private ordinary functions

F
Function, combine-patterns: Private ordinary functions
Function, expand-tokenize: Private ordinary functions
Function, expand-tokenize-rules: Private ordinary functions
Function, extract-patterns-and-actions: Private ordinary functions
Function, make-lexer-actions: Private ordinary functions

G
Generic Function, int: Public generic functions
Generic Function, num: Public generic functions

I
int: Public generic functions

M
Macro, deflexer: Public macros
Macro, tokenize: Public macros
make-lexer-actions: Private ordinary functions
Method, int: Public generic functions
Method, num: Public generic functions

N
num: Public generic functions

T
tokenize: Public macros

A.3 Variables