The uax-15 Reference Manual

This is the uax-15 Reference Manual, version 0.1.3, generated automatically by Declt version 4.0 beta 2 "William Riker" on Sun Dec 15 07:59:03 2024 GMT+0.

1 Introduction


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 uax-15

Common Lisp implementation of the Unicode normalization forms :nfc, :nfd, :nfkc and :nfkd (UAX #15).

Author

Takeru Ohta, Sabra Crolleton

License

MIT

Version

0.1.3

Dependencies

  • split-sequence (system).
  • cl-ppcre (system).

Source

uax-15.asd.

Child Component

src (module).


3 Modules

Modules are listed depth-first from the system components tree.


3.1 uax-15/src

Source

uax-15.asd.

Parent Component

uax-15 (system).

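Child Components

  • package.lisp (file).
  • utilities.lisp (file).
  • trivial-utf-16.lisp (file).
  • precomputed-tables.lisp (file).
  • normalize-backend.lisp (file).
  • uax-15.lisp (file).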

4 Files

Files are sorted by type and then listed depth-first from the systems components trees.


4.1 Lisp


4.1.1 uax-15/uax-15.asd

Source

uax-15.asd.

Parent Component

uax-15 (system).

ASDF Systems

uax-15.


4.1.2 uax-15/src/package.lisp

Source

uax-15.asd.

Parent Component

src (module).

Packages

uax-15.


4.1.3 uax-15/src/utilities.lisp

Dependency

package.lisp (file).

Source

uax-15.asd.

Parent Component

src (module).

Public Interface

print-object (method).

Internals

4.1.4 uax-15/src/trivial-utf-16.lisp

Dependency

package.lisp (file).

Source

uax-15.asd.

Parent Component

src (module).

Public Interface
Internals

4.1.5 uax-15/src/precomputed-tables.lisp

Dependencies
Source

uax-15.asd.

Parent Component

src (module).

Internals

4.1.6 uax-15/src/normalize-backend.lisp

Dependencies
Source

uax-15.asd.

Parent Component

src (module).

Internals

4.1.7 uax-15/src/uax-15.lisp

Dependencies
Source

uax-15.asd.

Parent Component

src (module).

Public Interface
Internals

5 Packages

Packages are listed by definition order.


5.1 uax-15

Source

package.lisp.

Use List

common-lisp.

Public Interface
Internals

6 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


6.1 Public Interface


6.1.1 Ordinary functions

Function: codepoint-as-utf-16 (codepoint)

Translate a Unicode code point to its UTF-16 representation. Returns a list of one or two UTF-16 code units. Passes surrogate code points straight through.

Package

uax-15.

Source

trivial-utf-16.lisp.
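
The arithmetic behind this translation can be sketched as follows. This is a minimal stand-alone illustration of standard UTF-16 encoding, not the library's own definition; the name sketch-codepoint-as-utf-16 is hypothetical.

```lisp
;; Hypothetical helper illustrating UTF-16 encoding of one code point.
;; It is not part of uax-15.
(defun sketch-codepoint-as-utf-16 (codepoint)
  "Return a list of one or two UTF-16 code units for CODEPOINT."
  (if (< codepoint #x10000)
      ;; BMP code points, including lone surrogates, pass through unchanged.
      (list codepoint)
      ;; Supplementary planes: subtract #x10000, then split the remaining
      ;; 20 bits into a high and a low surrogate carrying 10 bits each.
      (let ((offset (- codepoint #x10000)))
        (list (logior #xD800 (ash offset -10))
              (logior #xDC00 (logand offset #x3FF))))))

(sketch-codepoint-as-utf-16 #x41)     ; => (65)
(sketch-codepoint-as-utf-16 #x1F600)  ; => (55357 56832), i.e. D83D DE00
```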

Function: from-unicode-string (unicode-string)

Take a vector of Unicode code points and turn it into a Lisp string.

Package

uax-15.

Source

trivial-utf-16.lisp.

Function: get-canonical-combining-class-map ()
Package

uax-15.

Source

uax-15.lisp.

Function: get-illegal-char-list (normalization-form)

Takes a normalization form, e.g. :nfkc, and returns a list of lists of the form (#\NO-BREAK_SPACE NIL), where the first item is the character (printed by name) and the second item is N, M or NIL, indicating whether the character may require renormalization.

Package

uax-15.

Source

uax-15.lisp.

Function: get-mapping (normalization-form)

Note: there is no mapping for :nfkc.

Package

uax-15.

Source

uax-15.lisp.

Function: normalize (str normalization-form &key rfc)

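Base external function, which calls the appropriate normalization function for the given normalization form. The docstring gives :nfkc as the default normalization form, but per the signature above normalization-form is a required argument; :nfd, :nfkd and :nfc are also available.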

Package

uax-15.

Source

uax-15.lisp.
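
A call might look like the sketch below. It assumes Quicklisp is installed and can load the uax-15 system, and that the implementation's characters cover all of Unicode (as in SBCL). The rfc keyword is left at its default because this manual does not document it.

```lisp
;; Assumption: Quicklisp is available and knows the uax-15 system.
(ql:quickload :uax-15)

;; Build "e" followed by U+0301 COMBINING ACUTE ACCENT from code points,
;; so the example does not depend on the encoding of this source text.
(defparameter *decomposed*
  (coerce (list #\e (code-char #x301)) 'string))

;; NFC is defined by UAX #15 to compose this pair into U+00E9, so the
;; result is expected to be a single character.
(defparameter *composed* (uax-15:normalize *decomposed* :nfc))

(length *decomposed*)               ; => 2
(char-code (char *composed* 0))     ; expected: 233 (#xE9)
```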

Function: surrogates-to-codepoint (high-surrogate low-surrogate)

Translate a pair of surrogate codepoints to a non-BMP codepoint. Returns the codepoint as an integer.

Package

uax-15.

Source

trivial-utf-16.lisp.
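
The underlying calculation is the inverse of UTF-16 encoding and can be sketched as follows. The name sketch-surrogates-to-codepoint is hypothetical, and the sketch does no type checking, which the library may well do.

```lisp
;; Hypothetical helper illustrating how a surrogate pair maps to a
;; supplementary code point. It is not part of uax-15.
(defun sketch-surrogates-to-codepoint (high-surrogate low-surrogate)
  "Combine HIGH-SURROGATE and LOW-SURROGATE into one code point."
  (+ #x10000
     ;; The high surrogate carries the upper 10 bits of the offset.
     (ash (- high-surrogate #xD800) 10)
     ;; The low surrogate carries the lower 10 bits.
     (- low-surrogate #xDC00)))

(sketch-surrogates-to-codepoint #xD83D #xDE00)  ; => 128512, i.e. #x1F600
```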

Function: to-unicode-string (lisp-string)

Take a Lisp string and turn it into a vector of Unicode code points.

Package

uax-15.

Source

trivial-utf-16.lisp.
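
On an implementation whose characters cover all of Unicode, the round trip between strings and code point vectors reduces to char-code and code-char, as in this sketch. The names are hypothetical; the library's own functions presumably also handle implementations with UTF-16 strings, which this sketch does not.

```lisp
;; Hypothetical helpers, assuming CHAR-CODE returns full Unicode code
;; points (true for SBCL, for example). They are not part of uax-15.
(defun sketch-to-code-points (lisp-string)
  "Return a vector holding the code point of each character."
  (map 'vector #'char-code lisp-string))

(defun sketch-from-code-points (code-points)
  "Return a string built from a sequence of code points."
  (map 'string #'code-char code-points))

(sketch-to-code-points "abc")            ; => #(97 98 99)
(sketch-from-code-points #(97 98 99))    ; => "abc"
```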

Function: unicode-letter-p (char)

Returns T if the character is a Unicode character in one of the letter categories: uppercase, lowercase, titlecase, modifier, or other.

Package

uax-15.

Source

uax-15.lisp.


6.1.2 Standalone methods

Method: print-object ((object bad-char-error) stream)
Source

utilities.lisp.


6.1.3 Types

Type: unicode-string ()

A vector of Unicode code points.

Package

uax-15.

Source

trivial-utf-16.lisp.


6.2 Internals


6.2.1 Special variables

Special Variable: *canonical-combining-class*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *canonical-comp-map*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *canonical-decomp-map*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *compatible-decomp-map*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *composition-exclusions-data*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *data-directory*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *derived-normalization-props-data*
Package

uax-15.

Source

uax-15.lisp.

Special Variable: *derived-normalization-props-data-file*
Package

uax-15.

Source

uax-15.lisp.

Special Variable: *unicode-data*
Package

uax-15.

Source

precomputed-tables.lisp.

Special Variable: *unicode-letters*
Package

uax-15.

Source

precomputed-tables.lisp.


6.2.2 Macros

Macro: nconcf (list1 list2)
Package

uax-15.

Source

utilities.lisp.


6.2.3 Ordinary functions

Function: bad-char-error (message &key value normalization-form)
Package

uax-15.

Source

utilities.lisp.

Function: canonical-ordering (decomposed-string)
Package

uax-15.

Source

normalize-backend.lisp.

Function: char-from-hexstring (hexpoint-str)

Translates the first char from a *unicode-data* hex code point string to a Lisp character.

Package

uax-15.

Source

trivial-utf-16.lisp.

Function: codepoint-to-unicode-point (int)

Translates an integer to a unicode-point.

Package

uax-15.

Source

trivial-utf-16.lisp.

Function: compose (decomposed-string)
Package

uax-15.

Source

normalize-backend.lisp.

Function: compose-hangul (str)
Package

uax-15.

Source

normalize-backend.lisp.

Function: decode-utf-16 (utf-16-string)

Turn a vector of UTF-16 code units into a vector of Unicode code points. Passes unpaired surrogate codepoints straight through.

Package

uax-15.

Source

trivial-utf-16.lisp.
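
The decoding described here can be sketched as a single pass over the code units. This is a stand-alone illustration under the stated pass-through rule for unpaired surrogates; the name sketch-decode-utf-16 is hypothetical and the library's implementation may differ.

```lisp
;; Hypothetical decoder, not part of uax-15.
(defun sketch-decode-utf-16 (code-units)
  "Return a simple vector of code points decoded from the vector CODE-UNITS."
  (let ((result (make-array (length code-units) :fill-pointer 0))
        (i 0)
        (n (length code-units)))
    (loop while (< i n)
          do (let ((unit (aref code-units i))
                   (next (when (< (1+ i) n) (aref code-units (1+ i)))))
               (cond ((and (<= #xD800 unit #xDBFF)
                           next
                           (<= #xDC00 next #xDFFF))
                      ;; A valid high/low pair becomes one code point.
                      (vector-push (+ #x10000
                                      (ash (- unit #xD800) 10)
                                      (- next #xDC00))
                                   result)
                      (incf i 2))
                     (t
                      ;; BMP units and unpaired surrogates pass through.
                      (vector-push unit result)
                      (incf i)))))
    (coerce result 'simple-vector)))

(sketch-decode-utf-16 #(#x41 #xD83D #xDE00))  ; => #(65 128512)
```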

Function: decompose (s type)
Package

uax-15.

Source

normalize-backend.lisp.

Function: decompose-char (char &optional type)
Package

uax-15.

Source

normalize-backend.lisp.

Function: decompose-hangul-char (ch)
Package

uax-15.

Source

normalize-backend.lisp.

Function: encode-utf-16 (unicode-string)

Turn a vector of Unicode code points into a vector of UTF-16 code units. Indifferent to unpaired surrogates.

Package

uax-15.

Source

trivial-utf-16.lisp.

Function: get-canonical-combining-class (ch)
Package

uax-15.

Source

normalize-backend.lisp.

Function: int-to-hex-string (int)
Package

uax-15.

Source

utilities.lisp.

Function: nfc (s)
Package

uax-15.

Source

normalize-backend.lisp.

Function: nfd (s)
Package

uax-15.

Source

normalize-backend.lisp.

Function: nfkc (s)
Package

uax-15.

Source

normalize-backend.lisp.

Function: nfkd (s)
Package

uax-15.

Source

normalize-backend.lisp.

Function: normalize-char (chr normalization-form)

Runs normalize on a single character input and returns a single character string. You must provide the normalization form (:nfd, :nfkd, :nfc, or :nfkc).

Package

uax-15.

Source

uax-15.lisp.

Function: parse-hex-list-to-string (lst)

Takes a list of numbers and returns a string of characters.

Package

uax-15.

Source

utilities.lisp.

Function: parse-hex-string-to-char (str)

Parse a hex string which is a single character into a character using code-char.

Package

uax-15.

Source

utilities.lisp.

Function: parse-hex-string-to-int (str)

Parse a hex string representing a single character into an integer.

Package

uax-15.

Source

utilities.lisp.

Function: parse-hex-string-to-string (str)

Takes a string which may be one or more hex numbers e.g. ’0044 0307’, builds an array of characters, coerces to string and returns the string. Mostly used for testing.

Package

uax-15.

Source

utilities.lisp.
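
The conversion described above can be sketched with standard Common Lisp only. The name sketch-parse-hex-string-to-string is hypothetical; the library depends on split-sequence and may tokenize differently. The sketch assumes the fields are separated by spaces and that code-char can represent every code point given.

```lisp
;; Hypothetical helper, not part of uax-15.
(defun sketch-parse-hex-string-to-string (str)
  "Turn a string of space-separated hex code points into a Lisp string."
  (let ((chars '())
        (start 0)
        (end (length str)))
    (loop while (< start end)
          do (let ((space (or (position #\Space str :start start) end)))
               ;; Skip empty tokens produced by repeated spaces.
               (when (< start space)
                 (push (code-char (parse-integer str :start start
                                                     :end space
                                                     :radix 16))
                       chars))
               (setf start (1+ space))))
    (coerce (nreverse chars) 'string)))

;; "0044 0307" is LATIN CAPITAL LETTER D followed by COMBINING DOT ABOVE.
(length (sketch-parse-hex-string-to-string "0044 0307"))  ; => 2
```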

Function: unicode-point-p (p)
Package

uax-15.

Source

trivial-utf-16.lisp.


6.2.4 Generic functions

Generic Reader: bad-char-error-message (condition)
Generic Writer: (setf bad-char-error-message) (condition)
Package

uax-15.

Methods
Reader Method: bad-char-error-message ((condition bad-char-error))
Writer Method: (setf bad-char-error-message) ((condition bad-char-error))
Source

utilities.lisp.

Target Slot

message.

Generic Reader: bad-char-error-normalization-form (condition)
Generic Writer: (setf bad-char-error-normalization-form) (condition)
Package

uax-15.

Methods
Reader Method: bad-char-error-normalization-form ((condition bad-char-error))
Writer Method: (setf bad-char-error-normalization-form) ((condition bad-char-error))
Source

utilities.lisp.

Target Slot

normalization-form.

Generic Reader: bad-char-error-value (condition)
Generic Writer: (setf bad-char-error-value) (condition)
Package

uax-15.

Methods
Reader Method: bad-char-error-value ((condition bad-char-error))
Writer Method: (setf bad-char-error-value) ((condition bad-char-error))
Source

utilities.lisp.

Target Slot

value.


6.2.5 Conditions

Condition: bad-char-error
Package

uax-15.

Source

utilities.lisp.

Direct superclasses

error.

Direct methods
Direct slots
Slot: message

Text message indicating what went wrong with the validation.

Initform

(quote nil)

Initargs

:message

Readers

bad-char-error-message.

Writers

(setf bad-char-error-message).

Slot: value

The value of the field for which the error is signalled.

Initform

(quote nil)

Initargs

:value

Readers

bad-char-error-value.

Writers

(setf bad-char-error-value).

Slot: normalization-form

The normalization form for which the error was signalled.

Initform

(quote nil)

Initargs

:normalization-form

Readers

bad-char-error-normalization-form.

Writers

(setf bad-char-error-normalization-form).
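
How a condition with these three slots might be defined and handled is sketched below. Every name prefixed with sketch- is hypothetical, and the message text is invented for illustration. The manual lists a print-object method for the real condition; this sketch uses the :report option instead.

```lisp
;; Hypothetical condition mirroring the three documented slots.
;; It is not the library's own definition.
(define-condition sketch-bad-char-error (error)
  ((message :initarg :message :initform nil
            :accessor sketch-bad-char-error-message)
   (value :initarg :value :initform nil
          :accessor sketch-bad-char-error-value)
   (normalization-form :initarg :normalization-form :initform nil
                       :accessor sketch-bad-char-error-normalization-form))
  (:report (lambda (condition stream)
             (format stream "~A: ~S is not allowed in ~S"
                     (sketch-bad-char-error-message condition)
                     (sketch-bad-char-error-value condition)
                     (sketch-bad-char-error-normalization-form condition)))))

(defun sketch-describe-failure (char form)
  "Signal the sketch condition and return its slot values as a list."
  (handler-case
      (error 'sketch-bad-char-error
             :message "Illegal character"
             :value char
             :normalization-form form)
    (sketch-bad-char-error (condition)
      (list (sketch-bad-char-error-message condition)
            (sketch-bad-char-error-value condition)
            (sketch-bad-char-error-normalization-form condition)))))

(sketch-describe-failure #\a :nfkc)  ; => ("Illegal character" #\a :NFKC)
```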


6.2.6 Types

Type: high-surrogate ()

A Unicode High Surrogate.

Package

uax-15.

Source

trivial-utf-16.lisp.

Type: low-surrogate ()

A Unicode Low Surrogate.

Package

uax-15.

Source

trivial-utf-16.lisp.

Type: unicode-point ()

A Unicode code point.

Package

uax-15.

Source

trivial-utf-16.lisp.


Appendix A Indexes


A.1 Concepts


A.2 Functions

Index Entry  Section

(
(setf bad-char-error-message): Private generic functions
(setf bad-char-error-normalization-form): Private generic functions
(setf bad-char-error-value): Private generic functions

B
bad-char-error: Private ordinary functions
bad-char-error-message: Private generic functions
bad-char-error-normalization-form: Private generic functions
bad-char-error-value: Private generic functions

C
canonical-ordering: Private ordinary functions
char-from-hexstring: Private ordinary functions
codepoint-as-utf-16: Public ordinary functions
codepoint-to-unicode-point: Private ordinary functions
compose: Private ordinary functions
compose-hangul: Private ordinary functions

D
decode-utf-16: Private ordinary functions
decompose: Private ordinary functions
decompose-char: Private ordinary functions
decompose-hangul-char: Private ordinary functions

E
encode-utf-16: Private ordinary functions

F
from-unicode-string: Public ordinary functions
Function, bad-char-error: Private ordinary functions
Function, canonical-ordering: Private ordinary functions
Function, char-from-hexstring: Private ordinary functions
Function, codepoint-as-utf-16: Public ordinary functions
Function, codepoint-to-unicode-point: Private ordinary functions
Function, compose: Private ordinary functions
Function, compose-hangul: Private ordinary functions
Function, decode-utf-16: Private ordinary functions
Function, decompose: Private ordinary functions
Function, decompose-char: Private ordinary functions
Function, decompose-hangul-char: Private ordinary functions
Function, encode-utf-16: Private ordinary functions
Function, from-unicode-string: Public ordinary functions
Function, get-canonical-combining-class: Private ordinary functions
Function, get-canonical-combining-class-map: Public ordinary functions
Function, get-illegal-char-list: Public ordinary functions
Function, get-mapping: Public ordinary functions
Function, int-to-hex-string: Private ordinary functions
Function, nfc: Private ordinary functions
Function, nfd: Private ordinary functions
Function, nfkc: Private ordinary functions
Function, nfkd: Private ordinary functions
Function, normalize: Public ordinary functions
Function, normalize-char: Private ordinary functions
Function, parse-hex-list-to-string: Private ordinary functions
Function, parse-hex-string-to-char: Private ordinary functions
Function, parse-hex-string-to-int: Private ordinary functions
Function, parse-hex-string-to-string: Private ordinary functions
Function, surrogates-to-codepoint: Public ordinary functions
Function, to-unicode-string: Public ordinary functions
Function, unicode-letter-p: Public ordinary functions
Function, unicode-point-p: Private ordinary functions

G
Generic Function, (setf bad-char-error-message): Private generic functions
Generic Function, (setf bad-char-error-normalization-form): Private generic functions
Generic Function, (setf bad-char-error-value): Private generic functions
Generic Function, bad-char-error-message: Private generic functions
Generic Function, bad-char-error-normalization-form: Private generic functions
Generic Function, bad-char-error-value: Private generic functions
get-canonical-combining-class: Private ordinary functions
get-canonical-combining-class-map: Public ordinary functions
get-illegal-char-list: Public ordinary functions
get-mapping: Public ordinary functions

I
int-to-hex-string: Private ordinary functions

M
Macro, nconcf: Private macros
Method, (setf bad-char-error-message): Private generic functions
Method, (setf bad-char-error-normalization-form): Private generic functions
Method, (setf bad-char-error-value): Private generic functions
Method, bad-char-error-message: Private generic functions
Method, bad-char-error-normalization-form: Private generic functions
Method, bad-char-error-value: Private generic functions
Method, print-object: Public standalone methods

N
nconcf: Private macros
nfc: Private ordinary functions
nfd: Private ordinary functions
nfkc: Private ordinary functions
nfkd: Private ordinary functions
normalize: Public ordinary functions
normalize-char: Private ordinary functions

P
parse-hex-list-to-string: Private ordinary functions
parse-hex-string-to-char: Private ordinary functions
parse-hex-string-to-int: Private ordinary functions
parse-hex-string-to-string: Private ordinary functions
print-object: Public standalone methods

S
surrogates-to-codepoint: Public ordinary functions

T
to-unicode-string: Public ordinary functions

U
unicode-letter-p: Public ordinary functions
unicode-point-p: Private ordinary functions


A.3 Variables

Index Entry  Section

*
*canonical-combining-class*: Private special variables
*canonical-comp-map*: Private special variables
*canonical-decomp-map*: Private special variables
*compatible-decomp-map*: Private special variables
*composition-exclusions-data*: Private special variables
*data-directory*: Private special variables
*derived-normalization-props-data*: Private special variables
*derived-normalization-props-data-file*: Private special variables
*unicode-data*: Private special variables
*unicode-letters*: Private special variables

M
message: Private conditions

N
normalization-form: Private conditions

S
Slot, message: Private conditions
Slot, normalization-form: Private conditions
Slot, value: Private conditions
Special Variable, *canonical-combining-class*: Private special variables
Special Variable, *canonical-comp-map*: Private special variables
Special Variable, *canonical-decomp-map*: Private special variables
Special Variable, *compatible-decomp-map*: Private special variables
Special Variable, *composition-exclusions-data*: Private special variables
Special Variable, *data-directory*: Private special variables
Special Variable, *derived-normalization-props-data*: Private special variables
Special Variable, *derived-normalization-props-data-file*: Private special variables
Special Variable, *unicode-data*: Private special variables
Special Variable, *unicode-letters*: Private special variables

V
value: Private conditions


A.4 Data types

Index Entry  Section

B
bad-char-error: Private conditions

C
Condition, bad-char-error: Private conditions

F
File, normalize-backend.lisp: The uax-15/src/normalize-backend.lisp file
File, package.lisp: The uax-15/src/package.lisp file
File, precomputed-tables.lisp: The uax-15/src/precomputed-tables.lisp file
File, trivial-utf-16.lisp: The uax-15/src/trivial-utf-16.lisp file
File, uax-15.asd: The uax-15/uax-15.asd file
File, uax-15.lisp: The uax-15/src/uax-15.lisp file
File, utilities.lisp: The uax-15/src/utilities.lisp file

H
high-surrogate: Private types

L
low-surrogate: Private types

M
Module, src: The uax-15/src module

N
normalize-backend.lisp: The uax-15/src/normalize-backend.lisp file

P
Package, uax-15: The uax-15 package
package.lisp: The uax-15/src/package.lisp file
precomputed-tables.lisp: The uax-15/src/precomputed-tables.lisp file

S
src: The uax-15/src module
System, uax-15: The uax-15 system

T
trivial-utf-16.lisp: The uax-15/src/trivial-utf-16.lisp file
Type, high-surrogate: Private types
Type, low-surrogate: Private types
Type, unicode-point: Private types
Type, unicode-string: Public types

U
uax-15: The uax-15 system
uax-15: The uax-15 package
uax-15.asd: The uax-15/uax-15.asd file
uax-15.lisp: The uax-15/src/uax-15.lisp file
unicode-point: Private types
unicode-string: Public types
utilities.lisp: The uax-15/src/utilities.lisp file