This is the uax-15 Reference Manual, generated automatically by Declt version 3.0 "Montgomery Scott" on Tue Dec 22 15:23:07 2020 GMT+0.
• Introduction: What uax-15 is all about
• Systems: The systems documentation
• Modules: The modules documentation
• Files: The files documentation
• Packages: The packages documentation
• Definitions: The symbols documentation
• Indexes: Concepts, functions, variables and data types
This package provides a Common Lisp Unicode normalization function supporting the NFC, NFD, NFKC and NFKD forms, as per Unicode Standard Annex #15, found at http://www.unicode.org/reports/tr15/tr15-22.html.
This is a fork of a subset of work done by Takeru Ohta in 2010. Future work is intended to provide support for https://tools.ietf.org/html/rfc8264 and https://tools.ietf.org/html/rfc7564.
This has been successfully tested on SBCL, CCL, ECL, ABCL, Allegro and CMUCL against the Unicode test file found at http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt.
CLISP still has some issues. It has not been tested against LispWorks or other Common Lisp implementations.
It has one major exported function, normalize, which takes a string and a normalization form.
The currently supported normalization forms are :nfc, :nfkc, :nfd and :nfkd.
Normalization examples, with reference to the relevant xkcd (https://www.xkcd.com/936/):
(normalize "正しい馬バッテリーステープル" :nfkc)
"正しい馬バッテリーステープル"
(normalize "الحصان الصحيح البطارية التيلة" :nfkc)
"الحصان الصحيح البطارية التيلة"
(normalize "اstáplacha ceart ceallraí capall" :nfkc)
"اstáplacha ceart ceallraí capall"
More relevant xkcd https://xkcd.com/1726/, https://xkcd.com/1953/, https://www.xkcd.com/1209/, https://xkcd.com/1137/
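As a rough illustration of why normalization matters, the sketch below builds two canonically equivalent spellings of "é" using only standard Common Lisp. It assumes an implementation whose characters cover the full Unicode range (for example SBCL or CCL). The variable names are illustrative and not part of uax-15. The commented-out calls show how normalize would be expected to reconcile the two spellings, going by the definitions of NFC and NFD.

```lisp
;; Two spellings of "é": precomposed U+00E9, and "e" (U+0065)
;; followed by U+0301 COMBINING ACUTE ACCENT.
;; *COMPOSED* and *DECOMPOSED* are illustrative names only.
(defparameter *composed* (string (code-char #xE9)))
(defparameter *decomposed*
  (coerce (list (code-char #x65) (code-char #x301)) 'string))

;; Without normalization the strings differ, although Unicode treats
;; them as canonically equivalent.
(length *composed*)                ; => 1
(length *decomposed*)              ; => 2
(string= *composed* *decomposed*)  ; => NIL

;; With uax-15 loaded (not done here), both comparisons below would be
;; expected to return true:
;; (string= (uax-15:normalize *decomposed* :nfc) *composed*)
;; (string= (uax-15:normalize *composed* :nfd) *decomposed*)
```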
The main system appears first, followed by any subsystem dependency.
• The uax-15 system
Author: Takeru Ohta, Sabra Crolleton <sabra.crolleton@gmail.com>
License: MIT
Description: Common lisp implementation of Unicode normalization functions :nfc, :nfd, :nfkc and :nfkd (Uax-15)
Source: uax-15.asd (file)
Component: src (module)
Modules are listed depth-first from the system components tree.
• The uax-15/src module
Parent: uax-15 (system)
Location: src/
Files are sorted by type and then listed depth-first from the systems components trees.
• Lisp files
uax-15.asd
  Location: uax-15.asd
  Systems: uax-15 (system)
  Internal Definitions: *string-file* (special variable)
uax-15/src/package.lisp
uax-15/src/utilities.lisp
  Dependency: package.lisp (file)
  Parent: src (module)
  Location: src/utilities.lisp
uax-15/src/trivial-utf-16.lisp
  Dependency: package.lisp (file)
  Parent: src (module)
  Location: src/trivial-utf-16.lisp
uax-15/src/precomputed-tables.lisp
  Parent: src (module)
  Location: src/precomputed-tables.lisp
uax-15/src/normalize-backend.lisp
  Parent: src (module)
  Location: src/normalize-backend.lisp
uax-15/src/uax-15.lisp
  Parent: src (module)
  Location: src/uax-15.lisp
Packages are listed by definition order.
• The uax-15-system package
• The uax-15 package
uax-15-system
  Source: uax-15.asd
  Internal Definitions: *string-file* (special variable)
uax-15
  Source: package.lisp (file)
  Use List: common-lisp
Definitions are sorted by export status, category, package, and then by lexicographic order.
• Exported definitions
• Internal definitions
• Exported functions
• Exported types
Translate a Unicode code point to its UTF-16 representation. Returns a list of one or two codepoints. Passes surrogate code points straight through.
trivial-utf-16.lisp (file)
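The translation described above can be sketched with the standard UTF-16 surrogate-pair arithmetic. This is not the library's source: the function name codepoint-to-utf-16-sketch is hypothetical, and the code only illustrates the documented behaviour (one or two values returned, surrogates passed through).

```lisp
;; Hypothetical helper, not part of uax-15.
(defun codepoint-to-utf-16-sketch (codepoint)
  "Return a list of one or two UTF-16 code units for CODEPOINT."
  (if (< codepoint #x10000)
      ;; BMP code points, including lone surrogates, pass straight through.
      (list codepoint)
      ;; Supplementary code points are split into a 10-bit high part and
      ;; a 10-bit low part, offset into the surrogate ranges.
      (let ((offset (- codepoint #x10000)))
        (list (+ #xD800 (ash offset -10))
              (+ #xDC00 (logand offset #x3FF))))))

(codepoint-to-utf-16-sketch #x41)     ; => (65)
(codepoint-to-utf-16-sketch #x1F600)  ; => (55357 56832), i.e. D83D DE00
```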
Take a vector of Unicode code points and turn it into a Lisp string.
trivial-utf-16.lisp (file)
uax-15.lisp (file)
Takes a normalization form, e.g. :nfkc, and returns a list of lists of the form (#\NO-BREAK_SPACE NIL), where the first item is the character, shown by name, and the second item has the value N, M or NIL, indicating whether the character may require renormalization.
uax-15.lisp (file)
Note no mapping for :nfkc
uax-15.lisp (file)
Base external function which calls the appropriate normalization for the normalization form. The default normalization form is :nfkc, but :nfd, :nfkd and :nfc are also available.
uax-15.lisp (file)
Translate a pair of surrogate codepoints to a non-BMP codepoint. Returns the codepoint as an integer.
trivial-utf-16.lisp (file)
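The reverse mapping, from a surrogate pair to a non-BMP code point, might look like the sketch below. The name surrogates-to-codepoint-sketch is an assumption made for illustration, and the sketch does no range checking on its arguments.

```lisp
;; Hypothetical helper, not part of uax-15.
(defun surrogates-to-codepoint-sketch (high low)
  "Combine the surrogate code units HIGH and LOW into one code point."
  ;; Each surrogate contributes 10 bits; the result is offset by #x10000.
  (+ #x10000
     (ash (- high #xD800) 10)
     (- low #xDC00)))

(surrogates-to-codepoint-sketch #xD83D #xDE00)  ; => 128512, i.e. U+1F600
```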
Take a Lisp string and turn it into a vector of Unicode code points.
trivial-utf-16.lisp (file)
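On an implementation whose characters are full Unicode code points (an assumption that holds for SBCL and CCL, for example), the two conversions between Lisp strings and code-point vectors reduce to char-code and code-char. The function names below are hypothetical. On implementations with 16-bit characters, surrogate pairs would additionally have to be combined or split.

```lisp
;; Hypothetical helpers, not part of uax-15. They assume that CHAR-CODE
;; returns a Unicode code point for every character in the string.
(defun string-to-codepoints-sketch (string)
  "Return a vector holding the code point of each character in STRING."
  (map 'vector #'char-code string))

(defun codepoints-to-string-sketch (codepoints)
  "Return a string built from the code points in CODEPOINTS."
  (map 'string #'code-char codepoints))

(string-to-codepoints-sketch "abc")         ; => #(97 98 99)
(codepoints-to-string-sketch #(97 98 99))   ; => "abc"
```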
A vector of Unicode code points.
trivial-utf-16.lisp (file)
• Internal special variables
• Internal macros
• Internal functions
• Internal generic functions
• Internal conditions
• Internal types
precomputed-tables.lisp (file)
precomputed-tables.lisp (file)
precomputed-tables.lisp (file)
precomputed-tables.lisp (file)
precomputed-tables.lisp (file)
precomputed-tables.lisp (file)
uax-15.lisp (file)
uax-15.lisp (file)
uax-15.asd
precomputed-tables.lisp (file)
utilities.lisp (file)
utilities.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
Turn a vector of UTF-16 code units into a vector of Unicode code points. Passes unpaired surrogate codepoints straight through.
trivial-utf-16.lisp (file)
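A decoding pass of the kind described above could be sketched as follows. The function name decode-utf-16-sketch is hypothetical and the code is not taken from the library. It combines each valid high/low surrogate pair into one code point and copies every other unit, including unpaired surrogates, unchanged.

```lisp
;; Hypothetical helper, not part of uax-15.
(defun decode-utf-16-sketch (units)
  "Return a vector of code points decoded from the UTF-16 code units in UNITS."
  (let ((result (make-array (length units) :fill-pointer 0))
        (n (length units))
        (i 0))
    (loop while (< i n)
          do (let ((unit (aref units i))
                   (next (when (< (1+ i) n) (aref units (1+ i)))))
               (cond ((and (<= #xD800 unit #xDBFF)
                           next
                           (<= #xDC00 next #xDFFF))
                      ;; Valid pair: combine it and consume two units.
                      (vector-push (+ #x10000
                                      (ash (- unit #xD800) 10)
                                      (- next #xDC00))
                                   result)
                      (incf i 2))
                     (t
                      ;; Anything else, unpaired surrogates included,
                      ;; is passed straight through.
                      (vector-push unit result)
                      (incf i)))))
    (coerce result 'simple-vector)))

(decode-utf-16-sketch #(#x41 #xD83D #xDE00 #xD800))
;; => #(65 128512 55296)
```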
normalize-backend.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
Turn a vector of Unicode code points into a vector of UTF-16 code units. Indifferent to unpaired surrogates.
trivial-utf-16.lisp (file)
normalize-backend.lisp (file)
utilities.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
normalize-backend.lisp (file)
Runs normalize on a single character input and returns a single character string. You must provide the normalization form (:nfd, :nfkd, :nfc, or :nfkc).
uax-15.lisp (file)
Takes a list of numbers and returns a string of characters.
utilities.lisp (file)
Parse a hex string which is a single character into a character using code-char.
utilities.lisp (file)
Parse a string which is a single character in hex to a decimal.
utilities.lisp (file)
Takes a string which may be one or more hex numbers e.g. ’0044 0307’, builds an array of characters, coerces to string and returns the string. Mostly used for testing.
utilities.lisp (file)
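The hex-string conversion described above can be approximated in standard Common Lisp as below. Both function names are hypothetical and the sketch is not the library's implementation. It assumes tokens are separated by single spaces and that code-char accepts every parsed value.

```lisp
;; Hypothetical helpers, not part of uax-15.
(defun split-on-spaces-sketch (string)
  "Return the non-empty space-separated tokens of STRING as a list."
  (loop with start = 0
        for end = (position #\Space string :start start)
        for token = (subseq string start end)
        unless (string= token "") collect token
        while end
        do (setf start (1+ end))))

(defun hex-string-to-string-sketch (hex-string)
  "Turn a string such as \"0044 0307\" into the string of those characters."
  (map 'string
       (lambda (token) (code-char (parse-integer token :radix 16)))
       (split-on-spaces-sketch hex-string)))

;; "0044 0307" is "D" followed by U+0307 COMBINING DOT ABOVE.
(length (hex-string-to-string-sketch "0044 0307"))  ; => 2
```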
utilities.lisp (file)
utilities.lisp (file)
utilities.lisp (file)
utilities.lisp (file)
error (condition)
Text message indicating what went wrong with the validation.
:message
(quote nil)
bad-char-error-message (generic function)
(setf bad-char-error-message) (generic function)
The value of the field for which the error is signalled.
:value
(quote nil)
bad-char-error-value (generic function)
(setf bad-char-error-value) (generic function)
The normalization form for which the error was signalled.
:normalization-form
(quote nil)
bad-char-error-normalization-form (generic function)
(setf bad-char-error-normalization-form) (generic function)
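A condition with the slots, initargs and accessors listed above could be declared roughly as follows. The condition name bad-char-error and the slot names are assumptions inferred from the accessor and initarg names, and the :report function and demo function are added purely for illustration.

```lisp
;; Sketch only: BAD-CHAR-ERROR and the slot names are assumed.
(define-condition bad-char-error (error)
  ((message :initarg :message
            :initform nil
            :accessor bad-char-error-message)
   (value :initarg :value
          :initform nil
          :accessor bad-char-error-value)
   (normalization-form :initarg :normalization-form
                       :initform nil
                       :accessor bad-char-error-normalization-form))
  ;; The report function is illustrative, not taken from the library.
  (:report (lambda (condition stream)
             (format stream "~@[~A ~]Value: ~S, normalization form: ~S."
                     (bad-char-error-message condition)
                     (bad-char-error-value condition)
                     (bad-char-error-normalization-form condition)))))

;; Usage: signal the condition and read its slots in a handler.
(defun demo-bad-char-error ()
  (handler-case
      (error 'bad-char-error :message "Illegal character."
                             :value (code-char #xA0)
                             :normalization-form :nfkc)
    (bad-char-error (c)
      (list (bad-char-error-message c)
            (bad-char-error-normalization-form c)))))

(demo-bad-char-error)  ; => ("Illegal character." :NFKC)
```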
A Unicode High Surrogate.
trivial-utf-16.lisp (file)
A Unicode Low Surrogate.
trivial-utf-16.lisp (file)
A Unicode code point.
trivial-utf-16.lisp (file)
• Concept index
• Function index
• Variable index
• Data type index