The trivial-utf-8 Reference Manual

This is the trivial-utf-8 Reference Manual, generated automatically by Declt version 3.0 "Montgomery Scott" on Tue Dec 22 15:21:54 2020 GMT+0.


1 Introduction

# Trivial UTF-8 Manual

###### \[in package TRIVIAL-UTF-8\]
## trivial-utf-8 ASDF System Details

- Description: A small library for doing UTF-8-based input and output.
- Licence: ZLIB
- Author: Marijn Haverbeke 
- Maintainer: Gábor Melis 
- Homepage: [https://common-lisp.net/project/trivial-utf-8/](https://common-lisp.net/project/trivial-utf-8/)
- Bug tracker: [https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8/-/issues](https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8/-/issues)
- Source control: [GIT](https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8.git)

## Introduction

Trivial UTF-8 is a small library for doing UTF-8-based input and
output on a Lisp implementation that already supports Unicode,
meaning CHAR-CODE and CODE-CHAR deal with Unicode character codes.

The rationale for this library is that while Unicode-enabled
implementations usually do provide some kind of interface for
dealing with character encodings, these interfaces are typically
neither very flexible nor uniform across implementations.

The [Babel][babel] library solves a similar problem while
understanding more encodings. Trivial UTF-8 was written before Babel
existed, but for new projects you might be better off going with
Babel. The one plus that Trivial UTF-8 has is that it doesn't depend
on any other libraries.

[babel]: https://common-lisp.net/project/babel/ 


## Links

Here is the [official repository][trivial-utf-8-repo] and the
[HTML documentation][trivial-utf-8-doc] for the latest version.

[trivial-utf-8-repo]: https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8 

[trivial-utf-8-doc]: http://melisgl.github.io/mgl-pax-world/trivial-utf-8-manual.html 


## Reference

- [function] UTF-8-BYTE-LENGTH STRING

    Calculate the number of bytes needed to encode STRING as UTF-8.

- [function] STRING-TO-UTF-8-BYTES STRING &KEY NULL-TERMINATE

    Convert STRING into an array of unsigned bytes containing its UTF-8
    representation. If NULL-TERMINATE, add an extra 0 byte at the end.

- [function] WRITE-UTF-8-BYTES STRING BYTE-STREAM &KEY NULL-TERMINATE

    Write STRING to BYTE-STREAM, encoding it as UTF-8. If NULL-TERMINATE,
    write an extra 0 byte at the end.

- [function] UTF-8-GROUP-SIZE BYTE

    Determine the number of bytes that are part of the character whose
    encoding starts with BYTE. May signal UTF-8-DECODING-ERROR.

- [function] UTF-8-BYTES-TO-STRING BYTES &KEY (START 0) (END (LENGTH BYTES))

    Convert the subsequence of the array of BYTES bounded by START and
    END, which contains UTF-8 encoded characters, to a STRING. The
    element type of BYTES may be anything as long as it can be
    `COERCE`d into an `(UNSIGNED-BYTE 8)` array. May signal
    UTF-8-DECODING-ERROR.

- [function] READ-UTF-8-STRING INPUT &KEY NULL-TERMINATED STOP-AT-EOF (CHAR-LENGTH -1) (BYTE-LENGTH -1)

    Read UTF-8 encoded data from INPUT, a byte stream, and construct a
    string with the characters found. When NULL-TERMINATED is given,
    stop reading at a null character. If STOP-AT-EOF, then stop at
    END-OF-FILE without raising an error. The CHAR-LENGTH and
    BYTE-LENGTH parameters can be used to specify the maximum number
    of characters or bytes to read, where -1 means no limit. May
    signal UTF-8-DECODING-ERROR.

- [condition] UTF-8-DECODING-ERROR SIMPLE-ERROR
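
The encoding and decoding functions above can be combined into a simple round trip. The following is a minimal sketch, assuming ASDF is available and can locate the system; the variable names are illustrative and not part of the library. The non-ASCII character is built with CODE-CHAR so the example does not depend on the encoding of the source file.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; "naive" with U+00EF (i with diaeresis) as the third character.
(defparameter *sample* (format nil "na~Cve" (code-char #xEF)))

;; Encode: 5 characters become 6 bytes, since U+00EF needs two bytes.
(defparameter *encoded* (trivial-utf-8:string-to-utf-8-bytes *sample*))

;; Decode the bytes back into a string.
(defparameter *decoded* (trivial-utf-8:utf-8-bytes-to-string *encoded*))
```

After evaluation, `*decoded*` should be `STRING=` to `*sample*`.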

* * *
###### \[generated by [MGL-PAX](https://github.com/melisgl/mgl-pax)\]


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 trivial-utf-8

Maintainer

Gábor Melis <mega@retes.hu>

Author

Marijn Haverbeke <marijnh@gmail.com>

Home Page

https://common-lisp.net/project/trivial-utf-8/

Source Control

(:git "https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8.git")

Bug Tracker

https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8/-/issues

License

ZLIB

Description

A small library for doing UTF-8-based input and output.

Source

trivial-utf-8.asd (file)

Component

trivial-utf-8.lisp (file)
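
Loading the system might look like the following sketch. It assumes ASDF is present in the Lisp image and that the system definition file is somewhere ASDF searches; neither is stated in this manual.

```lisp
;; Make sure ASDF itself is loaded (already the case in many images).
(require "asdf")

;; Load the trivial-utf-8 system described above.
(asdf:load-system "trivial-utf-8")

;; The package defined by trivial-utf-8.lisp should now exist.
(find-package "TRIVIAL-UTF-8")
```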


3 Files

Files are sorted by type and then listed depth-first from the systems' component trees.


3.1 Lisp


3.1.1 trivial-utf-8.asd

Location

trivial-utf-8.asd

Systems

trivial-utf-8 (system)


3.1.2 trivial-utf-8/trivial-utf-8.lisp

Parent

trivial-utf-8 (system)

Location

trivial-utf-8.lisp

Packages

trivial-utf-8

Exported Definitions
Internal Definitions

4 Packages

Packages are listed by definition order.


4.1 trivial-utf-8

Source

trivial-utf-8.lisp (file)

Use List

common-lisp

Exported Definitions
Internal Definitions

5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


5.1 Exported definitions


5.1.1 Functions

Function: read-utf-8-string INPUT &key NULL-TERMINATED STOP-AT-EOF CHAR-LENGTH BYTE-LENGTH

Read UTF-8 encoded data from INPUT, a byte stream, and construct a string with the characters found. When NULL-TERMINATED is given, stop reading at a null character. If STOP-AT-EOF, then stop at END-OF-FILE without raising an error. The CHAR-LENGTH and BYTE-LENGTH parameters can be used to specify the maximum number of characters or bytes to read, where -1 means no limit. May signal UTF-8-DECODING-ERROR.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
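
The CHAR-LENGTH limit and STOP-AT-EOF can be illustrated with two successive reads from one byte stream. This is a sketch under assumptions: ASDF and UIOP (bundled with ASDF) are available, a temporary file stands in for the byte stream, and `read-prefix-demo` is a hypothetical helper name.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

(defun read-prefix-demo ()
  "Return a list of two strings read in sequence from one byte stream."
  (uiop:with-temporary-file (:pathname path)
    ;; Store the UTF-8 bytes of "hello world" with U+00E9 and U+00F6
    ;; in place of the plain e and the plain o of "world".
    (with-open-file (out path :direction :output
                              :if-exists :supersede
                              :element-type '(unsigned-byte 8))
      (write-sequence
       (trivial-utf-8:string-to-utf-8-bytes
        (format nil "h~Cllo w~Crld" (code-char #xE9) (code-char #xF6)))
       out))
    (with-open-file (in path :element-type '(unsigned-byte 8))
      (list
       ;; Stop after five characters (six bytes in this case).
       (trivial-utf-8:read-utf-8-string in :char-length 5)
       ;; Read the rest; without STOP-AT-EOF, reaching the end of
       ;; the file would signal an error.
       (trivial-utf-8:read-utf-8-string in :stop-at-eof t)))))
```

The first string should contain the first five characters and the second the remaining six, starting with the space.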

Function: string-to-utf-8-bytes STRING &key NULL-TERMINATE

Convert STRING into an array of unsigned bytes containing its UTF-8 representation. If NULL-TERMINATE, add an extra 0 byte at the end.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
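
The effect of NULL-TERMINATE can be seen on a short ASCII string. A minimal sketch, assuming ASDF can locate the system; the variable names are illustrative.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; Without the keyword: one byte per ASCII character.
(defparameter *plain-bytes*
  (trivial-utf-8:string-to-utf-8-bytes "abc"))

;; With NULL-TERMINATE: the same bytes followed by a single 0 byte,
;; as expected by many C-style interfaces.
(defparameter *terminated-bytes*
  (trivial-utf-8:string-to-utf-8-bytes "abc" :null-terminate t))
```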

Function: utf-8-byte-length STRING

Calculate the number of bytes needed to encode STRING as UTF-8.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
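
The byte length differs from the character count as soon as the string contains non-ASCII characters. A small sketch, assuming ASDF can locate the system and that the implementation's CODE-CHAR covers the code points used; `*mixed*` is an illustrative name.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; One character each that needs 1, 2 and 3 bytes in UTF-8:
;; U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign).
(defparameter *mixed*
  (coerce (list #\A (code-char #xE9) (code-char #x20AC)) 'string))

(defparameter *char-count* (length *mixed*))
(defparameter *byte-count* (trivial-utf-8:utf-8-byte-length *mixed*))
```

Here `*char-count*` is 3 while `*byte-count*` is 1 + 2 + 3 = 6.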

Function: utf-8-bytes-to-string BYTES &key START END

Convert the subsequence of the array of BYTES bounded by START and END, which contains UTF-8 encoded characters, to a STRING. The element type of BYTES may be anything as long as it can be ‘COERCE’d into an ‘(UNSIGNED-BYTE 8)’ array. May signal UTF-8-DECODING-ERROR.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
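
START and END select a byte range, so they should fall on character boundaries. The sketch below decodes slices of a general vector (not specialized to 8-bit bytes), assuming ASDF can locate the system; the variable names are illustrative.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; UTF-8 bytes of "naive" with U+00EF as the third character.
;; Bytes 2 and 3 (195 175) together encode that one character.
(defparameter *raw-bytes* (vector 110 97 195 175 118 101))

;; Bytes 0 and 1 only.
(defparameter *first-two*
  (trivial-utf-8:utf-8-bytes-to-string *raw-bytes* :start 0 :end 2))

;; From byte 4 to the end.
(defparameter *last-two*
  (trivial-utf-8:utf-8-bytes-to-string *raw-bytes* :start 4))

;; The whole array: 6 bytes decode to 5 characters.
(defparameter *whole*
  (trivial-utf-8:utf-8-bytes-to-string *raw-bytes*))
```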

Function: utf-8-group-size BYTE

Determine the number of bytes that are part of the character whose encoding starts with BYTE. May signal UTF-8-DECODING-ERROR.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
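
In UTF-8 the high bits of the first byte determine how many bytes the character occupies. A brief sketch with one lead byte of each size, assuming ASDF can locate the system; `*group-sizes*` is an illustrative name.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; #x41 = 0100 0001 -> single-byte (ASCII) character
;; #xC3 = 1100 0011 -> first byte of a 2-byte character
;; #xE2 = 1110 0010 -> first byte of a 3-byte character
;; #xF0 = 1111 0000 -> first byte of a 4-byte character
(defparameter *group-sizes*
  (mapcar #'trivial-utf-8:utf-8-group-size '(#x41 #xC3 #xE2 #xF0)))
```

The resulting list is `(1 2 3 4)`.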

Function: write-utf-8-bytes STRING BYTE-STREAM &key NULL-TERMINATE

Write STRING to BYTE-STREAM, encoding it as UTF-8. If NULL-TERMINATE, write an extra 0 byte at the end.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
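
Writing to a byte stream and reading the result back might look like the sketch below. It assumes ASDF and UIOP (bundled with ASDF) are available and uses a temporary file as the byte stream; `write-then-read-demo` is a hypothetical helper name.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

(defun write-then-read-demo (string)
  "Write STRING as UTF-8 to a temporary file. Return a list of the
file size in bytes and the string read back from the file."
  (uiop:with-temporary-file (:pathname path)
    ;; The stream must have an 8-bit element type to accept bytes.
    (with-open-file (out path :direction :output
                              :if-exists :supersede
                              :element-type '(unsigned-byte 8))
      (trivial-utf-8:write-utf-8-bytes string out))
    (with-open-file (in path :element-type '(unsigned-byte 8))
      (list (file-length in)
            (trivial-utf-8:read-utf-8-string in :stop-at-eof t)))))
```

For a five-character string containing one two-byte character, the reported size should be 6.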


5.1.2 Conditions

Condition: utf-8-decoding-error ()
Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)

Direct superclasses

simple-error (condition)

Direct slots
Slot: message
Initargs

:message

Slot: byte
Initargs

:byte

Initform

(quote nil)
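
Callers that decode untrusted input may want to handle this condition explicitly. The following is one possible sketch, assuming ASDF can locate the system; `safe-decode` is a hypothetical helper, not part of the library.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

(defun safe-decode (bytes)
  "Decode BYTES as UTF-8. On a decoding error, return NIL and the
condition object as a second value instead of entering the debugger."
  (handler-case (trivial-utf-8:utf-8-bytes-to-string bytes)
    (trivial-utf-8:utf-8-decoding-error (condition)
      (values nil condition))))

;; #xFF can never start a UTF-8 character, so this call returns NIL
;; plus the condition; valid input is decoded normally.
(safe-decode (vector #xFF))
(safe-decode (vector 111 107))
```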


5.2 Internal definitions


5.2.1 Special variables

Special Variable: *optimize*
Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)


5.2.2 Macros

Macro: as-utf-8-bytes CHAR WRITER

Given the character CHAR, call the WRITER for every byte in the UTF-8 encoded form of that character. WRITER may be a function or a macro.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)


5.2.3 Functions

Function: get-utf-8-character BYTES GROUP-SIZE &optional START

Extract the character encoded in GROUP-SIZE bytes starting at position START of the array BYTES. May signal UTF-8-DECODING-ERROR.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)

Function: utf-8-string-length BYTES &key START END

Calculate the length of the string encoded by the subsequence of the array of BYTES bounded by START and END.

Package

trivial-utf-8

Source

trivial-utf-8.lisp (file)
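
As a rough illustration, the character count of an encoded buffer can be obtained without decoding it. Since this function is internal, the sketch reaches it with a double colon, which is not a supported public interface and may break between versions. It also assumes ASDF can locate the system and passes a specialized byte array, as produced by STRING-TO-UTF-8-BYTES.

```lisp
(require "asdf")
(asdf:load-system "trivial-utf-8")

;; Six UTF-8 bytes encoding five characters (U+00EF takes two bytes).
(defparameter *buffer*
  (trivial-utf-8:string-to-utf-8-bytes
   (format nil "na~Cve" (code-char #xEF))))

;; Internal symbol, accessed with :: for illustration only.
(defparameter *decoded-length*
  (trivial-utf-8::utf-8-string-length *buffer*))
```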


Appendix A Indexes


A.1 Concepts

Index Entry  Section

F
File, Lisp, trivial-utf-8.asd: The trivial-utf-8.asd file
File, Lisp, trivial-utf-8/trivial-utf-8.lisp: The trivial-utf-8/trivial-utf-8.lisp file

L
Lisp File, trivial-utf-8.asd: The trivial-utf-8.asd file
Lisp File, trivial-utf-8/trivial-utf-8.lisp: The trivial-utf-8/trivial-utf-8.lisp file

T
trivial-utf-8.asd: The trivial-utf-8.asd file
trivial-utf-8/trivial-utf-8.lisp: The trivial-utf-8/trivial-utf-8.lisp file

A.2 Functions

Index Entry  Section

A
as-utf-8-bytes: Internal macros

F
Function, get-utf-8-character: Internal functions
Function, read-utf-8-string: Exported functions
Function, string-to-utf-8-bytes: Exported functions
Function, utf-8-byte-length: Exported functions
Function, utf-8-bytes-to-string: Exported functions
Function, utf-8-group-size: Exported functions
Function, utf-8-string-length: Internal functions
Function, write-utf-8-bytes: Exported functions

G
get-utf-8-character: Internal functions

M
Macro, as-utf-8-bytes: Internal macros

R
read-utf-8-string: Exported functions

S
string-to-utf-8-bytes: Exported functions

U
utf-8-byte-length: Exported functions
utf-8-bytes-to-string: Exported functions
utf-8-group-size: Exported functions
utf-8-string-length: Internal functions

W
write-utf-8-bytes: Exported functions

A.3 Variables

Index Entry  Section

*
*optimize*: Internal special variables

B
byte: Exported conditions

M
message: Exported conditions

S
Slot, byte: Exported conditions
Slot, message: Exported conditions
Special Variable, *optimize*: Internal special variables

A.4 Data types

Index Entry  Section

C
Condition, utf-8-decoding-error: Exported conditions

P
Package, trivial-utf-8: The trivial-utf-8 package

S
System, trivial-utf-8: The trivial-utf-8 system

T
trivial-utf-8: The trivial-utf-8 system
trivial-utf-8: The trivial-utf-8 package

U
utf-8-decoding-error: Exported conditions
