The markdown.cl Reference Manual

This is the markdown.cl Reference Manual, version 0.1.7, generated automatically by Declt version 4.0 beta 2 "William Riker" on Wed May 15 06:10:20 2024 GMT+0.

Table of Contents


1 Introduction


2 Systems

The main system appears first, followed by any subsystem dependency.


2.1 markdown.cl

A markdown parser for Common Lisp

Author

Andrew Danger Lyon

License

MIT

Version

0.1.7

Dependencies
  • cl-ppcre (system).
  • xmls (system).
  • split-sequence (system).
Source

markdown.cl.asd.

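Child Components
  • markdown.cl.asd (file).
  • package.lisp (file).
  • util.lisp (file).
  • html.lisp (file).
  • parser.lisp (file).
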
3 Files

Files are sorted by type and then listed depth-first from the systems' component trees.


3.1 Lisp


3.1.1 markdown.cl/markdown.cl.asd

Source

markdown.cl.asd.

Parent Component

markdown.cl (system).

ASDF Systems

markdown.cl.


3.1.2 markdown.cl/package.lisp

Source

markdown.cl.asd.

Parent Component

markdown.cl (system).

Packages

markdown.cl.


3.1.3 markdown.cl/util.lisp

Dependency

package.lisp (file).

Source

markdown.cl.asd.

Parent Component

markdown.cl (system).

Internals

3.1.4 markdown.cl/html.lisp

Dependency

util.lisp (file).

Source

markdown.cl.asd.

Parent Component

markdown.cl (system).

Public Interface

error-parsing-html (condition).

Internals

3.1.5 markdown.cl/parser.lisp

Dependency

html.lisp (file).

Source

markdown.cl.asd.

Parent Component

markdown.cl (system).

Public Interface
  • parse (function).
  • parse-file (function).
Internals

4 Packages

Packages are listed by definition order.


4.1 markdown.cl

Source

package.lisp.

Nickname

markdown

Use List

common-lisp.

Public Interface
  • error-parsing-html (condition).
  • parse (function).
  • parse-file (function).
Internals

5 Definitions

Definitions are sorted by export status, category, package, and then by lexicographic order.


5.1 Public Interface


5.1.1 Ordinary functions

Function: parse (markdown-string &key disable-parsers)

Parse a markdown string into HTML.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-file (path &key disable-parsers)

Parse a markdown file into HTML (returned as a string).

Package

markdown.cl.

Source

parser.lisp.
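
A minimal usage sketch of the two exported functions follows. It assumes that Quicklisp is installed and can load the system under the name markdown.cl (the manual only names the ASDF system, so the loading step is an assumption), and the file path is hypothetical. The exact HTML returned is deliberately not shown.

```lisp
;; Assumption: Quicklisp is available and knows the markdown.cl system.
(ql:quickload "markdown.cl")

;; Parse a markdown string; the result is an HTML string.
(defparameter *html*
  (markdown.cl:parse (format nil "# Title~%~%Some *emphasized* text.")))

;; Parse a whole file into an HTML string. The path is hypothetical,
;; so the call is guarded and yields NIL when the file does not exist.
(defparameter *page-html*
  (let ((path #p"notes/readme.md"))
    (when (probe-file path)
      (markdown.cl:parse-file path))))
```

The package nickname listed in the Packages section should allow the shorter form markdown:parse as well.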


5.1.2 Conditions

Condition: error-parsing-html

Thrown when xmls cannot parse the HTML in a document (make sure your <img> tags are closed).

Package

markdown.cl.

Source

html.lisp.

Direct superclasses

error.
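
Because error-parsing-html is a subclass of error, callers can trap it with handler-case. The sketch below assumes the system loads through Quicklisp as markdown.cl; safe-parse is a hypothetical helper name, not part of the library.

```lisp
;; Assumption: Quicklisp is available and knows the markdown.cl system.
(ql:quickload "markdown.cl")

(defun safe-parse (markdown-string)
  "Return the HTML for MARKDOWN-STRING. If the embedded HTML cannot be
parsed, return NIL and the signaled condition as a second value.
SAFE-PARSE is a hypothetical helper, not part of markdown.cl."
  (handler-case (markdown.cl:parse markdown-string)
    (markdown.cl:error-parsing-html (condition)
      (values nil condition))))
```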


5.2 Internals


5.2.1 Special variables

Special Variable: *block-level-elements*

Stores all HTML tags considered block-level.

Package

markdown.cl.

Source

html.lisp.

Special Variable: *html-chunks*

Holds a hash table that harbors HTML tags from the destructive forces of markdown parsing until they are ready to be injected back into the document.

Package

markdown.cl.

Source

html.lisp.

Special Variable: *link-references*

Holds a hash table mapping link ids to URLs.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *list-recursion-level*
Package

markdown.cl.

Source

parser.lisp.

Special Variable: *nl*

Holds a string of a single newline character.

Package

markdown.cl.

Source

util.lisp.

Special Variable: *scanner-block-list-pos*

Detects if a block has a ul/ol section.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *scanner-blockquote*

A scanner for finding blockquotes.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *scanner-find-first-html-block-element*

A scanner that searches for HTML elements that are not inline.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *scanner-find-last-html-block-element*

A scanner that searches for HTML elements that are not inline.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *scanner-lazy-blockquote*

A scanner for finding blockquotes.

Package

markdown.cl.

Source

parser.lisp.

Special Variable: *tmp-storage*

Holds a hash table used for temporary blockquote storage.

Package

markdown.cl.

Source

parser.lisp.


5.2.2 Ordinary functions

Function: block-element-p (tag-name)

Test if a given HTML tag is a block-level element.

Package

markdown.cl.

Source

html.lisp.

Function: cleanup-code (str)

Let’s convert our {{markdown.cl|code|...}} tags to <code> tags.

Package

markdown.cl.

Source

parser.lisp.

Function: cleanup-escaped-characters (str)

Convert escaped characters back to non-escaped.

Package

markdown.cl.

Source

parser.lisp.

Function: cleanup-hr (str)

Due to the way processing <hr> tags occurs, they are always wrapped in <p> blocks. Instead of trying to figure out a way to NOT wrap them in <p> blocks (which would surely screw up the rest of the paragraph formatting) it makes more sense to let it happen, then fix in the final pass.

Package

markdown.cl.

Source

parser.lisp.

Function: cleanup-markdown-tags (str)
Package

markdown.cl.

Source

html.lisp.

Function: cleanup-newlines (str)

Here we remove excess newlines and convert any markdown.cl newlines into real ones.

Package

markdown.cl.

Source

parser.lisp.

Function: cleanup-paragraphs (str)

Remove any empty paragraph blocks (it does happen sometimes) and convert all markdown.cl paragraphs into real <p> tags.

Package

markdown.cl.

Source

parser.lisp.

Function: convert-lazy-blockquote-to-standard (str)

Converts a lazy blockquote:

> this is a blockquote that
spans multiple lines but
im too lazy to add the '>'
at the beginning of each line

into:

> this is a blockquote that
> spans multiple lines but
> im too lazy to add the '>'
> at the beginning of each line

Package

markdown.cl.

Source

parser.lisp.
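
The lazy-to-standard blockquote conversion described above can be sketched in portable Common Lisp. This is not the library's implementation (which is regex-based on cl-ppcre scanners); split-lines and lazy-blockquote->standard are hypothetical names used only for illustration.

```lisp
(defun split-lines (str)
  "Split STR on #\\Newline into a list of strings."
  (loop with start = 0
        for pos = (position #\Newline str :start start)
        collect (subseq str start pos)
        while pos
        do (setf start (1+ pos))))

(defun lazy-blockquote->standard (str)
  "Prefix every non-empty line that lacks a leading > with \"> \"."
  (format nil "~{~a~^~%~}"
          (mapcar (lambda (line)
                    (if (or (zerop (length line))
                            (char= (char line 0) #\>))
                        line
                        (concatenate 'string "> " line)))
                  (split-lines str))))
```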

Function: do-parse-br (str)

Parse <br> tags (when a line ends with two spaces).

Package

markdown.cl.

Source

parser.lisp.

Function: do-parse-code (str)

Parse `...` code blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: do-parse-double-code (str)

Parse ``...`` code blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: do-parse-em (str)

Parse *, _, **, and __.

Package

markdown.cl.

Source

parser.lisp.

Function: do-parse-entities (str &key use-markdown-tags)

Replace non-purposeful entities with escaped equivalents.

Package

markdown.cl.

Source

parser.lisp.

Function: escape-html (str)

Meant to be called on text inside code blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: escape-html-entities (str)

Hide HTML entities from the HTML parser. It doesn’t like them. It has the death penalty on 12 systems.

Package

markdown.cl.

Source

html.lisp.

Function: escape-links-href

Escape any underscores in href=... text so they are not replaced with <em> tags.

Package

markdown.cl.

Source

parser.lisp.

Function: file-contents (path)

Sucks up an entire file from PATH into a freshly-allocated string, returning two values: the string and the number of bytes read.

Package

markdown.cl.

Source

util.lisp.
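
A common way to read a whole file into a string, in the spirit of file-contents, is sketched below. The name file-contents-sketch is hypothetical, and the second return value here is the number of characters read, which may differ from the byte count for multi-byte encodings.

```lisp
(defun file-contents-sketch (path)
  "Read the file at PATH into a fresh string.
Returns two values: the string and the number of characters read."
  (with-open-file (in path :direction :input)
    (let* ((buffer (make-string (file-length in)))
           ;; READ-SEQUENCE returns the index of the first element
           ;; of BUFFER that was not filled.
           (end (read-sequence buffer in)))
      (values (subseq buffer 0 end) end))))
```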

Function: fix-a-tags (str)

XMLS mangles our <a> tags. Fix them.

Package

markdown.cl.

Source

html.lisp.

Function: fix-inline-links

Fix <http://teh-link.com> links, which mess with XMLS' mind.

Package

markdown.cl.

Source

html.lisp.

Function: format-blockquote (str)

Given a string that we know is a blockquote, remove the blockquote formatting and recursively parse markdown within the blockquote. If the given blockquote is not 'lazy' then lazy blockquote parsing is disabled in the recursive parse so as not to screw up formatting.

Package

markdown.cl.

Source

parser.lisp.

Function: format-code (str &key embedded)

Sanely formats code blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: format-html-blocks-in-paragraph (str)

This is a very helpful function which turns:

<p>this is my text<div>this is inside a block</div> more text</p>

into:

<p>this is my text</p><div>this is inside a block</div><p>more text</p>

In other words, it unwraps <p> tags from around HTML block elements, and does so such that all text between the first block tag found and after the last block tag found is left untouched (and unwrapped by <p>).

Package

markdown.cl.

Source

parser.lisp.

Function: format-lists (str indent)

This is the function that actually makes lists happen. Once all the blocks have been diced up into neat little packages ready for formatting, they are handed off to format-lists.

This function is responsible for adding the <ul>/<ol>/<li> tags around list items, making sure to only do this for items using the correct indentation level.

List items inject any saved blockquotes (via inject-saved-blockquotes) before moving on to paragraph processing. This step is essential because a lot of the blockquote formatting can screw up the splitting of list items correctly, resulting in <p> blocks in really weird places.

List items are run through the paragraph filters, have a minimal amount of formatting applied to make sure the recursion goes smoothly, and then are recursively concatenated onto the final string.

Package

markdown.cl.

Source

parser.lisp.

Function: gather-link-references

Look for any link references in the document:

[link-id]: http://my-url.com
[4]: http://my-link.com (optional title)
[mylink]: http://my-url.com/lol 'kewl link brah'
[omg]: http://lol.com/wtf "rofl"

and parse them into the *link-references* hash table. The data will be pulled out when parse-links is called.

Note that as a side effect, this also gathers image references =].

Package

markdown.cl.

Source

parser.lisp.
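
The link-reference gathering described above can be illustrated with a small portable sketch that fills a hash table from lines shaped like [id]: url. It ignores titles and is not the library's regex-based implementation; split-lines and gather-refs-sketch are hypothetical names.

```lisp
(defun split-lines (str)
  "Split STR on #\\Newline into a list of strings."
  (loop with start = 0
        for pos = (position #\Newline str :start start)
        collect (subseq str start pos)
        while pos
        do (setf start (1+ pos))))

(defun gather-refs-sketch (str)
  "Return a hash table mapping link ids to URLs.
Only lines starting with [id]: are considered; titles are dropped."
  (let ((refs (make-hash-table :test #'equal)))
    (dolist (line (split-lines str) refs)
      (let ((sep (search "]: " line)))
        (when (and sep (char= (char line 0) #\[))
          (let* ((url-start (+ sep 3))
                 (url-end (or (position #\Space line :start url-start)
                              (length line))))
            (setf (gethash (subseq line 1 sep) refs)
                  (subseq line url-start url-end))))))))
```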

Function: inject-saved-blockquotes (str)
Package

markdown.cl.

Source

parser.lisp.

Function: join-list-lines (str)

Joins list items that are broken across multiple lines so that there is one line per item:

- my list item
broken into multiple
lines

becomes

- my list item broken into multiple lines

Package

markdown.cl.

Source

parser.lisp.
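
The joining step for join-list-lines can be sketched as follows. This is only an illustration in portable Common Lisp: it treats a line starting with -, + or * followed by a space as a new item and folds every other line into the previous one. All function names are hypothetical, and blank lines are not handled specially.

```lisp
(defun split-lines (str)
  "Split STR on #\\Newline into a list of strings."
  (loop with start = 0
        for pos = (position #\Newline str :start start)
        collect (subseq str start pos)
        while pos
        do (setf start (1+ pos))))

(defun list-item-line-p (line)
  "True when LINE starts with a bullet marker followed by a space."
  (and (>= (length line) 2)
       (find (char line 0) "-+*")
       (char= (char line 1) #\Space)))

(defun join-broken-list-lines (str)
  "Fold continuation lines into the list item that precedes them."
  (let ((items '()))
    (dolist (line (split-lines str))
      (if (or (list-item-line-p line) (null items))
          (push line items)
          (setf (first items)
                (concatenate 'string (first items) " "
                             (string-left-trim " " line)))))
    (format nil "~{~a~^~%~}" (nreverse items))))
```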

Function: make-image (url alt title)
Package

markdown.cl.

Source

parser.lisp.

Function: make-link
Package

markdown.cl.

Source

parser.lisp.

Function: normalize-lists (str)

Run bullets/lists through some normalization filters to make them easier to parse. Numbered lists are converted to +, regular bullets converted to -. This greatly simplifies parsing later on.

Package

markdown.cl.

Source

parser.lisp.
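
The marker normalization for normalize-lists (numbered items to +, bullets to -) can be illustrated per line with the sketch below. It uses only standard Common Lisp rather than the library's regular expressions, and normalize-list-line is a hypothetical name.

```lisp
(defun normalize-list-line (line)
  "Rewrite a numbered marker such as \"12. \" to \"+ \" and a bullet
marker (*, + or -) to \"- \", keeping the leading indentation."
  (let* ((indent (or (position-if-not (lambda (c) (char= c #\Space)) line)
                     (length line)))
         (body (subseq line indent))
         (digits (or (position-if-not #'digit-char-p body)
                     (length body))))
    (cond
      ;; Numbered item: digits, a period, then a space.
      ((and (plusp digits)
            (> (length body) (1+ digits))
            (char= (char body digits) #\.)
            (char= (char body (1+ digits)) #\Space))
       (concatenate 'string (subseq line 0 indent) "+ "
                    (subseq body (+ digits 2))))
      ;; Bullet item: one marker character, then a space.
      ((and (>= (length body) 2)
            (find (char body 0) "*+-")
            (char= (char body 1) #\Space))
       (concatenate 'string (subseq line 0 indent) "- "
                    (subseq body 2)))
      (t line))))
```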

Function: pad-string (str padding)

There’s probably a lisp function for this already. Pads the beginning and end of the given string with the given padding (also a string).

Package

markdown.cl.

Source

parser.lisp.
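
The padding behavior described for pad-string amounts to a single concatenation. A sketch, under the hypothetical name pad-string-sketch:

```lisp
(defun pad-string-sketch (str padding)
  "Return STR with the string PADDING added to both ends."
  (concatenate 'string padding str padding))
```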

Function: paragraph-format (str)

This function looks for {{markdown.cl|paragraph}} tags and splits up the text given accordingly, adding opening/closing markdown.cl paragraph tags around each of the splits. It then uses format-html-blocks-in-paragraph to remove any paragraph tags that shouldn’t be there.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-atx-headers (str)

Parses ATX-style headers:

### This will be an h3 tag lol

Package

markdown.cl.

Source

parser.lisp.

Function: parse-blockquote (str)

Parse a blockquote recursively, using the passed-in regex.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-br (str)

Parse <br> tags (when a line ends with two spaces).

Package

markdown.cl.

Source

parser.lisp.

Function: parse-code (str)

Parses code sections in markdown.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-em (str)

Parse *, _, **, and __, but only in non-code blocks. It’s tricky though, because our <em>/<strong> elements must be able to span across <code> blocks. What we do is replace any * objects in <code> blocks with a meta string, process the em/strong tags, and then replace the meta char. Works great.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-embedded-blockquote (str)

Parse blockquotes that occur inside a list. This must be a separate step, otherwise things can get wonky when parsing lists. The idea is to find blockquotes that are embedded in lists *before* the lists are processed, then turn them into what the list parser views as a standard paragraph.

Instead of injecting embedded blockquotes directly into the list string, they are saved in a hash table and injected afterwards for more accurate parsing.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-embedded-code (str)

Parses code that is embedded inside something else (a list, for instance). Generally, embedded code starts with 8 spaces instead of 4.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-entities (str)

On top of parsing entities:

I am a sicko & a perv => I am a sicko &amp; a perv
Dogs > cats => Dogs &gt; cats
<em>I’m the best</em> => <em>I’m the best</em>

also escape the inside of <code> blocks:

<code><div>&copy;</div></code>

becomes:

<code>&lt;div&gt;&amp;copy;&lt;/div&gt;</code>

It does this using the parse-not-in-code function, which applies do-parse-entities outside code blocks and escapes everything inside code blocks.

Package

markdown.cl.

Source

parser.lisp.
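
The escaping applied inside <code> blocks, as in the example above, can be sketched with a character-by-character rewrite. escape-html-sketch is a hypothetical name and only handles the three characters shown in the example.

```lisp
(defun escape-html-sketch (str)
  "Replace &, < and > in STR with their HTML entities."
  (with-output-to-string (out)
    (loop for ch across str
          do (case ch
               (#\& (write-string "&amp;" out))
               (#\< (write-string "&lt;" out))
               (#\> (write-string "&gt;" out))
               (t (write-char ch out))))))
```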

Function: parse-escaped-characters (str)

Parse characters that are escaped with \

Package

markdown.cl.

Source

parser.lisp.

Function: parse-horizontal-rule (str)

Make horizontal rules. These are (almost?) always wrapped in <p> tags by the paragraph parser, but this is taken care of in the final parsing pass.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-inline-code (str)

Parse `...` code blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-links

Parse all link styles. It’s important to note that because the image/link syntax is so similar, the following parsers handle both images and links.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-links-ref

Parse links that are reference-style: [link text][id]

Package

markdown.cl.

Source

parser.lisp.

Function: parse-links-self

Parse links that are self-contained (not a reference): [my link text](http://url.com "title")

Package

markdown.cl.

Source

parser.lisp.

Function: parse-list-blocks (str)

This function takes a list block, and splits it into sub-blocks depending on list type (in other words, if a numbered list directly follows a normal list, the two are processed separately). This is done recursively.

It also detects the amount of indent the list uses, which it sends into `format-lists` when an entire block has been singled out.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-lists (str)

Parse lists (both bullet and number lists). First, normalizes them (which makes them a whole lot easier to parse) then recursively parses them.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-not-in-code (str parser-fn &key escape in-code-fn)

Given a string and a parsing function, run the parsing function over the parts of the string that are not inside any code block.

Also has the ability to escape the internals of code blocks.

Package

markdown.cl.

Source

parser.lisp.
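
The idea behind parse-not-in-code, running a parser function only over text outside code blocks, can be sketched as a higher-order function. For simplicity this sketch looks for literal <code>...</code> spans, whereas the library works on its own internal code tags; map-outside-code is a hypothetical name and the escape and in-code-fn options are not modeled.

```lisp
(defun map-outside-code (str fn)
  "Apply FN to each part of STR that lies outside <code>...</code>
spans and copy the code spans through unchanged."
  (with-output-to-string (out)
    (loop with pos = 0
          for open-pos = (search "<code>" str :start2 pos)
          for close-pos = (and open-pos
                               (search "</code>" str :start2 open-pos))
          while close-pos
          do (write-string (funcall fn (subseq str pos open-pos)) out)
             ;; 7 is the length of the closing tag "</code>".
             (write-string (subseq str open-pos (+ close-pos 7)) out)
             (setf pos (+ close-pos 7))
          finally (write-string (funcall fn (subseq str pos)) out))))
```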

Function: parse-paragraphs (str &key pre-formatted)

This formats paragraphs in text. Most blocks given to it are treated as paragraph blocks until otherwise noted. If it detects that another parser added in paragraph tags, it will skip the block *unless* the pre-formatted key arg is T (meaning that the string being passed in has paragraph tags in it that need to be dealt with).

This function also does its best to clean the output by ridding us of empty paragraph blocks.

Package

markdown.cl.

Source

parser.lisp.

Function: parse-quick-links

Parse quick-link style: <http://killtheradio.net>

Package

markdown.cl.

Source

parser.lisp.

Function: parse-setext-headers (str)

Parse setext headers:

This will be an h1
==================

Package

markdown.cl.

Source

parser.lisp.

Function: parse-table (str)

Parse GitHub-format tables. Takes a string formatted per the GitHub version of markdown. It returns an HTML table if the block contains a pipe character and the second non-whitespace line in the block contains at least three consecutive dashes, e.g. '---'. Otherwise it returns the original string. See https://help.github.com/articles/organizing-information-with-tables/. Columns can be aligned right, center, or left by inserting colons on the sides of the hyphens within the header row. At the moment you cannot use a pipe as content within a cell.

Package

markdown.cl.

Source

parser.lisp.

Function: post-process-markdown-html (str)

This function does any needed cleanup to marry inline HTML and markdown.

Package

markdown.cl.

Source

html.lisp.

Function: pre-format-paragraph-lists (str &optional join-list-items)

Format lists in paragraph style to be normalized so they aren’t chopped up by the rest of the parsing.

Package

markdown.cl.

Source

parser.lisp.

Function: pre-process-markdown-html (str)

This function performs any needed parsing on existing HTML of a markdown string.

Package

markdown.cl.

Source

html.lisp.

Function: prepare-markdown-string (str)

A lot of the regular expressions, in order to maintain simplicity, expect the strings they are given to be formatted in a certain way. For instance, instead of testing for (^|\n), if every string is started with a \n, then we can just test for \n (and leave out testing for the beginning of the string).

Package

markdown.cl.

Source

parser.lisp.

Function: replace-html-blocks (str)

Find any {{markdown.cl|htmlblock|...}} tags and replace them with their saved content.

Package

markdown.cl.

Source

html.lisp.

Function: split-blocks (str)

Splits a markdown document into a set of blocks, each block generally consisting of a certain type (list, blockquote, paragraph, etc).

Package

markdown.cl.

Source

parser.lisp.

Function: stash-html-block-tags (str)

Finds all top-level HTML block-level tags and saves them for later. Does so by incrementally searching for the next line starting with a block-level tag and using xmls to parse it, adding a placeholder in its stead. Inline elements are just added back into the parts array (not saved). This allows them to be markdown-processed. str is modified destructively as the loop progresses, making sure we don’t get stuck in an endless loop finding the same tags over again.

Package

markdown.cl.

Source

html.lisp.


Appendix A Indexes


A.1 Concepts


A.2 Functions

Index Entry  Section

B
block-element-p: Private ordinary functions

C
cleanup-code: Private ordinary functions
cleanup-escaped-characters: Private ordinary functions
cleanup-hr: Private ordinary functions
cleanup-markdown-tags: Private ordinary functions
cleanup-newlines: Private ordinary functions
cleanup-paragraphs: Private ordinary functions
convert-lazy-blockquote-to-standard: Private ordinary functions

D
do-parse-br: Private ordinary functions
do-parse-code: Private ordinary functions
do-parse-double-code: Private ordinary functions
do-parse-em: Private ordinary functions
do-parse-entities: Private ordinary functions

E
escape-html: Private ordinary functions
escape-html-entities: Private ordinary functions
escape-links-href: Private ordinary functions

F
file-contents: Private ordinary functions
fix-a-tags: Private ordinary functions
fix-inline-links: Private ordinary functions
format-blockquote: Private ordinary functions
format-code: Private ordinary functions
format-html-blocks-in-paragraph: Private ordinary functions
format-lists: Private ordinary functions
Function, block-element-p: Private ordinary functions
Function, cleanup-code: Private ordinary functions
Function, cleanup-escaped-characters: Private ordinary functions
Function, cleanup-hr: Private ordinary functions
Function, cleanup-markdown-tags: Private ordinary functions
Function, cleanup-newlines: Private ordinary functions
Function, cleanup-paragraphs: Private ordinary functions
Function, convert-lazy-blockquote-to-standard: Private ordinary functions
Function, do-parse-br: Private ordinary functions
Function, do-parse-code: Private ordinary functions
Function, do-parse-double-code: Private ordinary functions
Function, do-parse-em: Private ordinary functions
Function, do-parse-entities: Private ordinary functions
Function, escape-html: Private ordinary functions
Function, escape-html-entities: Private ordinary functions
Function, escape-links-href: Private ordinary functions
Function, file-contents: Private ordinary functions
Function, fix-a-tags: Private ordinary functions
Function, fix-inline-links: Private ordinary functions
Function, format-blockquote: Private ordinary functions
Function, format-code: Private ordinary functions
Function, format-html-blocks-in-paragraph: Private ordinary functions
Function, format-lists: Private ordinary functions
Function, gather-link-references: Private ordinary functions
Function, inject-saved-blockquotes: Private ordinary functions
Function, join-list-lines: Private ordinary functions
Function, make-image: Private ordinary functions
Function, make-link: Private ordinary functions
Function, normalize-lists: Private ordinary functions
Function, pad-string: Private ordinary functions
Function, paragraph-format: Private ordinary functions
Function, parse: Public ordinary functions
Function, parse-atx-headers: Private ordinary functions
Function, parse-blockquote: Private ordinary functions
Function, parse-br: Private ordinary functions
Function, parse-code: Private ordinary functions
Function, parse-em: Private ordinary functions
Function, parse-embedded-blockquote: Private ordinary functions
Function, parse-embedded-code: Private ordinary functions
Function, parse-entities: Private ordinary functions
Function, parse-escaped-characters: Private ordinary functions
Function, parse-file: Public ordinary functions
Function, parse-horizontal-rule: Private ordinary functions
Function, parse-inline-code: Private ordinary functions
Function, parse-links: Private ordinary functions
Function, parse-links-ref: Private ordinary functions
Function, parse-links-self: Private ordinary functions
Function, parse-list-blocks: Private ordinary functions
Function, parse-lists: Private ordinary functions
Function, parse-not-in-code: Private ordinary functions
Function, parse-paragraphs: Private ordinary functions
Function, parse-quick-links: Private ordinary functions
Function, parse-setext-headers: Private ordinary functions
Function, parse-table: Private ordinary functions
Function, post-process-markdown-html: Private ordinary functions
Function, pre-format-paragraph-lists: Private ordinary functions
Function, pre-process-markdown-html: Private ordinary functions
Function, prepare-markdown-string: Private ordinary functions
Function, replace-html-blocks: Private ordinary functions
Function, split-blocks: Private ordinary functions
Function, stash-html-block-tags: Private ordinary functions

G
gather-link-references: Private ordinary functions

I
inject-saved-blockquotes: Private ordinary functions

J
join-list-lines: Private ordinary functions

M
make-image: Private ordinary functions
make-link: Private ordinary functions

N
normalize-lists: Private ordinary functions

P
pad-string: Private ordinary functions
paragraph-format: Private ordinary functions
parse: Public ordinary functions
parse-atx-headers: Private ordinary functions
parse-blockquote: Private ordinary functions
parse-br: Private ordinary functions
parse-code: Private ordinary functions
parse-em: Private ordinary functions
parse-embedded-blockquote: Private ordinary functions
parse-embedded-code: Private ordinary functions
parse-entities: Private ordinary functions
parse-escaped-characters: Private ordinary functions
parse-file: Public ordinary functions
parse-horizontal-rule: Private ordinary functions
parse-inline-code: Private ordinary functions
parse-links: Private ordinary functions
parse-links-ref: Private ordinary functions
parse-links-self: Private ordinary functions
parse-list-blocks: Private ordinary functions
parse-lists: Private ordinary functions
parse-not-in-code: Private ordinary functions
parse-paragraphs: Private ordinary functions
parse-quick-links: Private ordinary functions
parse-setext-headers: Private ordinary functions
parse-table: Private ordinary functions
post-process-markdown-html: Private ordinary functions
pre-format-paragraph-lists: Private ordinary functions
pre-process-markdown-html: Private ordinary functions
prepare-markdown-string: Private ordinary functions

R
replace-html-blocks: Private ordinary functions

S
split-blocks: Private ordinary functions
stash-html-block-tags: Private ordinary functions