adfernandes/mdbook-pandoc

mdbook-pandoc

A pandoc-powered mdbook backend. Because it relies on pandoc, it supports many output formats, although it was developed mainly with LaTeX in mind.

See Rendered Books for samples of rendered books.

Installation

  • Install mdbook

  • Install mdbook-pandoc:

    To install the latest release published to crates.io:

    cargo install mdbook-pandoc --locked

    To install the latest version committed to GitHub:

    cargo install mdbook-pandoc --git https://github.com/max-heller/mdbook-pandoc.git --locked
  • Install pandoc

    Note: mdbook-pandoc works best with Pandoc 2.10.1 (released July 2020) or newer -- ideally, the newest version you have access to. Older versions (as old as 2.8, released Nov 2019) are partially supported, but will result in degraded output.

    If you have an old version of Pandoc installed (in particular, Ubuntu releases before 23.04 have older-than-recommended Pandoc versions in their package repositories), consider downloading a newer version from Pandoc's installation page.

Getting Started

Instruct mdbook to use mdbook-pandoc by updating your book.toml file. The following example configures mdbook-pandoc to generate a PDF version of the book with LaTeX (which must be installed). To generate other output formats, see Configuration.

[book]
title = "My First Book"

+ [output.pandoc.profile.pdf]
+ output-file = "output.pdf"
+ to = "latex"

Running mdbook build will write the rendered book to pdf/output.pdf in mdbook-pandoc's build directory (book/pandoc if multiple renderers are configured; book otherwise).

Configuration

Since mdbook-pandoc supports many different output formats through pandoc, it must be configured to render to one or more formats through the [output.pandoc] table in a book's book.toml file.

Configuration is centered around output profiles, named packages of options that mdbook-pandoc passes to pandoc as a defaults file to render a book in a particular format. The output for each profile is written to a subdirectory with the same name as the profile under mdbook-pandoc's top-level build directory (book/pandoc if multiple renderers are configured; book otherwise).
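
As a sketch of how profiles map to output directories, the following book.toml defines two profiles. The profile names (pdf, epub) and the file names are illustrative assumptions, and the EPUB profile has not been tested here. With this configuration, mdbook build would be expected to write pdf/book.pdf and epub/book.epub under mdbook-pandoc's build directory.

```toml
[book]
title = "My First Book"

# Profile "pdf": output is written to <build dir>/pdf/book.pdf
[output.pandoc.profile.pdf]
output-file = "book.pdf"
to = "latex"

# Profile "epub" (assumed, untested): output is written to <build dir>/epub/book.epub
[output.pandoc.profile.epub]
output-file = "book.epub"
to = "epub"
```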

Note: Pandoc is run from the book's root directory (the directory containing book.toml). Therefore, relative paths in the configuration (e.g. values for include-in-header, reference-doc) should be written relative to the book's root directory.

A subset of the available options is described below:

[output.pandoc]
hosted-html = "https://doc.rust-lang.org/book" # URL of an HTML version of the book

[output.pandoc.markdown.extensions] # enable additional Markdown extensions
gfm = false # enable pulldown-cmark's GitHub Flavored Markdown extensions
math = false # parse inline ($a^b$) and display ($$a^b$$) math
definition-lists = false # parse definition lists
superscript = false # parse superscripted text (^this is superscripted^)
subscript = false # parse subscripted text (~this is subscripted~)

[output.pandoc.code]
# Display hidden lines in code blocks (e.g., lines in Rust blocks prefixed by '#').
# See https://rust-lang.github.io/mdBook/format/mdbook.html?highlight=hidden#hiding-code-lines
show-hidden-lines = false

[output.pandoc.profile.<name>] # options to pass to Pandoc (see https://pandoc.org/MANUAL.html#defaults-files)
output-file = "output.pdf" # output file (within the profile's build directory)
to = "latex" # output format

# PDF-specific settings
pdf-engine = "pdflatex" # engine to use to produce PDF output

# `mdbook-pandoc` overrides Pandoc's defaults for the following options to better support mdBooks
file-scope = true # parse each file individually before combining
number-sections = true # number section headings
standalone = true # produce output with an appropriate header and footer
table-of-contents = true # include an automatically generated table of contents

# Arbitrary other Pandoc options can be specified as they would be in a Pandoc defaults file
# (see https://pandoc.org/MANUAL.html#defaults-files) but written in TOML instead of YAML...

# For example, to pass variables (https://pandoc.org/MANUAL.html#variables):
[output.pandoc.profile.<name>.variables]
# Set the pandoc variable named 'variable-name' to 'value'
variable-name = "value"
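
To make the note about relative paths concrete, here is a minimal sketch of a PDF profile that pulls in a custom LaTeX preamble. The file latex/preamble.tex is a hypothetical file assumed to sit next to book.toml, and the choice of lualatex and the documentclass value are illustrative, not recommendations from this project.

```toml
[output.pandoc.profile.pdf]
output-file = "book.pdf"
to = "latex"
pdf-engine = "lualatex" # any PDF engine accepted by Pandoc

# Hypothetical file: resolved relative to the book's root directory
# (the directory containing book.toml).
include-in-header = ["latex/preamble.tex"]

[output.pandoc.profile.pdf.variables]
# Pandoc's LaTeX template reads the `documentclass` variable
documentclass = "report"
```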

Features

  • Markdown extensions supported by mdBook

  • Markdown extensions not yet supported by mdBook

    These extensions are disabled by default for consistency with mdBook and must be explicitly enabled.

    • Blockquote tags (Enabled by output.pandoc.markdown.extensions.gfm)
    • Math (Enabled by output.pandoc.markdown.extensions.math)
    • Definition Lists (Enabled by output.pandoc.markdown.extensions.definition-lists)
    • Superscript (Enabled by output.pandoc.markdown.extensions.superscript)
    • Subscript (Enabled by output.pandoc.markdown.extensions.subscript)
  • Raw HTML (best effort, almost always lossy)

    • Linking to HTML elements by id
    • Strikethrough (<s>), superscript (<sup>), subscript (<sub>)
    • Definition lists (<dl>, <dt>, <dd>)
    • Images (<img>) with width and height attributes
      • Class-based CSS styling (width/height)
    • <span>s and <div>s
    • Anchors (<a>)
  • Table of contents

  • Redirects ([output.html.redirect])

  • Font Awesome 4 icons (e.g. <i class="fa fa-github"></i>) (LaTeX output formats only)
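
The dotted names in the list above (for example output.pandoc.markdown.extensions.math) correspond to keys in the [output.pandoc.markdown.extensions] table. As a small sketch, a book that needs math and definition lists, but none of the other optional extensions, could opt in like this:

```toml
[output.pandoc.markdown.extensions]
math = true             # parse inline ($a^b$) and display ($$a^b$$) math
definition-lists = true # parse definition lists
# gfm, superscript, and subscript are left at their default (false)
```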

Rendering Pipeline

To render a book, mdbook-pandoc parses the book's source (Parsing), transforms it into Pandoc's native representation (Preprocessing), then runs pandoc to render the book in the desired output format.

Parsing

HTML

mdbook-pandoc does its best to support raw HTML embedded in Markdown documents, transforming it into relevant Pandoc AST elements where possible. Each chapter is parsed into a hybrid Markdown+HTML tree using pulldown-cmark and the browser-grade html5ever HTML parser. This approach captures the full structure of the document -- including implicitly closed elements and other HTML quirks -- and makes it possible to accurately render HTML elements containing Markdown elements containing HTML elements...

This approach should also make mdbook-pandoc better able to handle malformed HTML, since html5ever performs the same HTML sanitization magic that browsers do. However, the standard principle applies: garbage in, garbage out; for best results, write simple and obviously correct HTML.

Preprocessing

Structural Changes

  • In order to make section numbers and the generated table of contents, if applicable, mirror the chapter hierarchy defined in SUMMARY.md:
    • Headings in nested chapters are shrunk one level per level of nesting
    • All headings except for H1s are marked as unnumbered and unlisted
  • Relative links within chapters are "rebased" to be relative to the source directory so a chapter src/foo/foo.md can link to src/foo/bar.md with [bar](bar.md)

Known Issues

  • Linking to a chapter does not work unless the chapter contains a heading with a non-empty identifier (either auto-generated or explicitly specified). See: max-heller#100

Comparison to alternatives

Rendered books

The following table links to sample books rendered with mdbook-pandoc. PDFs are rendered with LaTeX (LuaTeX).

Book                               Rendered
Cargo Book                         PDF
mdBook Guide                       PDF
Rustonomicon                       PDF
Rust Book                          PDF
Rust by Example                    PDF
Rust Edition Guide                 PDF
Embedded Rust Book                 PDF
Rust Reference                     PDF
Rust Compiler Development Guide    PDF

Rendering to PDF

  • When mdbook-pandoc was initially written, existing mdbook LaTeX backends (mdbook-latex, mdbook-tectonic) were not mature enough to render much besides the simplest books due to hand-rolling the markdown->LaTeX conversion step. mdbook-pandoc, on the other hand, delegates this difficult step to pandoc, inheriting its maturity and configurability.
  • "Print to PDF"-based backends like mdbook-pdf are more mature, but produce less aesthetically-pleasing PDFs. Additionally, mdbook-pdf does not support intra-document links or generating a table of contents without using a forked version of mdbook.

Rendering to other formats

  • By delegating most of the difficult rendering work to pandoc, mdbook-pandoc supports numerous output formats. Most of these have not been tested, so feedback on how it performs on non-PDF formats is very welcome!
