4. Notebook Conversion

4.1. Command Line Interface

The nbpublish script is a command-line interface that parses options and initialises the conversion process. To see all options for this script:

nbpublish -h

For example, to convert the Example.ipynb notebook directly to PDF:

nbpublish -pdf -lb example/notebooks/Example.ipynb

If a folder is given as input, the .ipynb files it contains are processed and combined in ‘natural’ sorted order, i.e. 2_name.ipynb comes before 10_name.ipynb. By default, notebooks whose names begin with ‘_’ are ignored.
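
The ordering rule can be illustrated with a short Python sketch. This is not the IPyPublish implementation; the helper names natural_key and select_notebooks are hypothetical and only demonstrate what ‘natural’ order and the ‘_’ rule mean:

```python
import re


def natural_key(name):
    """Split a filename into text and integer chunks, so that 2 sorts before 10."""
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def select_notebooks(filenames):
    """Keep .ipynb files that do not start with '_' and sort them naturally."""
    kept = [f for f in filenames
            if f.endswith(".ipynb") and not f.startswith("_")]
    return sorted(kept, key=natural_key)


print(select_notebooks(
    ["10_name.ipynb", "2_name.ipynb", "_draft.ipynb", "notes.txt"]))
```

A plain alphabetical sort would place 10_name.ipynb first, because the character ‘1’ sorts before ‘2’.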

4.2. Python API

nbpublish parses command-line inputs and passes them to the ipypublish.convert.main.IpyPubMain class, which is a traitlets.config Configuration object that controls the entire conversion process. For example, to list all configurable options:

from ipypublish.convert.main import IpyPubMain
IpyPubMain.class_print_help()

4.3. Built-in Export Configurations

All available converters are listed by nbpublish --list-exporters. Some of note are:

latex_ipypublish_main

The default converter. It converts cells to LaTeX according to metadata tags on an ‘opt in’ basis. Note that, for this converter, no code cells or output will appear in the final tex/pdf document unless they have a suitable ipub metadata tag.

sphinx_ipypublish_main

Converts the entire notebook(s) to an RST file, in the format used by the Sphinx documentation generator.

sphinx_ipypublish_main.run

The same as sphinx_ipypublish_main, but also creates a conf.py file and runs sphinx-build, to create HTML documentation (see Introduction).

html_ipypublish_main

Converts the entire notebook(s) to HTML, with latex citations and references resolved, and adds a table of contents sidebar and a button to toggle input code and output cells between visible and hidden.

slides_ipypublish_main

Converts the notebook to reveal.js slides, with latex citations and references resolved and slides partitioned by markdown headers. See the Live Slideshows section for how to use nbpresent to serve these slides to a web browser.

The all and nocode variants of these converters pre-process a copy of the notebook, adding default metadata tags to the notebook and all cells, such that all output is rendered (with or without the code).

Variants ending .exec will additionally execute the entire notebook (running all the cells and storing the output) before converting it.

Important

To use sphinx converters, IPyPublish must be installed with the sphinx extras:

pip install ipypublish[sphinx]

These are already included in the conda install.

4.3.1. A Note on PDF Conversion

The current nbconvert --to pdf does not correctly resolve references and citations (since it copies the files to a temporary directory). Therefore nbconvert is only used for the initial nbconvert --to latex phase, followed by using latexmk to create the pdf and correctly resolve everything. To convert your own notebook to PDF for the first time, a good route would be to use:

nbpublish -f latex_ipypublish_all -pdf -pbug -lb path/to/YourNotebook.ipynb
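
For comparison, the two-phase idea (latex first, then latexmk) can be reproduced by hand with the standard tools. The sketch below only builds the two command lines and does not run them. The helper name pdf_build_commands is hypothetical, and the flags that nbpublish itself passes to latexmk may differ from the plain -pdf used here:

```python
from pathlib import Path


def pdf_build_commands(notebook_path):
    """Return the commands of a latex-then-latexmk build (nothing is executed)."""
    notebook = Path(notebook_path)
    tex_file = notebook.with_suffix(".tex")
    return [
        # Phase 1: nbconvert writes the .tex file next to the notebook.
        ["jupyter", "nbconvert", "--to", "latex", str(notebook)],
        # Phase 2: latexmk reruns latex/bibtex until references are resolved.
        ["latexmk", "-pdf", str(tex_file)],
    ]


for command in pdf_build_commands("Example.ipynb"):
    print(" ".join(command))
```

Each command could then be run with subprocess.run(command, check=True). Plain nbconvert does not apply the ipypublish templates, so the result will not match the nbpublish output.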

4.4. The IPyPublish Defaults

The ipypublish ‘main’ converters are designed with the goal of creating a single notebook, which may contain lots of exploratory code/outputs, mixed with final output, and that can be output as both a document (latex/pdf or html) and a presentation (reveal.js). The logic behind the default output is then:

  • For documents: all headings and body text are generally required, but only a certain subset of code/output.

  • For slides: all headings are required, but most of the body text will be left out and substituted with ‘abbreviated’ versions, and only a certain subset of code/output is required.

This leads to the following logic flow (discussed further in the Metadata Tags section):

4.4.1. latex_ipypublish_main and html_ipypublish_main

  • all cells: skip any cell tagged “ignore” or “slideonly”

  • markdown cells: include all

  • code cells (input): only include if the “code” tag is present

  • code cells (output): only include if the following tags are present

    • “figure” for png/svg/pdf/jpeg or html (html only)

    • “table” or “equation” for latex or html (html only)

    • “mkdown” for markdown text

    • “text” for plain text
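
The rules above can be sketched as a small Python function. This is an illustration only, not the IPyPublish code: the name document_cell_parts, the flat set of tag strings and the output-kind labels are assumptions made for the example (in a real notebook the tags live in the ipub cell metadata, see the Metadata Tags section).

```python
# Output kinds unlocked by each tag, following the list above.
OUTPUT_TAGS = {
    "figure": {"image"},      # png/svg/pdf/jpeg
    "table": {"latex"},
    "equation": {"latex"},
    "mkdown": {"markdown"},
    "text": {"plain"},
}
# Tags that additionally unlock html output when the target is html.
HTML_TAGS = {"figure", "table", "equation"}


def document_cell_parts(cell_type, tags, html_target=False):
    """Return (include_source, allowed_output_kinds) for one cell."""
    tags = set(tags)
    if tags & {"ignore", "slideonly"}:
        return False, set()
    if cell_type == "markdown":
        return True, set()
    if cell_type != "code":
        # Assumption: other cell types (e.g. raw) are not covered above.
        return False, set()
    allowed = set()
    for tag in tags:
        allowed |= OUTPUT_TAGS.get(tag, set())
        if html_target and tag in HTML_TAGS:
            allowed.add("html")
    return "code" in tags, allowed


print(document_cell_parts("code", ["code", "table"], html_target=True))
```

A code cell with no tags therefore contributes nothing to the document, which is the ‘opt in’ behaviour described for latex_ipypublish_main.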

4.4.2. slides_ipypublish_main

  • all cells: skip any cell tagged “ignore”

  • markdown cells: are first split into header (beginning with #) and non-header components

    • headers: include all

    • non-headers: only include if “slide” tag

  • code cells (input): only include if the “code” tag is present

  • code cells (output): only include if the following tags are present

    • “figure” for png/svg/pdf/jpeg/html

    • “table” or “equation” for latex/html

    • “mkdown” for markdown text

    • “text” for plain text
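
The markdown handling for slides can be sketched in the same spirit. Again this is an illustration under stated assumptions rather than the IPyPublish code: the names split_markdown and slide_markdown are hypothetical, tags are modelled as a flat set of strings, and every line starting with # is treated as a header (so a # inside a fenced code block would be misclassified).

```python
def split_markdown(source):
    """Group consecutive lines into (is_header, text) runs."""
    parts = []
    for line in source.splitlines():
        is_header = line.startswith("#")
        if parts and parts[-1][0] == is_header:
            # Same kind as the previous line: extend the current run.
            parts[-1] = (is_header, parts[-1][1] + "\n" + line)
        else:
            parts.append((is_header, line))
    return parts


def slide_markdown(source, tags):
    """Return the markdown text a slide would keep for one cell."""
    tags = set(tags)
    if "ignore" in tags:
        return ""
    kept = [text for is_header, text in split_markdown(source)
            if is_header or "slide" in tags]
    return "\n".join(kept)


cell = "# Title\nbody text\n## Sub\nmore"
print(slide_markdown(cell, []))
print(slide_markdown(cell, ["slide"]))
```

Without the “slide” tag only the two header lines survive; with it the cell is kept in full.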

Packages such as pandas and matplotlib use the Jupyter notebook's rich representation mechanism to store a single output in multiple formats. nbconvert (and hence ipypublish) then selects only the highest-priority compatible format for output. This allows, for example, pandas DataFrames to be output as latex tables in latex documents and as html tables in html documents/slides.
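
The selection step can be pictured with the following sketch. The priority lists and the pick_format helper are illustrative assumptions, not the actual defaults used by nbconvert or ipypublish, and the output dictionary is a hand-written stand-in for a stored DataFrame output:

```python
def pick_format(mime_bundle, priority):
    """Return the first MIME type in `priority` that the output provides."""
    for mime in priority:
        if mime in mime_bundle:
            return mime
    return None


# Hypothetical priority orders for two target formats.
LATEX_PRIORITY = ["text/latex", "application/pdf", "image/png", "text/plain"]
HTML_PRIORITY = ["text/html", "image/png", "text/latex", "text/plain"]

# One output stored in several representations (contents are placeholders).
dataframe_output = {
    "text/plain": "   a\n0  1",
    "text/html": "<table>placeholder</table>",
    "text/latex": "\\begin{tabular}placeholder\\end{tabular}",
}

print(pick_format(dataframe_output, LATEX_PRIORITY))
print(pick_format(dataframe_output, HTML_PRIORITY))
```

The same stored output thus yields the latex representation for a latex document and the html representation for an html document or slide.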

4.5. Simple Customisation of Outputs

To customise the output of the above defaults, simply download one of:

Then alter the cell_defaults and nb_defaults sections, and run:

nbpublish -f path/to/new_config.json input.ipynb