4. Notebook Conversion
4.1. Command Line Interface
The nbpublish script is a command line interface for initialising and parsing options to the conversion process.
To see all options for this script:
nbpublish -h
For example, to convert the Example.ipynb notebook directly to PDF:
nbpublish -pdf -lb example/notebooks/Example.ipynb
If a folder is input, then the .ipynb files it contains are processed and combined in ‘natural’ sorted order, i.e. 2_name.ipynb before 10_name.ipynb. By default, notebooks whose names begin with ‘_’ are ignored.
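The ordering rule above can be illustrated with a minimal sketch. This is not the IPyPublish implementation; the helper names natural_key and select_notebooks are hypothetical, and the sketch only assumes that ‘natural’ order means digit runs are compared as numbers.

```python
import re
from pathlib import Path


def natural_key(name):
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def select_notebooks(filenames, ignore_prefix="_"):
    """Keep .ipynb files that do not start with the prefix, in natural order."""
    kept = [
        name for name in filenames
        if Path(name).suffix == ".ipynb"
        and not Path(name).name.startswith(ignore_prefix)
    ]
    return sorted(kept, key=lambda name: natural_key(Path(name).name))


names = ["10_name.ipynb", "2_name.ipynb", "_draft.ipynb", "notes.txt"]
print(select_notebooks(names))
```

A plain alphabetical sort would place 10_name.ipynb first, because the character ‘1’ sorts before ‘2’; comparing the digit runs as integers avoids this.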
4.2. Python API
nbpublish parses the command line inputs and passes them to the ipypublish.convert.main.IpyPubMain class, which is a traitlets.config configuration object that controls the entire conversion process. For example, to list all configurable options:
from ipypublish.convert.main import IpyPubMain
IpyPubMain.class_print_help()
4.3. Built-in Export Configurations
All available converters are listed by nbpublish --list-exporters.
Some of note are:
- latex_ipypublish_main
the default; converts cells to LaTeX according to metadata tags on an ‘opt in’ basis. Note that, for this converter, no code cells or outputs will appear in the final tex/pdf document unless they have a suitable ipub metadata tag.
- sphinx_ipypublish_main
converts the entire notebook(s) to an RST file, in the format used by the Sphinx documentation generator.
- sphinx_ipypublish_main.run
The same as sphinx_ipypublish_main, but also creates a conf.py file and runs sphinx-build, to create HTML documentation (see Introduction).
- html_ipypublish_main
converts the entire notebook(s) to HTML and adds a table of contents sidebar and a button to toggle input code and output cells visible/hidden, with latex citations and references resolved.
- slides_ipypublish_main
converts the notebook to reveal.js slides, with latex citations and references resolved and slide partitioning by markdown headers. See the Live Slideshows section for using nbpresent to serve these slides to a web-browser.
The all and nocode variants of these converters pre-process a copy of the notebook, adding default metadata tags to the notebook and all cells, such that all output is rendered (with or without the code).
Variants ending .exec will additionally execute the entire notebook (running all the cells and storing the output) before converting it.
Important
To use sphinx converters, IPyPublish must be installed with the sphinx extras:
pip install ipypublish[sphinx]
These are already included in the conda install.
4.3.1. A Note on PDF Conversion
The current nbconvert --to pdf does not correctly resolve references and citations (since it copies the files to a temporary directory). Therefore nbconvert is only used for the initial nbconvert --to latex phase, followed by using latexmk to create the PDF and correctly resolve everything. To convert your own notebook to PDF for the first time, a good route would be to use:
nbpublish -f latex_ipypublish_all -pdf -pbug -lb path/to/YourNotebook.ipynb
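The two-phase route described above (notebook to LaTeX, then LaTeX to PDF with latexmk) can be sketched as a pair of commands. This is only an illustration of the idea, not what nbpublish runs internally: the function name pdf_build_commands is hypothetical, and the sketch assumes the .tex file is written next to the notebook.

```python
from pathlib import Path


def pdf_build_commands(notebook_path):
    """Return the two commands: notebook -> .tex, then .tex -> .pdf."""
    notebook = Path(notebook_path)
    # Assumption: the .tex file shares the notebook's folder and stem.
    tex_file = notebook.with_suffix(".tex")
    to_latex = ["jupyter", "nbconvert", "--to", "latex", str(notebook)]
    # latexmk reruns LaTeX and the bibliography tool until references settle;
    # -cd makes it work in the folder that contains the .tex file.
    to_pdf = ["latexmk", "-pdf", "-cd", str(tex_file)]
    return [to_latex, to_pdf]


for command in pdf_build_commands("path/to/YourNotebook.ipynb"):
    print(" ".join(command))
    # To run for real: subprocess.run(command, check=True)
```

Keeping the build in the notebook's own folder is what lets relative paths to bibliography and image files resolve, which is the problem the temporary-directory approach runs into.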
4.4. The IPyPublish Defaults
The ipypublish ‘main’ converters are designed around a single notebook that may contain lots of exploratory code/outputs mixed with final output, and that can be exported as both a document (latex/pdf or html) and a presentation (reveal.js). The logic behind the default output is then:
For documents: all headings and body text are generally required, but only a certain subset of code/output.
For slides: all headings are required, but most of the body text will be left out and substituted with ‘abbreviated’ versions, and only a certain subset of code/output is required.
This leads to the following logic flow (discussed further in the Metadata Tags section):
4.4.1. latex_ipypublish_main and html_ipypublish_main
all cells: bypass cells with the “ignore” or “slideonly” tags
markdown cells: include all
code cells (input): only include if the “code” tag is present
code cells (output): only include if one of the following tags is present
“figure” for png/svg/pdf/jpeg or html (html only)
“table” or “equation” for latex or html (html only)
“mkdown” for markdown text
“text” for plain text
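The document rules above can be sketched as a small filter. This is a simplified illustration and not the converter's code: it assumes the tags sit under the ipub key of the cell metadata, and the output_kinds field and the document_parts function are hypothetical stand-ins for the real notebook output structure.

```python
# Output kind -> tags that opt that kind of output in (from the list above).
OUTPUT_TAGS = {
    "image": {"figure"},             # png/svg/pdf/jpeg
    "latex": {"table", "equation"},
    "markdown": {"mkdown"},
    "text": {"text"},
}


def document_parts(cell):
    """Return the parts of a cell that the document defaults would keep."""
    # Assumption: tags live under metadata["ipub"]; the real schema may differ.
    tags = cell.get("metadata", {}).get("ipub", {})
    if tags.get("ignore") or tags.get("slideonly"):
        return []
    if cell["cell_type"] == "markdown":
        return ["markdown"]
    if cell["cell_type"] != "code":
        return []
    parts = ["input"] if "code" in tags else []
    for kind in cell.get("output_kinds", []):  # hypothetical, simplified field
        if OUTPUT_TAGS.get(kind, set()).intersection(tags):
            parts.append("output:" + kind)
    return parts


untagged = {"cell_type": "code", "metadata": {}, "output_kinds": ["image"]}
tagged = {
    "cell_type": "code",
    "metadata": {"ipub": {"code": True, "figure": {"caption": "A plot"}}},
    "output_kinds": ["image", "text"],
}
print(document_parts(untagged))
print(document_parts(tagged))
```

The untagged code cell contributes nothing, which is the ‘opt in’ behaviour: a code cell only reaches the document once it carries a matching tag.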
4.4.2. slides_ipypublish_main
all cells: bypass “ignore”
markdown cells: are first split into header (beginning with #)/non-header components
headers: include all
non-headers: only include if the “slide” tag is present
code cells (input): only include if the “code” tag is present
code cells (output): only include if one of the following tags is present
“figure” for png/svg/pdf/jpeg/html
“table” or “equation” for latex/html
“mkdown” for markdown text
“text” for plain text
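The markdown handling for slides can be sketched as follows. This is an assumption-laden illustration rather than the converter's implementation: split_markdown and slide_markdown are hypothetical names, and the sketch treats any line starting with # as a header, so a # inside a fenced code block would be misclassified.

```python
def split_markdown(source):
    """Split markdown text into ("header" | "body", text) chunks, in order."""
    chunks = []
    body = []
    for line in source.splitlines():
        if line.startswith("#"):
            # Flush any body text collected before this header.
            if body:
                chunks.append(("body", "\n".join(body)))
                body = []
            chunks.append(("header", line))
        else:
            body.append(line)
    if body:
        chunks.append(("body", "\n".join(body)))
    return chunks


def slide_markdown(source, tags):
    """Keep all headers; keep body text only when the "slide" tag is present."""
    return [text for kind, text in split_markdown(source)
            if kind == "header" or "slide" in tags]


cell_source = "# Results\nA long discussion of the results.\n## Summary"
print(slide_markdown(cell_source, {}))
print(slide_markdown(cell_source, {"slide": True}))
```

Splitting first and filtering second is what allows a single markdown cell to supply its heading to the slides while its body text stays in the document only.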
Packages such as pandas and matplotlib use the Jupyter notebook's rich representation mechanics to store a single output in multiple formats. nbconvert (and hence ipypublish) then selects only the highest priority (compatible) format to be output. This allows, for example, pandas DataFrames to be output as latex tables in latex documents and as html tables in html documents/slides.
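The selection step can be pictured with a short sketch. The priority lists and the select_format function below are hypothetical and chosen only for illustration; the real orderings are defined by each exporter.

```python
# Hypothetical priority orders; the real lists are set by each exporter.
LATEX_PRIORITY = ["text/latex", "application/pdf", "image/png", "text/plain"]
HTML_PRIORITY = ["text/html", "image/svg+xml", "image/png", "text/plain"]


def select_format(output_data, priority):
    """Return the first (mime type, content) pair available, by priority."""
    for mime_type in priority:
        if mime_type in output_data:
            return mime_type, output_data[mime_type]
    return None  # no compatible format stored for this output


# One output stored in several formats, as a DataFrame display might be.
dataframe_output = {
    "text/plain": "   a\n0  1",
    "text/html": "<table>...</table>",
    "text/latex": "\\begin{tabular}...\\end{tabular}",
}

print(select_format(dataframe_output, LATEX_PRIORITY)[0])
print(select_format(dataframe_output, HTML_PRIORITY)[0])
```

The same stored output yields a different representation depending on which priority list the target format supplies, without re-running the notebook.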
4.5. Simple Customisation of Outputs
To customise the output of the above defaults, simply download one of:
Then alter the cell_defaults and nb_defaults sections, and run:
nbpublish -f path/to/new_config.json input.ipynb