Converting Markdown to PDF

For one reason or another, I had the hankering to convert a few documents that I’d written in Markdown to PDF. I couldn’t find a way to do it directly, but I managed to come up with an interesting and (at least for me) useful little hack.

The key to this conversion is HTMLDOC, a tool that converts HTML files to PDF. I’ve used HTMLDOC for years, and it’s been a useful part of my toolkit. I’ve used this application (available through Synaptic, by the way) not only to convert single HTML files to PDF but also to combine multiple files into PDF books. While HTMLDOC has a few weaknesses, it’s more than good enough for my purposes.

So, to convert my Markdown files to PDF, I had to run them through the Markdown script to get HTML. Then, I ran the file through HTMLDOC using the following command:

htmldoc myFile.html > myFile.pdf

The results were OK, but I wanted to tweak a few things like the size of the header and footer fonts and the appearance of any links in a file (which work in the PDF, by the way). It’s been a while since I’ve used HTMLDOC at the command line, so I had to turn to the documentation for help with the options.

In the end, I came up with the following command string:

htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain --format pdf14 myFile.html > myFile.pdf

What all that means is:

  • There will be no table of contents generated (--cont, short for --continuous)
  • The header and footer fonts will be 8 points (--headfootsize 8.0)
  • Links will be blue without an underline (--linkcolor blue --linkstyle plain)
  • The resulting PDF will be compatible with Acrobat 5.0 (--format pdf14)
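
For reference, here is the same command wrapped in a small shell function with one option per line, which makes it easier to tweak a single setting. Treat it as a sketch only: the function name html2pdf is made up for illustration, and it assumes that htmldoc is on the PATH and that the input file name ends in .html.

```shell
# html2pdf (hypothetical name): convert one HTML file to PDF using the
# options listed above.
#   --cont               no table of contents
#   --headfootsize 8.0   8-point header and footer fonts
#   --linkcolor blue     blue links
#   --linkstyle plain    links without an underline
#   --format pdf14       PDF 1.4 output (Acrobat 5.0)
html2pdf() {
    htmldoc --cont \
            --headfootsize 8.0 \
            --linkcolor blue \
            --linkstyle plain \
            --format pdf14 \
            "$1" > "${1%.html}.pdf"   # myFile.html -> myFile.pdf
}
```

With the function loaded into the shell, running html2pdf myFile.html should write myFile.pdf next to the HTML file.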

Of course, that and the command to run Markdown are a lot to type. So, I just wrapped all of the commands in a script like this:

markdown "$1.text" > "$1.html"; htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain --format pdf14 "$1.html" > "$1.pdf"; rm "$1.html"

By specifying $1, I can type the name of the Markdown file without its extension, and that name will be propagated throughout the conversion. At the end, the HTML file gets deleted because I don’t need it any more.
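
If the script gets used a lot, a slightly more defensive version might be worth having. The sketch below is one possible take, not a drop-in replacement: the function name md2pdf, the usage message, and the return codes are my own assumptions, and it still expects markdown and htmldoc to be on the PATH and the source file to end in .text.

```shell
# md2pdf (hypothetical name): Markdown -> HTML -> PDF with basic checks.
md2pdf() {
    # Expect exactly one argument: the file name without its extension.
    if [ "$#" -ne 1 ]; then
        echo "usage: md2pdf basename" >&2
        return 2
    fi
    # Stop early if the Markdown source is missing.
    if [ ! -f "$1.text" ]; then
        echo "md2pdf: $1.text not found" >&2
        return 1
    fi
    # Step 1: Markdown to HTML. Clean up and stop if this fails.
    markdown "$1.text" > "$1.html" || { rm -f "$1.html"; return 1; }
    # Step 2: HTML to PDF, with the same options as the one-liner.
    htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain \
            --format pdf14 "$1.html" > "$1.pdf"
    rc=$?
    # Remove the intermediate HTML file whether or not HTMLDOC succeeded.
    rm -f "$1.html"
    return "$rc"
}
```

Calling md2pdf myFile behaves like the one-line script, except that a missing argument or a missing myFile.text produces an error message instead of a stray empty file.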

Do you have another way of doing this? If so, leave a comment.

This work, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

  • Joe

    No other way to do it, except using pandoc [1] (see markdown2pdf), which I think is in Ubuntu. You can simplify your script a little by removing the intermediate step of writing to an HTML file: pipe markdown straight into htmldoc using the “-” argument for htmldoc.

    markdown $1.text | htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain --format pdf14 - > $1.pdf

  • scott

    @Joe,

    Thanks for the comment. I’ve been playing with Pandoc for a while (another project), and might just use that instead of my HTMLDOC script. Although, it’s nice to have different tools handy …

  • http://www.crunkey.com Bryan

    You could use Maruku, a Ruby library, passing it the --pdf option.

  • Fredrik Wallgren

    You can use https://github.com/walle/gimli to do it directly.
