
I have a problem when I use Python to generate PDF files from Markdown. My goal is to convert my documentation to PDF. To do that, I already have a shell command which looks like this:

markdown <markdown filename>.md | htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain --charset utf-8 --format pdf14 - > <pdf filename>.pdf

To use it you need to install markdown and htmldoc:

sudo apt-get update
sudo apt-get install markdown
sudo apt-get install htmldoc

Now I want to automate the generation. I want to use Python 3.6 with the standard-library subprocess module, so here is the code:

import subprocess
import os
import sys
import getopt
import shutil


def list_markdown_file(path):
    # List all Markdown files in a directory (names without the .md extension),
    # skipping README.md.
    # param path = path to the target directory

    list_of_file = []
    for file in os.listdir(path):
        if file.endswith(".md") and not file == 'README.md':
            list_of_file.append(os.path.splitext(file)[0])
    return list_of_file


def generate_pdf(path, list_file):
    destination_dir = "pdf"
    if os.path.isdir(os.path.join(path, destination_dir)):
        shutil.rmtree(os.path.join(path, destination_dir))
    os.mkdir(os.path.join(path, destination_dir))

    for filename in list_file:
        subprocess.run(["markdown", filename+".md", "|", "htmldoc", "--cont",
                        "--headfootsize", "8.0", "--linkcolor", "blue", "--linkstyle",
                        "plain", "--charset", "utf-8", "--format", "pdf14", "-", ">",
                        os.path.join(path, destination_dir, filename+".pdf")], encoding='utf-8', stdout=subprocess.PIPE)


def main(argv):
    try:
        opts, args = getopt.getopt(argv, "hp:", ["help", "path="])
    except getopt.GetoptError:
        print('python generate_pdf.py -p <path_to_directory>')
        sys.exit(2)
    path_to_file = ''
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            print('python generate_pdf.py -p <path_to_directory>')
            sys.exit()
        elif opt in ("-p", "--path"):
            path_to_file = arg
    if not opts:
        print('python generate_pdf.py -h to see how it works')
        sys.exit(2)
    list_of_file = list_markdown_file(path=path_to_file)
    generate_pdf(path=path_to_file, list_file=list_of_file)


if __name__ == '__main__':
    main(sys.argv[1:])

The problem is located in this part:

for filename in list_file:
    subprocess.run(["markdown", filename+".md", "|", "htmldoc", "--cont",
                    "--headfootsize", "8.0", "--linkcolor", "blue", "--linkstyle",
                    "plain", "--charset", "utf-8", "--format", "pdf14", "-", ">",
                    os.path.join(path, destination_dir, filename+".pdf")], encoding='utf-8', stdout=subprocess.PIPE)

When I do that only the part with markdown filename.md is run. Why is that? What can I do to fix that?

tripleee
Brice Harismendy
  • You're using pipes in your commands. Per [this](https://stackoverflow.com/questions/13332268/how-to-use-subprocess-command-with-pipes) answer you need to specify `shell=True` in the `subprocess.run(...)` call. – Andrew F Feb 18 '19 at 10:49
  • I tried, but now it gets stuck and does nothing (no error ...) – Brice Harismendy Feb 18 '19 at 10:55

2 Answers

0

You can convert a Markdown file to PDF using a Python module named Markdown2PDF. Install it for Python 3 with sudo pip3 install Markdown2PDF, then open a terminal and run md2pdf <file_name>, for example md2pdf test.md.

0

subprocess without shell=True runs a single subprocess. If you want to run a complete pipeline, you need to use shell=True or run each process separately.

Trivially but unattractively with shell=True:

for filename in list_file:
    # Switch run([list, of, things]) to run("string of things", shell=True)
    subprocess.run("""markdown '{0}.md' |
        htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain \\
            --charset utf-8 --format pdf14 - >'{1}'""".format(
            filename, os.path.join(path, destination_dir, filename+".pdf")),
        shell=True)
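
As a side note on the hang mentioned in the comments: a likely cause, assuming shell=True was simply added to the original list form, is that on POSIX only the first list item is used as the shell command, while the remaining items become positional parameters of the shell itself. The shell would then run a bare markdown, which waits on standard input. The sketch below shows the same effect with echo as a harmless stand-in (POSIX only):

```python
import subprocess

# POSIX only: with shell=True and a list, only the first item is the command
# text; "hello" becomes $0 of the shell and is never passed to echo.
as_list = subprocess.run(["echo", "hello"], shell=True,
                         stdout=subprocess.PIPE, universal_newlines=True)

# With a single string, the shell parses the whole command line as expected.
as_string = subprocess.run("echo hello", shell=True,
                           stdout=subprocess.PIPE, universal_newlines=True)

print(repr(as_list.stdout))    # '\n'
print(repr(as_string.stdout))  # 'hello\n'
```

This is why the shell=True variants here pass one string rather than a list.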

Perhaps slightly more elegantly, letting Python open the output file:

for filename in list_file:
    # Open the destination for writing in binary mode; the default mode is read-only
    with open(os.path.join(path, destination_dir, filename+".pdf"), "wb") as dest:
        subprocess.run("""markdown '{0}.md' |
            htmldoc --cont --headfootsize 8.0 --linkcolor blue --linkstyle plain \\
                --charset utf-8 --format pdf14 -""".format(filename),
            shell=True, stdout=dest, universal_newlines=True, check=True)
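
The hand-written single quotes in the command strings above break if a file name itself contains a single quote. One possible hardening, sketched here with a hypothetical helper build_pipeline and made-up file names, is to let shlex.quote do the quoting:

```python
import os
import shlex

# Options copied from the original shell command.
HTMLDOC_OPTIONS = ("--cont --headfootsize 8.0 --linkcolor blue "
                   "--linkstyle plain --charset utf-8 --format pdf14 -")


def build_pipeline(source_md, dest_pdf):
    """Return a shell command string with both paths safely quoted."""
    return "markdown {} | htmldoc {} > {}".format(
        shlex.quote(source_md), HTMLDOC_OPTIONS, shlex.quote(dest_pdf))


# Hypothetical file names, chosen to contain a space.
command = build_pipeline("release notes.md",
                         os.path.join("pdf", "release notes.pdf"))
print(command)
# The string would then be passed to subprocess.run(command, shell=True, check=True)
```

The helper only builds the string, so it can be inspected or logged before anything is executed.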

You could also get rid of shell=True and run two separate processes; see Replacing shell pipeline in the subprocess documentation.
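
A minimal sketch of that two-process approach follows. The helper names run_pipeline and markdown_to_pdf are made up for illustration, and the htmldoc options are copied from the question; treat it as one possible arrangement rather than the definitive one.

```python
import subprocess

# Options copied from the original shell command.
HTMLDOC_CMD = ["htmldoc", "--cont", "--headfootsize", "8.0",
               "--linkcolor", "blue", "--linkstyle", "plain",
               "--charset", "utf-8", "--format", "pdf14", "-"]


def run_pipeline(producer_cmd, consumer_cmd, dest_path):
    """Run the equivalent of `producer | consumer > dest_path` without a shell."""
    with open(dest_path, "wb") as dest:
        producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
        consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                    stdout=dest)
        # Close our copy of the pipe so the producer receives SIGPIPE
        # if the consumer exits early.
        producer.stdout.close()
        consumer_rc = consumer.wait()
        producer_rc = producer.wait()
    if producer_rc != 0:
        raise subprocess.CalledProcessError(producer_rc, producer_cmd)
    if consumer_rc != 0:
        raise subprocess.CalledProcessError(consumer_rc, consumer_cmd)


def markdown_to_pdf(md_path, pdf_path):
    """Convert one Markdown file to PDF (assumes markdown and htmldoc are installed)."""
    run_pipeline(["markdown", md_path], HTMLDOC_CMD, pdf_path)
```

Inside the loop this would be called as markdown_to_pdf(filename + ".md", os.path.join(path, destination_dir, filename + ".pdf")). Because each argument is a separate list item, no shell quoting is needed.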

Just to make this explicit, subprocess.run(['foo', 'bar', '|', 'baz']) runs the program foo with the arguments bar, |, and baz; not two processes where the second is baz and the standard input of the second is connected to the standard output of the first, which is what the shell does when you run a pipeline.
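
This can be checked directly. In the sketch below, the Python interpreter itself stands in for foo (an assumption made only so the example runs anywhere) and simply reports the arguments it was given:

```python
import subprocess
import sys

# A tiny child program that just reports the arguments it received.
show_args = "import sys; print(sys.argv[1:])"

result = subprocess.run(
    [sys.executable, "-c", show_args, "bar", "|", "baz"],
    stdout=subprocess.PIPE, universal_newlines=True)

print(result.stdout.strip())  # ['bar', '|', 'baz']
```

The "|" arrives in the child as an ordinary string; nothing interprets it as a pipe.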

tripleee