17

I'm using Sphinx to document a Python project, and I would like to format my docstrings with Markdown. Even with the recommonmark extension, only the manually written .md files are covered, not the docstrings.

I use autodoc, napoleon and recommonmark in my extensions.

How can I make Sphinx parse Markdown in my docstrings?

bad_coder
Florentin Hennecker
  • A [search of the docs](http://www.sphinx-doc.org/en/master/search.html?q=markdown) returns [this first result](https://www.sphinx-doc.org/en/master/usage/markdown.html?highlight=markdown). – Steve Piercy May 09 '19 at 16:05
  • 3
    Yes, and that result talks about `recommonmark`, which only covers the use case of writing manual documentation in markdown. That extension doesn't make sphinx parse your docstrings as markdown. I edited my question to make that clear – Florentin Hennecker May 09 '19 at 18:17
  • 1
    You're not the first to ask [this question](https://stackoverflow.com/q/49864260/712526), but unfortunately, there's no answer there, either. – jpaugh May 09 '19 at 18:25

4 Answers

26

Sphinx's Autodoc extension emits an event named `autodoc-process-docstring` every time it processes a doc-string. We can hook into that mechanism to convert the syntax from Markdown to reStructuredText.

Unfortunately, Recommonmark does not expose a Markdown-to-reST converter. It maps the parsed Markdown directly to a Docutils object, i.e., the same representation that Sphinx itself creates internally from reStructuredText.

Instead, I use Commonmark for the conversion in my projects because it's fast, much faster than Pandoc, for example. Speed is important as the conversion happens on the fly and handles each doc-string individually. Other than that, any Markdown-to-reST converter would do; M2R2 would be a third example. The downside of all of these is that they do not support Recommonmark's syntax extensions, such as cross-references to other parts of the documentation. They handle just the basic Markdown.

To plug in the Commonmark doc-string converter, make sure that package is installed (`pip install commonmark`) and add the following to Sphinx's configuration file `conf.py`:

import commonmark

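def docstring(app, what, name, obj, options, lines):
    # Join the doc-string lines and parse them as Markdown.
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    # Render the parsed tree as reStructuredText.
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    # Replace the content in place: Sphinx reads back the same list object.
    lines.clear()
    lines += rst.splitlines()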

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
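
The handler above works because it changes the `lines` list in place, which is how Autodoc picks up the result. Here is a minimal sketch of that pattern without any Markdown library. The names `make_handler` and `shout` are made up for this illustration, and `shout` is only a stand-in for a real converter such as the Commonmark one:

```python
from typing import Callable, List


def make_handler(convert: Callable[[str], str]):
    """Build an autodoc-process-docstring style handler around a text converter."""
    def handler(app, what, name, obj, options, lines: List[str]) -> None:
        converted = convert("\n".join(lines))
        # Slice assignment mutates the list that was passed in. A plain
        # `lines = ...` would only rebind the local name and change nothing.
        lines[:] = converted.splitlines()
    return handler


def shout(text: str) -> str:
    """Stand-in converter used only for this demonstration."""
    return text.upper()


docstring_lines = ["Some *Markdown* text.", "", "Second paragraph."]
same_list = docstring_lines
# Call the handler the way Sphinx would, with dummy values for unused arguments.
make_handler(shout)(None, "function", "pkg.func", None, None, docstring_lines)
print(docstring_lines)
```

In `conf.py`, such a handler would be registered the same way as above, via `app.connect('autodoc-process-docstring', ...)`.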

Meanwhile, Recommonmark was deprecated in May 2021. The Sphinx extension MyST, a more feature-rich Markdown parser, is the replacement recommended by Sphinx and by Read-the-Docs. With MyST, one could use the same "hack" as above to get limited Markdown support. In February 2023, however, the extension Sphinx-Autodoc2 was published, which promises full (MyST-flavored) Markdown support in doc-strings, including cross-references.

A possible alternative to the approach outlined here is using MkDocs with the MkDocStrings plug-in, which would eliminate Sphinx and reStructuredText entirely from the process.

john-hen
  • Nice solution thanks! (I would use `lines[:] = rst.splitlines()`) – Roi Snir Jul 13 '20 at 21:01
  • Surprisingly to me, commonmark parses rst-formatted comments kinda OK, but still makes some errors. So I made a simple change to this for projects that have a mix of rst and MD comments. It only parses the docstring as markdown if the docstring starts out with a special indicator line. (I've got a large project wherein I really wanted to write some long stuff in markdown but didn't want to go convert all docstrings to markdown) https://gist.github.com/0f555c8f9c129c0ac6fed6fabe49078b#file-docstrings-py – Dustin Wyatt May 25 '21 at 20:07
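
The opt-in idea from the last comment could be sketched roughly as follows. This is not the code from the linked gist: the marker string `#!markdown`, the function `convert_if_marked`, and the stand-in converter `fake_convert` are all assumptions made for this illustration.

```python
from typing import Callable, List

MARKER = "#!markdown"  # hypothetical indicator line, not taken from the gist


def convert_if_marked(lines: List[str], convert: Callable[[str], str]) -> bool:
    """Convert `lines` in place only when the first non-blank line is MARKER."""
    first = next((i for i, text in enumerate(lines) if text.strip()), None)
    if first is None or lines[first].strip() != MARKER:
        return False  # leave reST doc-strings untouched
    body = "\n".join(lines[first + 1:])
    lines[:] = convert(body).splitlines()
    return True


def fake_convert(text: str) -> str:
    """Stand-in for a Markdown-to-reST converter, for demonstration only."""
    return text.replace("`", "``")


marked = ["#!markdown", "Use `foo` here."]
plain = ["Use :py:func:`foo` here."]
marked_changed = convert_if_marked(marked, fake_convert)
plain_changed = convert_if_marked(plain, fake_convert)
print(marked, plain)
```

In a real setup, `convert` would be the Commonmark parse-and-render step from the answer above, and the function would be called from the `autodoc-process-docstring` handler.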
0

Building on @john-hennig's answer, the following keeps reStructuredText roles such as `:py:attr:` and `:py:class:` intact, so you can still cross-reference other classes, attributes, and so on.

import re
import commonmark

py_attr_re = re.compile(r"\:py\:\w+\:(``[^:`]+``)")

def docstring(app, what, name, obj, options, lines):
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    lines.clear()
    lines += rst.splitlines()

    for i, line in enumerate(lines):
        while True:
            match = py_attr_re.search(line)
            if match is None:
                break

            start, end = match.span(1)
            line_start = line[:start]
            line_end = line[end:]
            line_modify = line[start:end]
            line = line_start + line_modify[1:-1] + line_end
        lines[i] = line

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
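
To see what the post-processing loop does, here is a small standalone sketch of the same idea using `re.sub`. It assumes, as the pattern above implies, that the renderer turns a Markdown code span into a double-backtick literal, so a role written as ``:py:class:`Foo` `` comes out with doubled backticks. The name `restore_roles` and the sample text are made up for this illustration.

```python
import re

# A :py:<role>: prefix followed by a double-backtick literal.
# Group 1 is the role prefix, group 2 the text inside the backticks.
ROLE_LITERAL = re.compile(r"(:py:\w+:)``([^:`]+)``")


def restore_roles(line: str) -> str:
    """Turn :py:role:``target`` back into :py:role:`target`."""
    return ROLE_LITERAL.sub(r"\1`\2`", line)


before = "See :py:class:``mypkg.Widget`` and plain ``code``."
after = restore_roles(before)
print(after)
```

Only literals that directly follow a `:py:...:` role are rewritten; ordinary inline code keeps its double backticks.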
driedler
0

I had to extend the accepted answer by john-hen to allow multi-line descriptions of Args: entries to be considered a single parameter:

import commonmark

def docstring(app, what, name, obj, options, lines):
  wrapped = []
  literal = False
  for line in lines:
    if line.strip().startswith(r'```'):
      literal = not literal
    if not literal:
      line = ' '.join(x.rstrip() for x in line.split('\n'))
    indent = len(line) - len(line.lstrip())
    if indent and not literal:
      wrapped.append(' ' + line.lstrip())
    else:
      wrapped.append('\n' + line.strip())
  ast = commonmark.Parser().parse(''.join(wrapped))
  rst = commonmark.ReStructuredTextRenderer().render(ast)
  lines.clear()
  lines += rst.splitlines()

def setup(app):
  app.connect('autodoc-process-docstring', docstring)
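
The joining step can be tried on its own, without Commonmark. The sketch below isolates it in a hypothetical helper `join_continuations` and keeps the indentation of lines inside fenced blocks, which is a small deviation from the code above. The sample input is made up for this illustration.

````python
from typing import List


def join_continuations(lines: List[str]) -> str:
    """Fold indented continuation lines into the preceding line."""
    wrapped = []
    literal = False
    for line in lines:
        # A ``` fence toggles literal mode; fenced content is not folded.
        if line.strip().startswith("```"):
            literal = not literal
        indent = len(line) - len(line.lstrip())
        if indent and not literal:
            wrapped.append(" " + line.strip())  # continuation of previous line
        elif literal:
            wrapped.append("\n" + line)  # keep indentation inside the fence
        else:
            wrapped.append("\n" + line.strip())
    return "".join(wrapped)


sample = [
    ":param x: A description that",
    "    continues on a second line.",
    "",
    "```",
    "    indented code",
    "```",
]
joined = join_continuations(sample)
print(joined)
````

The two description lines end up on a single line, while the fenced block stays as written.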
danijar
  • Since the "edit queue is full", in my implementation I changed the line `wrapped.append('\n' + line.strip())` to: `if literal: wrapped.append('\n' + line) else: wrapped.append('\n' + line.strip())`. This ensures indentation of literal blocks is preserved. – Rick Jun 29 '22 at 10:28
0

The current answer by @john-hennig is great, but it seems to fail for multi-line `Args:` descriptions. Here is my fix:


import commonmark


def docstring(app, what, name, obj, options, lines):
    md = "\n".join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)

    lines.clear()
    lines += _normalize_docstring_lines(rst.splitlines())


def _normalize_docstring_lines(lines: list[str]) -> list[str]:
    """Fix an issue with multi-line args which are incorrectly parsed.

    ```
    Args:
        x: My multi-line description which fit on multiple lines
          and continue in this line.
    ```

    Is parsed as (missing indentation):

    ```
    :param x: My multi-line description which fit on multiple lines
    and continue in this line.
    ```

    Instead of:

    ```
    :param x: My multi-line description which fit on multiple lines
        and continue in this line.
    ```

    """
    is_param_field = False

    new_lines = []
    for line in lines:
        if line.lstrip().startswith(":param"):
            is_param_field = True
        elif is_param_field:
            if not line.strip():  # A blank line ends the param field
                is_param_field = False
            else:  # Restore indentation
                line = "    " + line.lstrip()
        new_lines.append(line)
    return new_lines


def setup(app):
    app.connect("autodoc-process-docstring", docstring)
Conchylicultor
  • 1
    I understand the desire to have Google-style doc-strings. Apparently many people are used to that style. (Your answer is the second on this page focusing on that.) But it should be noted that this is not Markdown. What you do there is not a fix, it's a custom extension of the Markdown syntax. – john-hen Jan 05 '22 at 22:29