
I want to write a simple Markdown-to-LaTeX converter and chose sed as its core component. That worked for everything until now, when I hit the following problem: I want to convert a Markdown code block (delimited by three backticks) into a LaTeX listing, which means matching across multiple lines. I tried the following command, but it does not work because sed processes its input line by line:

sed -E 's/```([[:print:]]*)```/\\begin{lstlisting}\1\\end{lstlisting}/g'

Another idea would be to search for and replace only the three backticks, but since every second occurrence needs to be replaced with \end{lstlisting} instead of \begin{lstlisting}, I do not know whether that is possible. A hacky way would be to use three backticks for the start of the code block and four for the end, but that is quite a dirty solution in my opinion.

Sebastian Dine
  • Does this answer your question? [Match a string that contains a newline using sed](https://stackoverflow.com/questions/23850789/match-a-string-that-contains-a-newline-using-sed) – MayeulC Jun 30 '22 at 08:34
  • Firstly, that question has been asked many times: https://stackoverflow.com/questions/1251999/ and https://stackoverflow.com/questions/23850789 to name two. Secondly, unless this is something you intend to throw away after use, I'd direct you to writing a proper parser (or using an existing one as a library): tokenization, AST, etc. Lastly, I believe `pandoc` does most of what you seem to want to achieve. – MayeulC Jun 30 '22 at 08:37

1 Answer


This might work for you (GNU sed):

sed -E '/^```/{:a;N;/\n```$/!ba
        s/^```(.*)```$/\\begin{lstlisting}\1\\end{lstlisting}/}' file

On encountering ``` at the beginning of a line, gather up all lines until another line consisting of ```, then replace the opening and closing backticks with \begin{lstlisting} and \end{lstlisting}, keeping the lines in between.
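To try the command above on a small sample, something like the following sketch could be used. It assumes GNU sed; the variable names `tick`, `fence` and `result` and the sample text are made up for illustration. The fence is written as `{3} in the script, which under -E should match the same three backticks as the literal form, and is used here only so that the snippet contains no literal triple backtick.

```shell
# Build the three-backtick fence from single backticks (illustrative helper variables).
tick='`'
fence="$tick$tick$tick"

# Hypothetical sample: a paragraph, a fenced code block, another paragraph.
# Same logic as the command above, reading from stdin instead of a file (GNU sed assumed).
result=$(printf '%s\n' 'Some text' "$fence" 'echo hello' "$fence" 'More text' |
  sed -E '/^`{3}/{:a;N;/\n`{3}$/!ba
          s/^`{3}(.*)`{3}$/\\begin{lstlisting}\1\\end{lstlisting}/}')

printf '%s\n' "$result"
# Prints:
# Some text
# \begin{lstlisting}
# echo hello
# \end{lstlisting}
# More text
```

Note that anything following the opening backticks on the same line (for example a language name) is captured by (.*) as well, so it ends up directly after \begin{lstlisting}.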

potong