2

I have a bunch of Jupyter Notebooks with equations written in LaTeX. I know that I can convert a notebook to HTML as follows.

jupyter nbconvert --to html --template basic test.ipynb --output test.html

However, the LaTeX markup is preserved. For example, if I have $y = w'x$, it still shows up as-is in the output HTML. I want to get this generated HTML into WordPress (basically, copy/paste), but WordPress delimits LaTeX like this: $latex y = w'x$.

How can I use perl or sed (or anything else) to convert $y = w'x$ to $latex y = w'x$?

I know I can just write a program to do it, but I think that's overkill, because I am sure the available command-line tools can do it too. A tool that is available on Windows as well as Mac/Linux would be a bonus, since I work on both types of environment and would rather not have to resort to a *nix-like environment for this conversion (though Windows does have the Windows Subsystem for Linux now, so a Linux-only tool might be OK).

I tried to modify the sed expression from this post (on a Mac), but it did not work; the command and the error it produces are:

sed -e ' /\$\$/{s/\$\$/{\$latex }/;:a;N;/\$\$/!ba;s/\$\$/{\$}/};s/^\(\$\)\(.*\)\(\$\)$/{\$latex }\2{\$}/' test.html

unexpected EOF (pending }'s)

simbabque
Jane Wayne

2 Answers

1

You can use the following sed command:

sed -e 's/\$\([^$]*\)\$/$\\latex \1$/g'

INPUT:

$ echo -e "abc \$y = w'x$ toto\n123 \$u = v'w$ xyz"
abc $y = w'x$ toto
123 $u = v'w$ xyz

OUTPUT:

echo -e "abc \$y = w'x$ toto\n123 \$u = v'w$ xyz" | sed -e 's/\$\([^$]*\)\$/$\\latex \1$/g'
abc $\latex y = w'x$ toto
123 $\latex u = v'w$ xyz

Explanations:

This uses sed in substitute (find and replace) mode. The regex \$\([^$]*\)\$ matches everything between two $ characters and captures it in group 1; the replacement writes the captured text back (the backreference \1) and adds \latex right after the opening $.
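The choice of [^$]* over .* matters once a line holds more than one equation. A small comparison, as a sketch only (the function names per_equation and greedy and the sample line are made up for illustration):

```shell
# [^$]* cannot cross a $, so every $...$ pair is matched on its own.
per_equation() { sed -e 's/\$\([^$]*\)\$/$\\latex \1$/g'; }

# .* is greedy: it runs from the first $ to the last $ on the line,
# so only the first delimiter gets rewritten.
greedy() { sed -e 's/\$\(.*\)\$/$\\latex \1$/g'; }

line='abc $y$ and $u$ xyz'
printf '%s\n' "$line" | per_equation   # abc $\latex y$ and $\latex u$ xyz
printf '%s\n' "$line" | greedy         # abc $\latex y$ and $u$ xyz
```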

Finally, to get $latex without the backslash, which is what WordPress expects, use sed -e 's/\$\([^$]*\)\$/$latex \1$/g' instead.
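As a rough sketch of how the $latex variant could be run against the converted notebook, the helper below wraps the command so it accepts either a file name or piped input. The function name to_wp_latex and the output file name test_wp.html are hypothetical. It assumes every equation opens and closes on the same line (sed works line by line) and that the HTML contains no $$...$$ display math or stray literal dollar signs; those cases would need extra handling.

```shell
# Hypothetical helper: rewrite $...$ as $latex ...$ (the WordPress delimiter).
# Assumes single-line equations and no $$...$$ blocks in the input.
to_wp_latex() {
  sed -e 's/\$\([^$]*\)\$/$latex \1$/g' "$@"
}

# With the file from the question (test_wp.html is an assumed output name):
#   to_wp_latex test.html > test_wp.html

# Quick check on a made-up HTML fragment:
printf '%s\n' "<p>Let \$y = w'x\$ and \$u = v'w\$.</p>" | to_wp_latex
# <p>Let $latex y = w'x$ and $latex u = v'w$.</p>
```

Writing to a separate file rather than editing in place sidesteps the differences between the -i option of GNU sed and the BSD sed that ships with macOS.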

Allan
  • Cool. That worked! Could you modify it to output `$latex` instead of `$\latex`, since the former is what WordPress expects and other users won't be confused? I'll accept this answer as it works (apart from the `$\latex` output). I've already modified it in my test here and it works; the answer just needs the edit. – Jane Wayne May 10 '18 at 03:22
0

I am not sure what the whole format of your file might be. If I take your example at face value and assume that the first $ on any line is in column 1 then

sed -e's/^\$/$\\latex /'

works for me. But if the string to convert is anywhere in the file then things get harder.
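To illustrate that limitation, here is a small comparison as a sketch; the function name anchored and both sample lines are made up. The ^ anchor means the substitution only fires when $ is the very first character of a line, so an equation sitting inside an HTML tag is left alone.

```shell
# Anchored pattern from above: only a $ in column 1 is rewritten.
anchored() { sed -e 's/^\$/$\\latex /'; }

printf '%s\n' '$a + b$' | anchored          # $\latex a + b$
printf '%s\n' '<p>$a + b$</p>' | anchored   # <p>$a + b$</p>  (unchanged)
```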

7 Reeds
  • That does not seem to work; I don't observe any of the delimiters being swapped out. And yes, there will be many equations delimited by `$` pairs throughout the generated HTML. Again, to be clear, I create a Jupyter Notebook `ipynb` file (which is `JSON`), and then use the `jupyter nbconvert` tool to create an `HTML` file. The file I would be operating on is an HTML file. – Jane Wayne May 10 '18 at 03:15
  • Well, I'm not that familiar with Jupyter Notebook. Do all equations start with `$`? If so, you might try `sed -e's/\$/$\\latex /'` instead. – 7 Reeds May 10 '18 at 03:22