Could you please give me some advice? I'm replacing the <chemform>
tags in my wiki, since they are no longer used. The strings are usually simple, like these:
<chemform>CH3COO-</chemform>
<chemform>Ba2+</chemform>
<chemform>H2CO3</chemform>
I need them to be replaced by these:
CH<sub>3</sub>COO<sup>-</sup>
Ba<sub>2</sub><sup>+</sup>
H<sub>2</sub>CO<sub>3</sub>
So far I have come up with this regexp for the RegExr tool:
match: <chemform\b[^>]*>(\D*?)([0-9]*)(\D*?)(\D*?)([0-9]*)(\D*?)([-+]*?)</chemform>
replace: $1<sub>$2</sub>$3$4<sub>$5</sub>$6<sup>$7</sup>
I know the pattern is horrible, but it has worked for me so far, except that it produces empty tags such as <sub></sub> or <sup></sup>:
<sub></sub>CH<sub>3</sub>COO<sup>-</sup>
<sub></sub>Ba<sub>2</sub><sup>+</sup>
H<sub>2</sub>CO<sub>3</sub><sup></sup>
How can I get rid of these without running a second search-and-replace pass? Thanks a lot!
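For comparison, here is a minimal sketch of one way to avoid the empty tags, written in Python rather than as a single RegExr expression. It assumes a scripted replacement with a callback is acceptable, that every run of digits should become a subscript, and that only a trailing run of + or - signs should become a superscript. The names CHEMFORM, to_html and convert are made up for illustration.

```python
import re

# Matches one <chemform ...>...</chemform> element and captures its contents.
CHEMFORM = re.compile(r"<chemform\b[^>]*>(.*?)</chemform>")

def to_html(match):
    """Convert the contents of one chemform element to sub/sup markup."""
    formula = match.group(1)
    # Wrap each run of digits in <sub>; nothing is emitted where no digits occur.
    formula = re.sub(r"(\d+)", r"<sub>\1</sub>", formula)
    # Wrap a trailing run of charge signs in <sup>, only if one is present.
    formula = re.sub(r"([-+]+)$", r"<sup>\1</sup>", formula)
    return formula

def convert(text):
    """Replace every chemform element in text with its converted contents."""
    return CHEMFORM.sub(to_html, text)

if __name__ == "__main__":
    for sample in ("CH3COO-", "Ba2+", "H2CO3"):
        print(convert(f"<chemform>{sample}</chemform>"))
```

Because the tags are only written where digits or charge signs are actually found, no empty <sub></sub> or <sup></sup> pairs appear, and the wiki text is still processed in a single pass.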