I'm trying to use regex to clean up some code generated by my own HTML5 RTE. Searching around, I see a lot of people saying regex should not be used to parse HTML, but I'm doing this client-side with JavaScript. Do I have any option other than regex?
I have been trying to use lookbehinds (I just found out about them), but they don't seem to work in JavaScript. What I want to do is delete all <br> elements at the very end of a <p>, but not those that are the only element in the paragraph, like <p><br></p>. So:
<p>Blah<br><br><br></p> becomes <p>Blah</p>
<p><br></p> stays the same.
So far I only have:
html = html.replace(/(?:<br\s?\/?>)+(<\/p>)/g, '$1');
This deletes every <br> at the end of a paragraph, no matter how many, including the one in <p><br></p>, which I want to keep.
I would like something like:
html = html.replace(/(?<!<p>)(?:<br\s?\/?>)+(<\/p>)/g, '$1');
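One way to get the effect of that lookbehind without engine support might be to capture the opening tag optionally and decide in a replacement function. This is only a sketch under assumptions: the helper name stripTrailingBreaks is made up for illustration, and it assumes a paragraph holding nothing but <br> elements (one or several) should be left untouched.

```javascript
// Hypothetical helper; the name is not from the original post.
function stripTrailingBreaks(html) {
  // Group 1 optionally captures an opening <p ...> that sits directly
  // before the run of <br>s. If it took part in the match, the paragraph
  // contains only <br>s, so the match is returned unchanged.
  return html.replace(
    /(<p(?:\s[^>]*)?>)?(?:<br\s?\/?>)+(<\/p>)/g,
    function (match, openTag, closeTag) {
      return openTag ? match : closeTag;
    }
  );
}

console.log(stripTrailingBreaks('<p>Blah<br><br><br></p>')); // <p>Blah</p>
console.log(stripTrailingBreaks('<p><br></p>'));             // <p><br></p>
```

The optional group stands in for the negative lookbehind: when the engine can match <p> immediately before the <br> run it does, and the callback uses that as the signal to keep the paragraph as it is.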
EDIT: I'm using a contenteditable div to create a very simple RTE that is dynamically created every time a user wants to change some text. The cleanup is basically just clearing redundant span, br, and p tags, and such.