I have a textarea containing some Markdown. I do not want users to post HTML in it unless it is inside a Markdown code block, like this:
``` someLanguageCode
<span>some html inside markdown code block</span>
```
I do not want to allow any HTML outside a Markdown code block. So this would be illegal:
<span>some html tag outside code block</span>
<div>some more multiline html code outside
</div>
``` someLanguageCode
<span>some html inside markdown code block</span>
```
I was able to get a regex for single-line HTML tags: `<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>`
I am unable to:
- get a regex that supports multi-line HTML tags, and
- check whether that HTML is outside a Markdown code block.
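To make the two requirements concrete, here is a minimal sketch of one possible direction, not a vetted solution: remove the fenced code blocks first, then look for anything tag-like in what remains. The names `stripFencedCode` and `containsHtmlOutsideCode` are hypothetical, and the sketch assumes fences are three backticks at the start of a line; inline code spans and indented code blocks are not handled.

```javascript
// Hypothetical helper: drop fenced code blocks so only the text outside them remains.
// Assumes an opening fence of three backticks at the start of a line (optionally
// followed by a language name) and a closing fence on a line of its own.
function stripFencedCode(markdown) {
  return markdown.replace(/^`{3}[^\n]*\n[\s\S]*?^`{3}[ \t]*$/gm, "");
}

// Hypothetical check: true if anything that looks like an opening or closing tag
// is left outside the fences. [^>]* also matches newlines, so a tag whose
// attributes span several lines is still caught, and unpaired tags are caught too.
function containsHtmlOutsideCode(markdown) {
  const tagLike = /<\/?[a-zA-Z][^>]*>/;
  return tagLike.test(stripFencedCode(markdown));
}

// Build the samples from the question without typing literal fences here.
const fence = "`".repeat(3);
const legal =
  fence + " someLanguageCode\n" +
  "<span>some html inside markdown code block</span>\n" +
  fence;
const illegal =
  "<span>some html tag outside code block</span>\n" +
  "<div>some more multiline html code outside\n</div>\n" +
  legal;

console.log(containsHtmlOutsideCode(legal));   // false
console.log(containsHtmlOutsideCode(illegal)); // true
```

An unclosed fence is left in place by this sketch, so HTML after it is rejected rather than accepted. The tag-like pattern is deliberately broad and will also flag plain text such as `a <b and c> d`.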
I've made a jsfiddle to play around with this problem which shows what should match or should be rejected.
I'm doing this as an attempt to avoid obvious XSS injections.