break, if following string is

Question

I want to capture the string after #. Both #a and #a <div>.. correctly return a, but #a<div>.. returns a<div>.

How can I make the match stop when the following string is <div>, <br>, or <p>? For example:
#a<div>bc should return a

https://regex101.com/r/xD1vN0/1

var re = /#([^#\s]*)/g;

You mean `/#([^<#\s]*)/g`? Or should it be really `
`, a literal? Like, if you have `#a`, it would be accepted? — Wiktor Stribiżew, May 05 '16 at 10:34
Try [`#((?:(?!<(?:div|br|p)>)[^#\s])*)`](https://regex101.com/r/xI2wY9/3) — Wiktor Stribiżew, May 05 '16 at 10:37
Don't use regex for getting html tags' attributes and contents. Use a parser and parse the html! — Ram, May 05 '16 at 10:42
@Vohuman Why can't I use regex? I was using cheerio (https://www.npmjs.com/package/cheerio) to parse the HTML content and then find the #tag, but the input string comes from a contenteditable element, so it produces a string like `#a
bc
#d` (only div, br, and p tags, I guess), and after parsing, cheerio's text method gives `#abc #d`. Do you have an example solution? I want to do this server-side in Node.js, not on the client. — user1575921, May 05 '16 at 11:50
Based on @Vohuman's comment, I found http://stackoverflow.com/questions/6751105/why-its-not-possible-to-use-regex-to-parse-html-xml-a-formal-explanation-in-la, but in this case the string will not contain full HTML, only a few tags generated by the contenteditable element, which is why I am using regex. Any ideas? — user1575921, May 05 '16 at 12:02

Wiktor Stribiżew · Accepted Answer · 2016-05-05T10:46:05.780

You can use a regex with a tempered greedy token:

/#((?:(?!<(?:div|br|p)>)[^#\s])*)/g

The (?:(?!<(?:div|br|p)>)[^#\s])* is a tempered greedy token: it matches, one character at a time, any character other than # and whitespace, as long as that character does not start a <div>, <br>, or <p> sequence.

JS demo:

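var re = /#((?:(?!<(?:div|br|p)>)[^#\s])*)/g;
var str = `#a<div>
#b<br>
#c<p>
#d<hi>`;
var res = [];
var m;

// Collect capture group 1 (the text after each #) for every match.
while ((m = re.exec(str)) !== null) {
    res.push(m[1]);
}
// Escape < and > so the results display as text rather than HTML.
document.body.innerHTML = "<pre>" + JSON.stringify(res.map(x => x.replace(/</g, "&lt;").replace(/>/g, "&gt;")), null, 4) + "</pre>";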
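
Since the question mentions running this server-side in Node.js, here is a minimal sketch of the same pattern without any DOM access. The helper name `extractTags` is hypothetical, and the sketch assumes a runtime that supports `String.prototype.matchAll`.

```javascript
// Hypothetical helper: returns the text captured after each '#'.
// The pattern is the tempered greedy token from the answer above.
function extractTags(input) {
  const re = /#((?:(?!<(?:div|br|p)>)[^#\s])*)/g;
  // matchAll requires the g flag; m[1] is capture group 1.
  return Array.from(input.matchAll(re), (m) => m[1]);
}

console.log(extractTags("#a<div>bc #d<br> #e<hi>"));
// → [ 'a', 'd', 'e<hi>' ]
```

Only `<div>`, `<br>`, and `<p>` stop the match, so `<hi>` stays part of the last result.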

score 0 · Answer 2 · answered May 05 '16 at 10:38

You can use this pattern to stop the match as soon as a < is reached:

var re = /#([^#\s<]*)/g;
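
To illustrate how this differs from the tempered greedy token in the accepted answer, here is a small comparison sketch. The names `simple`, `tempered`, `capture`, and `sample` are made up for this example; the simple pattern is written as a character class that excludes #, whitespace, and <.

```javascript
// Stops at any '<', whatever tag follows.
const simple = /#([^#\s<]*)/g;
// Stops only before <div>, <br>, or <p>.
const tempered = /#((?:(?!<(?:div|br|p)>)[^#\s])*)/g;

// Hypothetical helper: collect capture group 1 of every match.
const capture = (re, s) => Array.from(s.matchAll(re), (m) => m[1]);

const sample = "#a<div>bc #d<hi>";
console.log(capture(simple, sample));   // → [ 'a', 'd' ]
console.log(capture(tempered, sample)); // → [ 'a', 'd<hi>' ]
```

If any < should end the tag, the simpler class is enough; the tempered version only matters when other angle-bracket text must be kept.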

Simon Schüpbach
