I have a string pattern:

<div content="[...]" class="[...]">[...]</div>
<div content="website" [...] class="_type">[...]</div>
<dic content="[...]" class="[...]">[...]</div>

How can I extract the "website" value with a regex?

I have tried:

/content="(.+?)".*?class="_type"/g

But the result is not what I expected: [...].

Wiktor Stribiżew
user3129040

1 Answer

Here is a regex that can get that substring.

var re = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/ig; 

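The regex matches any <div> tag that has a content="..." attribute and also contains class="_type". The lookahead (?=[^<>]*\bclass="_type") checks for the class inside the same tag without consuming any characters, because [^<>]* cannot cross a tag boundary. As a result, class="_type" may appear either before or after content="{our string}". The value is captured in group 1.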
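To make the attribute-order point concrete, here is a minimal sketch that can be run in Node without a DOM. The helper name `captures` and the sample markup are made up for illustration. The last call runs the pattern from the question on single-line input, where `.` is free to cross from one tag into the next; that may explain the unexpected result, although the actual output is not shown in the question.

```javascript
// Pattern from the answer: the lookahead confirms class="_type" inside the
// same tag, and [^<>]* keeps every part of the match within that tag.
const typedDiv = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/gi;
// Pattern from the question: "." can run across tag boundaries.
const naive = /content="(.+?)".*?class="_type"/g;

// Hypothetical helper: return group 1 of every match.
function captures(pattern, html) {
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

// Made-up sample markup on a single line.
const contentFirst =
  '<div content="a" class="x">1</div><div content="website" data-k="v" class="_type">2</div>';
const classFirst = '<div class="_type" data-k="v" content="website">2</div>';

console.log(captures(typedDiv, contentFirst)); // [ 'website' ]
console.log(captures(typedDiv, classFirst));   // [ 'website' ]
console.log(captures(naive, contentFirst));    // [ 'a' ]
```

In the last call the match starts at the first `content="` and only ends at the `class="_type"` of the second tag, so the captured value belongs to the wrong element.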

The code can be something like:

var re = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/ig;
var str = '<div content="[...]" class="[...]">[...]</div>\n<div content="website" [...] class="_type">[...]</div>\n<dic content="[...]" class="[...]">[...]</div>';
var m;

// Walk through every match; group 1 holds the content value.
while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    document.getElementById("r").innerHTML += m[1] + "<br/>";
}
<div id="r"></div>

If you do not know which attribute delimiters the HTML uses (double quotes, single quotes, or none at all), the task is a bit harder, but it is still possible:

var re = /<(?=[^<>]*\bclass=['"]?_type\b['"]?)div\b[^<>]*content=(?:["']([^<]*?)["']|(\S+))/ig; 
var str = '<div content="[...]" class="[...]">[...]</div>\n<div content=\'[...]\' class=\'[...]\'>[...]</div>\n<div content="web site" [...] class="_type">[...]</div>\n<dic content="[...]" class="[...]">[...]</div>\n<dic content=[...] class=[...]>[...]</div>\n<dic content=\'[...]\' class=\'[...]\'>[...]</div>\n<div content=\'web site\' [...] class=\'_type\'>[...]</div>\n<div content=website [...] class=_type>[...]</div>';
var m;

while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // Group 1 holds a quoted value, group 2 an unquoted one.
    var value = m[1] !== undefined ? m[1] : m[2];
    document.getElementById("e").innerHTML += value + "<br/>";
}
<div id="e"></div>
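One limitation of the quote-agnostic pattern above is that `["']([^<]*?)["']` stops at the first quote of either kind, so a value such as `"it's a web site"` would be cut off after `it`. The following is a hedged sketch of one possible variant, not part of the original answer: it remembers the opening quote in a group and requires the same character to close the value via the backreference `\1`. The name `extractContent` and the sample markup are assumptions for illustration, and the code runs in Node without a DOM.

```javascript
// Group 1: opening quote, group 2: quoted value (closed by the same quote
// through \1), group 3: unquoted value. The lookahead is the answer's own.
const anyQuote =
  /<(?=[^<>]*\bclass=['"]?_type\b['"]?)div\b[^<>]*?\scontent=(?:(["'])(.*?)\1|([^\s"'>]+))/gi;

// Hypothetical helper: return the content value of every matching <div>.
function extractContent(html) {
  return Array.from(html.matchAll(anyQuote), (m) =>
    m[2] !== undefined ? m[2] : m[3]
  );
}

// Made-up sample markup covering the three delimiter styles.
const sample = [
  '<div content="it\'s a web site" class="_type">x</div>',
  "<div content='web site' class='_type'>x</div>",
  "<div content=website class=_type>x</div>",
  '<div content="other" class="plain">x</div>',
].join("\n");

console.log(extractContent(sample));
// [ "it's a web site", 'web site', 'website' ]
```

Like any regex over HTML, this remains an approximation; for arbitrary markup a real HTML parser is the more robust choice.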
Wiktor Stribiżew