Here is a regex that can get that substring.
var re = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/ig;
The regex matches any <div> tag that has a content="..." attribute and also contains class="_type". The result is stored in capture group 1. Because the lookahead placed right after < scans the whole tag for class="_type" before anything else is consumed, class="_type" can appear either before or after content="{our string}".
The code can be something like:
var re = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/ig;
var str = '<div content="[...]" class="[...]">[...]</div>\n<div content="website" [...] class="_type">[...]</div>\n<dic content="[...]" class="[...]">[...]</div>';
var m;
while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    document.getElementById("r").innerHTML += m[1] + "<br/>";
}
<div id="r"/>
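To see the attribute-order point without a page to write into, here is a minimal sketch that collects the matches into an array instead of appending to the DOM. The helper name extractTypeContents and the sample markup are made up for illustration; only the regex comes from the answer above.

```javascript
// Hypothetical helper (not part of the original answer): returns every
// content value found in <div> tags that also carry class="_type".
function extractTypeContents(html) {
  // A fresh regex per call keeps lastIndex state from leaking between calls.
  var re = /<(?=[^<>]*\bclass="_type")div\b[^<>]*content="([^"]*)"/ig;
  var results = [];
  var m;
  // Every match starts with "<div", so it is never zero-length and the
  // lastIndex guard is not needed here.
  while ((m = re.exec(html)) !== null) {
    results.push(m[1]); // capture group 1 holds the content value
  }
  return results;
}

// Sample markup invented for this sketch.
var sampleQuoted =
  '<div content="first" class="_type"></div>\n' +   // content before class
  '<div class="_type" content="second"></div>\n' +  // class before content
  '<div class="other" content="skipped"></div>\n' + // wrong class
  '<span class="_type" content="skipped"></span>';  // not a div

console.log(extractTypeContents(sampleQuoted)); // [ 'first', 'second' ]
```

Both orderings are picked up, while the tag with a different class and the non-div tag are ignored.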
If you do not know which attribute delimiters the HTML will use (double quotes, single quotes, or none at all), the task gets a bit harder. However, it is still possible:
var re = /<(?=[^<>]*\bclass=['"]?_type\b['"]?)div\b[^<>]*content=(?:["']([^<]*?)["']|(\S+))/ig;
var str = '<div content="[...]" class="[...]">[...]</div>\n<div content=\'[...]\' class=\'[...]\'>[...]</div>\n<div content="web site" [...] class="_type">[...]</div>\n<dic content="[...]" class="[...]">[...]</div>\n<dic content=[...] class=[...]>[...]</div>\n<dic content=\'[...]\' class=\'[...]\'>[...]</div>\n<div content=\'web site\' [...] class=\'_type\'>[...]</div>\n<div content=website [...] class=_type>[...]</div>';
var m;
while ((m = re.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    // Group 1 holds a quoted value, group 2 an unquoted one.
    if (m[1] === undefined) {
        document.getElementById("e").innerHTML += m[2] + "<br/>";
    } else {
        document.getElementById("e").innerHTML += m[1] + "<br/>";
    }
}
<div id="e"/>
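The same idea can be sketched for the delimiter-agnostic regex, again returning an array rather than touching the DOM. The helper name extractTypeContentsAnyQuote and the sample markup are assumptions made for this illustration; the regex is the one from the answer.

```javascript
// Hypothetical helper (not part of the original answer): handles
// double-quoted, single-quoted and unquoted attribute values.
function extractTypeContentsAnyQuote(html) {
  var re = /<(?=[^<>]*\bclass=['"]?_type\b['"]?)div\b[^<>]*content=(?:["']([^<]*?)["']|(\S+))/ig;
  var results = [];
  var m;
  while ((m = re.exec(html)) !== null) {
    // Group 1 is set for quoted values, group 2 for unquoted ones.
    results.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return results;
}

// Sample markup invented for this sketch.
var sampleMixed =
  '<div content="web site" class="_type"></div>\n' +   // double quotes
  "<div content='web site 2' class='_type'></div>\n" + // single quotes
  '<div content=website class=_type></div>\n' +        // no quotes
  '<div content="skipped" class="other"></div>';       // wrong class

console.log(extractTypeContentsAnyQuote(sampleMixed));
// [ 'web site', 'web site 2', 'website' ]
```

One limitation worth keeping in mind: the unquoted branch (\S+) runs up to the next whitespace character, so an unquoted content value that is the last attribute in the tag (for example <div class=_type content=website>) would be captured together with the closing > and whatever follows it on the same line.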