3

In text like this:

<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>

I need to find the paragraphs (the content between <p> tags) that contain the string "TEXT".
I tried <p>.*?(TEXT).*?<\/p>
and I tried <p>(?!<p>).*?(TEXT).*?<\/p>

But neither pattern solves the problem.

Rajesh
Shimon S

4 Answers

3

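Use ((?!<\/p>).)* before (TEXT) to make sure 'TEXT' is matched inside a single <p></p>: the negative lookahead stops the match from running past a closing </p>.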

See demo

var regex = /<p>((?!<\/p>).)*?(TEXT).*?<\/p>/g;
var text = '<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>';
console.log(text.match(regex));
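To see what the lookahead changes, here is a minimal comparison sketch. It runs the first pattern from the question and the pattern above on the same input; the variable names (naive, tempered, and so on) are only illustrative.

```javascript
// Same sample input as in the question.
var text = '<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>';

// First attempt from the question: .*? is free to run across </p><p>.
var naive = /<p>.*?(TEXT).*?<\/p>/g;
// Pattern from this answer: each character is consumed only if it does not start </p>.
var tempered = /<p>((?!<\/p>).)*?(TEXT).*?<\/p>/g;

var naiveMatches = text.match(naive);
var temperedMatches = text.match(tempered);

// Each naive match starts at a paragraph without TEXT and swallows the next one too.
console.log(naiveMatches);
// Each tempered match is exactly one paragraph that contains TEXT.
console.log(temperedMatches);
```

On this input both patterns return two matches, but only the second pattern keeps each match inside one paragraph.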
Kerwin
0

Since the input is a string (as @Rajesh said), just create a div element and assign the string to its innerHTML.

Get all the p elements using querySelectorAll, then loop over them with forEach.

Test each element's innerHTML against /TEXT/ and, if it matches, push the element into an array.

In the program below, the array a ends up containing the 2 matching elements.

var str="<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>";
var div=document.createElement("div");
div.innerHTML=str;
var a=[];
div.querySelectorAll("p").forEach(x=>{if(/TEXT/.test(x.innerHTML)) a.push(x);});
console.log(a);

If you don't want the <p></p> tags, push the textContent instead (this also drops the inner <em> tags):

var str="<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>";
var div=document.createElement("div");
div.innerHTML=str;
var a=[];
div.querySelectorAll("p").forEach(x=>{if(/TEXT/.test(x.innerHTML)) a.push(x.textContent);});
console.log(a);
Rajesh
Sagar V
  • no. Someone downvoted mine too. @Rajesh Check this. is it ok now? – Sagar V Mar 31 '17 at 07:17
  • I have removed my vote as it produces correct output, but OP is looking for regex. – Rajesh Mar 31 '17 at 07:20
  • ah. is that you. Thanks. I appreciate downvotes with comments. But I really hate people who downvote without any comment. your comment make me improve my answer. – Sagar V Mar 31 '17 at 07:20
  • I think this is OP's problem _I have to find paragraphs (between p tags) that contain string "TEXT"._ @Rajesh – Sagar V Mar 31 '17 at 07:20
  • 1
    Yup. As said, *it produces correct output*, also glad to help in any way. :-) – Rajesh Mar 31 '17 at 07:22
0

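Sometimes all that is missing is a delimiter around the pattern, such as (...), {...}, /.../ or [...], depending on the language. In JavaScript a regex literal is written between slashes, so try it like this: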

/<p>.*?(TEXT).*?<\/p>/

But as Barman pointed out, this does not always stay within one paragraph. If you really want to select only one paragraph, you need something like this:

(?:<p |<p>)(?:(?!\/p>).|\n)*(TEXT).*?<\/p>
  • (?:<p |<p>) Starts with "<p " (a tag with attributes) or "<p>"; the leading ?: makes the group non-capturing
  • (?:(?!\/p>).|\n)* Any character or newline (.|\n), as long as it does not begin a closing /p>; again the leading ?: makes the group non-capturing
  • (TEXT) The word TEXT, captured
  • .*? Any characters, with the lazy quantifier ? so the match stops at the shortest possibility, that is, at the first </p>
  • <\/p> Must end with the closing </p> tag

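This one also allows line breaks in the paragraph before TEXT. Note that the trailing .*? does not match newlines, so TEXT and the closing </p> must be on the same line.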
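As a rough check of the pattern above, the following sketch runs it on two made-up inputs (the class attributes and line breaks are assumptions added for illustration, not part of the question's data).

```javascript
// The single-paragraph pattern from this answer, with the g flag to collect all matches.
var re = /(?:<p |<p>)(?:(?!\/p>).|\n)*(TEXT).*?<\/p>/g;

// Hypothetical input: paragraphs with attributes and a line break before TEXT.
var multi = '<p class="a">first\nline</p><p class="b">second\nline TEXT here</p><p>third</p>';
var multiMatches = multi.match(re);

// Hypothetical input: the line break comes after TEXT, where only .*? is used.
var lateBreak = '<p>TEXT and\nmore</p>';
var lateBreakMatches = lateBreak.match(re);

// One match: the whole second paragraph, including its line break.
console.log(multiMatches);
// null: .*? cannot cross the newline between TEXT and </p>.
console.log(lateBreakMatches);
```

If line breaks after TEXT also need to be supported, one option is to replace the trailing .*? with [\s\S]*?, which matches any character including newlines.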

Julesezaar
-2

You can try something like this:

  • Create a regex that matches every paragraph
  • Loop over the matches, test each one for the search key, and filter out the ones that do not contain it.

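var str = "<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>";

// Matches each paragraph, from <p> (or start of input) to </p> (or end of input).
var groupRegex = /(?:^|<p>)(.*?)(?:<\/p>|$)/g;
// Case-insensitive search key; drop the i flag to match only upper-case TEXT.
var searchRegex = /text/i;
// match() returns null when nothing matches, so fall back to an empty array.
var groups = str.match(groupRegex) || [];

// Keep only the paragraphs that contain the search key.
var result = groups.filter(function (s) { return searchRegex.test(s); });

console.log(result);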
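A variation on the same two-step idea, sketched with String.prototype.matchAll so that the capture group gives the paragraph content without the surrounding tags. The helper name paragraphsContaining is hypothetical, and the sketch assumes plain <p> tags without attributes or line breaks, as in the question's sample.

```javascript
var str = "<p>1 bla bla <em>bla</em> bla bla</p><p>2 bla bla <em>bla</em> bla TEXT bla</p><p>3 bla bla <em>bla</em> bla bla</p><p>4 bla bla <em>bla</em> bla TEXT bla</p><p>5 bla bla <em>bla</em> bla bla</p>";

// Hypothetical helper: returns the inner HTML of every <p>...</p> that contains needle.
function paragraphsContaining(html, needle) {
  var inner = [];
  // Step 1: iterate over every paragraph; m[1] is the content between the tags.
  for (var m of html.matchAll(/<p>(.*?)<\/p>/g)) {
    // Step 2: keep the paragraph only if it contains the search key.
    if (m[1].includes(needle)) {
      inner.push(m[1]);
    }
  }
  return inner;
}

var inners = paragraphsContaining(str, "TEXT");
console.log(inners);
```

Splitting the work this way keeps the regex simple, because the check for the search key is an ordinary string test rather than part of the pattern.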
Rajesh