Assuming that your whole problematic `<p>` tag is on a single line, you can use the following regex:

`((?!^)<p>)|(<\/p>(?!$))`

`((?!^)<p>)` matches every `<p>` tag except the `<p>` at the beginning of the string.

`(<\/p>(?!$))` matches every `</p>` tag except the `</p>` at the end of the string.

You can then replace the captured `<p>` and `</p>` tags with an empty string to remove them.
Here is a working demo
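As a rough illustration of the single-line case, here is a minimal sketch in Python (used here only for demonstration; the pattern itself is unchanged and should carry over to PHP's `preg_replace`). The helper name `strip_inner_p_tags` and the sample input are made up for this example.

```python
import re

# Pattern from the answer: group 1 is any <p> that is not at the start of the
# string, group 2 is any </p> that is not at the end of the string.
INNER_P_TAGS = re.compile(r"((?!^)<p>)|(<\/p>(?!$))")


def strip_inner_p_tags(line: str) -> str:
    """Remove every <p> and </p> except the outermost pair (hypothetical helper)."""
    # Replacing each match with an empty string deletes the inner tags.
    return INNER_P_TAGS.sub("", line)


print(strip_inner_p_tags("<p>one <p>two</p> three</p>"))
# <p>one two three</p>
```

Note that this only behaves as intended when the string really starts with `<p>` and ends with `</p>`, which is the assumption stated above.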
EDIT:
Since your input is an HTML file, you can try this updated regex:

`(<p>)((?!<\/p>).)*?(<p>).*?(<\/p>)`

`(<p>)` matches an opening `<p>` tag.

`((?!<\/p>).)*?(<p>)` captures a `<p>` tag inside the first `<p>` tag with no `</p>` tag in between (a nested `<p>` tag).

`.*?(<\/p>)` captures the closing tag of the nested `<p>`.

Just remove capture groups 3 and 4 and you have removed the nested tag. You need to run this repeatedly until there are no more matches.

You can find the updated regex demo here
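Because the text between groups 3 and 4 is not captured by this pattern, "removing groups 3 and 4" is easiest to do by position. The following is one possible sketch in Python (chosen only for illustration; in PHP the offsets would presumably come from `preg_match` with `PREG_OFFSET_CAPTURE`). The function names are hypothetical.

```python
import re

# Updated pattern from the answer: group 3 is the nested <p>,
# group 4 is the closing </p> that belongs to it.
NESTED_P = re.compile(r"(<p>)((?!<\/p>).)*?(<p>).*?(<\/p>)")


def remove_one_nested_pair(html: str) -> str:
    """Cut groups 3 and 4 out of the first match, if there is one."""
    m = NESTED_P.search(html)
    if m is None:
        return html
    # Keep everything except the spans covered by group 3 and group 4.
    return html[:m.start(3)] + html[m.end(3):m.start(4)] + html[m.end(4):]


def remove_nested_p(html: str) -> str:
    """Repeat the removal until the pattern no longer matches."""
    while True:
        cleaned = remove_one_nested_pair(html)
        if cleaned == html:
            return html
        html = cleaned


print(remove_nested_p("<p>a<p>b<p>c</p>d</p>e</p>"))
# <p>abcde</p>
```

The loop is the "run this again until there are no more matches" step: each pass removes one nested pair, so doubly nested tags need two passes. Since `.` does not match a newline by default, this sketch still assumes each nested pair sits on one line.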
UPDATE:
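Use this regex:

`(.*<p>)(((?!<\/p>).)*?)(<p>)(.*?)(<\/p>)(.*)`

and replace it with `\1\2\5\7`, which removes only the nested tags. Groups 4 and 6 are the nested `<p>` and `</p>`; groups 1, 2, 5 and 7 hold the text before, between, inside and after them, so the replacement keeps everything else.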
Demo here
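A minimal sketch of this replacement, again in Python for illustration only (the same pattern and replacement string should work with PHP's `preg_replace`). The function name `flatten_nested_p` is made up, and the loop is an assumption carried over from the earlier advice to re-run the regex until nothing changes.

```python
import re

# Pattern from the UPDATE: groups 4 and 6 are the nested <p> and </p>;
# groups 1, 2, 5 and 7 are the surrounding text that is kept.
NESTED_P = re.compile(r"(.*<p>)(((?!<\/p>).)*?)(<p>)(.*?)(<\/p>)(.*)")


def flatten_nested_p(line: str) -> str:
    """Apply the \\1\\2\\5\\7 replacement until the text stops changing."""
    while True:
        flattened = NESTED_P.sub(r"\1\2\5\7", line)
        if flattened == line:
            return line
        line = flattened


print(flatten_nested_p("<p>a<p>b</p>c</p>"))
# <p>abc</p>
```

Each pass removes one nested pair per line, so deeper nesting takes more passes; sibling paragraphs such as `<p>a</p><p>b</p>` are left untouched because the lookahead stops at the first `</p>`.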
Nested `<p>` tags are not valid and should not be used. That's why you are getting this weird behavior.
– Valdas Feb 21 '17 at 12:07
tags. So we need to remove those nested tags with regex in PHP
– Angu Feb 21 '17 at 12:14