I have text that looks like this, or like one of a billion variants of it, for example:

 <div>content goes here... </div><div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
 <div>content goes here... </div><div style="other style..."><span style="other styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
 <div>content goes here... </div><div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>
 and a billion variations of this...

I want to remove whatever markup surrounds [END_CONTACT], in any variation, so that all I am left with is this:

 <div>content goes here... </div><div>[END_CONTACT]</div><div>content goes here... </div>

How do I strip the content between the opening div tag and [END_CONTACT] and the content between [END_CONTACT] and the ending div tag?

Thanks

LargeTuna

2 Answers

How do I strip the content between the opening div tag and [END_CONTACT] and the content between [END_CONTACT] and ending div tag?

If [END_CONTACT] and the enclosing <div> tag are always present, you can use a PCRE regular expression with preg_replace():

$string = preg_replace('/<div[^>]*>.*\[END_CONTACT\].*<\/div>/i','<div>[END_CONTACT]</div>',$string);

Example:

$data = [];
$data[] = 'some text <div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = 'something else etc.<div style="other style..."><span style="other styles..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = '<div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div>';
$data[] = 'and a billion variations of this...';

foreach ($data as $row) {
    // Collapse everything from the first <div ...> to the last </div> in the row.
    $string = preg_replace('/<div[^>]*>.*\[END_CONTACT\].*<\/div>/i', '<div>[END_CONTACT]</div>', $row);
    print $string . "<BR>";
}

Output:

 some text <div>[END_CONTACT]</div>
 something else etc.<div>[END_CONTACT]</div>
 <div>[END_CONTACT]</div>
 and a billion variations of this...

UPDATE:

Sorry, wasn't clear about that in my original post. Is there any way to keep text or code outside of the string in question but still do the operation as you've suggested?

Try this Regex in the above PHP code:

 <div[^>]*>(?:(?!<div).)*?\[END_CONTACT\](?:(?!<div).)*?<\/div>

Example:

 content content content... <div style="random stuff..."><span style="random stuff..."><strong>[END_CONTACT]</strong></span></div> content content content

Output:

 content content content... <div>[END_CONTACT]</div> content content content
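
For anyone who wants to try the "keep the surrounding content" case end to end, here is a minimal runnable sketch. It is not a definitive implementation: the helper name collapseEndContact is hypothetical, and the pattern is a tempered-greedy take on the negative lookahead idea above, assuming the wrapper <div> around [END_CONTACT] contains no nested <div>.

```php
<?php
// Hypothetical helper (name and pattern are assumptions for illustration).
function collapseEndContact(string $html): string
{
    // (?:(?!<div).)*? advances one character at a time but refuses to step
    // onto another "<div", so a match cannot begin in an earlier <div>.
    $pattern = '/<div\b[^>]*>(?:(?!<div).)*?\[END_CONTACT\](?:(?!<div).)*?<\/div>/is';

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return preg_replace($pattern, '<div>[END_CONTACT]</div>', $html) ?? $html;
}

// Sample row taken from the question.
$row = '<div>content goes here... </div><div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>';

echo collapseEndContact($row), PHP_EOL;
// <div>content goes here... </div><div>[END_CONTACT]</div><div>content goes here... </div>
```

If the wrapper <div> itself contains another <div>, this pattern simply does not match and the input comes back unchanged, which is the kind of case where a DOM parser is the safer choice.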

NOTE:

For HTML elements in complex or deeply nested compositions, you should use a DOM parser rather than regex.

I have tested my answer and it does what is desired. As stated above, though, the right tool for multilayered, complex HTML is a proper PHP DOM parser.
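
To give an idea of what the DOM parser route could look like, here is a rough sketch using PHP's bundled DOMDocument and DOMXPath. The helper name simplifyEndContactDivs and the fragment-root wrapper id are made up for this example, and it assumes the input is an HTML fragment in which the marker <div> contains no text other than [END_CONTACT].

```php
<?php
// Hypothetical helper: replace every innermost <div> whose only text is
// [END_CONTACT] with a bare <div>[END_CONTACT]</div>.
function simplifyEndContactDivs(string $html): string
{
    $doc = new DOMDocument();

    // Wrap the fragment in a known container so it can be serialised back
    // without the <html>/<body> wrapper that loadHTML() adds.
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML('<div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = '//div[@id="fragment-root"]//div'
           . '[normalize-space(.) = "[END_CONTACT]" and not(.//div)]';

    // Copy the node list first, because nodes are replaced while looping.
    foreach (iterator_to_array($xpath->query($query)) as $div) {
        $clean = $doc->createElement('div', '[END_CONTACT]');
        $div->parentNode->replaceChild($clean, $div);
    }

    // Serialise only the children of the wrapper.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$row = '<div>content goes here... </div><div style="some style..."><span style="some styles..."><strong>[END_CONTACT]</strong></span></div><div>content goes here... </div>';
echo simplifyEndContactDivs($row), PHP_EOL;
```

The output should contain <div>[END_CONTACT]</div> with the neighbouring divs intact; whitespace between sibling tags may vary with the PHP/libxml version, so don't rely on a byte-for-byte match.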

Martin
  • This works great except if there is text outside of the <div> tags - it will strip all of that away. Sorry, wasn't clear about that in my original post. Is there any way to keep text or code outside of the string in question but still do the operation as you've suggested? E.g. content content content... [END_CONTACT] content content content... – LargeTuna Mar 16 '21 at 18:02
  • Here's the problem I'm running into: if the [END_CONTACT] has any divs after it, it will not find and replace the correct content, it just skips over it. E.g. this bombs out: content content content... [END_CONTACT] content content content. It just returns back: content content content... [END_CONTACT] content content content – LargeTuna Mar 16 '21 at 21:03
  • @LargeTuna you have over 2000 reputation on Stack Overflow, so it would be assumed that you are aware that you need to set out *all* the criteria of the question in the question, which saves everyone's time. Please **clearly** update your question with your **exact** criteria and what you've tried to resolve this. [**I have tested my answer**](http://sandbox.onlinephpfunctions.com/code/bbc8b854cbb77c9cfe6f86637b14da4e57ed024f) and it does what is desired. And as stated, what you *should* be using to deal with multilayered complex HTML is a proper PHP DOM parser. – Martin Mar 16 '21 at 22:21

Use regular expressions! The following example using preg_replace will work as long as the markup is always <div>, <span>, <strong> in that order and the text inside contains no literal angle brackets, which you should not put in HTML content anyway.

$result = preg_replace('#<div\b[^>]*><span\b[^>]*><strong\b[^>]*>([^<]*)</strong></span></div>#i', '<div>$1</div>', $html);
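
A quick way to see what this pattern does and does not touch. The pattern is copied from the line above; the sample strings are made up for illustration.

```php
<?php
// Pattern from the answer; $html and $other are illustrative inputs.
$pattern = '#<div\b[^>]*><span\b[^>]*><strong\b[^>]*>([^<]*)</strong></span></div>#i';

// Text outside the <div> is preserved, and $1 carries the inner text over.
$html = 'before <div style="a"><span style="b"><strong>[END_CONTACT]</strong></span></div> after';
echo preg_replace($pattern, '<div>$1</div>', $html), PHP_EOL;
// before <div>[END_CONTACT]</div> after

// A different nesting (no <span>) is left untouched, because the pattern
// spells out the exact tag sequence.
$other = '<div style="a"><strong>[END_CONTACT]</strong></div>';
var_dump(preg_replace($pattern, '<div>$1</div>', $other) === $other);
// bool(true)
```

Note that the capture group accepts any text, so this also unwraps div/span/strong blocks that do not contain [END_CONTACT]; replace ([^<]*) with a literal \[END_CONTACT\] if only the marker should be affected.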
PHP Guru