I have a function that creates a preview of a post, like this:

<?php $pos=strpos($post->content, ' ', 280);
echo substr($post->content,0,$pos ); ?>

But the very first thing in a post may be a <style> block. How can I add conditional logic so that my preview shows only what comes after the style block?

Christopher Mellor
    [Parse it as HTML](https://stackoverflow.com/q/3577641/476) and remove any HTML blocks you don't like/extract only pure text content. – deceze Nov 10 '18 at 01:45

3 Answers

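If the only HTML content is a <style> tag, you could simply use preg_replace. The pattern below also allows attributes on the tag, and the `s` and `i` modifiers let it match a block that spans several lines or is written in upper case:

echo preg_replace('#<style\b[^>]*>.*?</style>#is', '', $post->content);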

However, it is better (and more robust) to use DOMDocument to output just the text the post contains. Note that loadHTML wraps the post content in a <body> element, while a leading <style> block ends up in <head>, so reading the text of the <body> element gives only the post text:

$doc = new DOMDocument();
$doc->loadHTML($post->content);
echo $doc->getElementsByTagName('body')->item(0)->nodeValue . "\n";

For this sample input:

$post = (object)['content' => '<style>some random css</style>the text I really want'];

The output of both is:

the text I really want

Demo on 3v4l.org
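
To tie this back to the preview in the question, the stripping and truncation steps can be combined. The following is only a sketch: the function name `makePreview` and the 280-character default are illustrative assumptions, not part of the original code, and it uses the regex approach for brevity. It also guards against posts shorter than the cut-off, since strpos() fails when the offset lies beyond the end of the string.

```php
<?php
// Hypothetical helper: the name and the default limit are assumptions for illustration.
function makePreview(string $html, int $limit = 280): string
{
    // Drop any <style> blocks (with or without attributes, possibly multi-line).
    $text = preg_replace('#<style\b[^>]*>.*?</style>#is', '', $html);
    // Remove the remaining tags and any surrounding whitespace.
    $text = trim(strip_tags($text));

    // Short posts are returned whole, so strpos() never gets an offset past the end.
    if (strlen($text) <= $limit) {
        return $text;
    }

    // Cut at the first space at or after the limit, as in the question.
    $pos = strpos($text, ' ', $limit);
    return $pos === false ? $text : substr($text, 0, $pos);
}

$content = "<style>\np { color: blue; }\n</style>The rain in Spain lies mainly in the plain ...";
echo makePreview($content, 20), "\n";
```

With a limit of 20, the sample prints the text up to the first space after the 20th character, and none of the CSS.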

Nick

Taking a cue from the excellent comment of @deceze, here's one way to use the DOM with PHP to eliminate the style tags:

<?php

$_POST["content"] = 
"<style>
color:blue;
</style>
The rain in Spain lies mainly in the plain ...";

$dom = new DOMDocument;
$dom->loadHTML($_POST["content"]);
$style_tags = $dom->getElementsByTagName('style');

// Copy the live node list to an array first, so that replacing nodes does not disturb the loop.
foreach (iterator_to_array($style_tags) as $style_tag) {

  $parent = $style_tag->parentNode;
  $parent->replaceChild($dom->createTextNode(''), $style_tag);

}

echo strip_tags($dom->saveHTML());


See demo here

I also took guidance from a related discussion, looking specifically at the officially accepted answer.

The advantage of manipulating the HTML through the DOM in PHP is that you don't even need a conditional to remove the STYLE tags. Also, you are working with HTML elements, so you don't have to bother with the intricacies of a regex. Note that each style tag is replaced by a text node containing an empty string.

Note that tags like HEAD and BODY are added automatically when the DOM object parses the content, so they appear in the output of its saveHTML() method. So, in order to display only text content, the last line uses strip_tags() to remove all HTML tags.

Lastly, while the officially accepted answer is generally a viable alternative, it does not provide a complete solution for non-compliant HTML containing a STYLE tag after a BODY tag.
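
The case mentioned last can be illustrated with a short sketch. The sample markup and variable names below are made up for illustration. The assumption is that a <style> element appearing after other body content stays inside the body when parsed, so its CSS would show up in the body's text unless the element is removed first.

```php
<?php
// Sample input (an assumption for illustration): a <style> block in the middle of the body.
$html = '<p>Hello</p><style>p { color: red; }</style><p>World</p>';

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

// Copy the live node list to an array so that removals do not affect the loop.
foreach (iterator_to_array($dom->getElementsByTagName('style')) as $style) {
    $style->parentNode->removeChild($style);
}

// With every <style> element gone, the body text holds only the visible content.
$text = $dom->getElementsByTagName('body')->item(0)->textContent;
echo $text, "\n";
```

Removing the elements before reading the text means the position of the style block in the markup no longer matters.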

slevy1

You have two options.

  1. If there are no tags in your content, use strip_tags().
  2. You could use a regex. This is more complex, but there is always a suitable pattern, e.g. with preg_match().
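
A small sketch of how the two options behave on a post that starts with a style block. The sample string is an assumption for illustration, and the regex option is shown with preg_replace() rather than preg_match(), since the goal here is removal. Keep in mind that strip_tags() removes only the tags themselves and leaves the text between them in place.

```php
<?php
$content = '<style>p { color: blue; }</style>the text I really want';

// Option 1: strip_tags() removes the tags but keeps the CSS text between them.
$stripped = strip_tags($content);

// Option 2: a regex that removes the whole <style> block, contents included.
$cleaned = preg_replace('#<style\b[^>]*>.*?</style>#is', '', $content);

echo $stripped, "\n";
echo $cleaned, "\n";
```

Here only the second line of output is free of CSS, so for this kind of input the regex (or a DOM-based approach) would have to run before any call to strip_tags().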
Ben