I'm parsing WordPress post HTML with PHP and want all images to be centered. That alone is easy enough, but I also want images on the same line to be centered together. To do this, I need to add the attribute `class="image-content"` to the `<p>` block that contains them.

How do I do this with PHP?

This is what the post would look like in the editor:

[Screenshot of the post in the editor]

And this is the HTML that WordPress provides for this post:

<p>Single line paragraph.</p>
<p>
    <a href="image.png">
        <img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
    </a>
</p>
<p>
    Multi line paragraph which is a multi line paragraph
    which is a multi line paragraph which is a multi line
    paragraph which is a multi line paragraph which is a
    multi line paragraph which is a multi line paragraph
    which is a multi line paragraph which is a multi line
    paragraph which is a multi line paragraph which is a
    multi line paragraph which is a multi line paragraph
    which is a multi line paragraph which is a multi line
    paragraph which is a multi line paragraph which is a
    multi line paragraph which is a multi line paragraph.
</p>
<p>
    <a href="image.png">
        <img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="150" height="150" />
    </a>
    <a href="image.png">
        <img class="alignnone wp-image-39 size-thumbnail" src="image.png" width="300" height="300" />
    </a>
</p>
<p>End of post.</p>
Joe Shanahan
  • What are you using to parse the content? – Akshay Khetrapal Jun 19 '15 at 22:50
  • Since you've already tagged your question with the *dom* tag: What have you already tried? Did you try out any of the PHP DOM libraries? – Hauke P. Jun 19 '15 at 22:50
  • I'm not *currently* doing anything. I originally tried to use DOM to look at every `<p>` block, detect the content and apply an appropriate style, but I couldn't work it out. I assumed this would use DOM, which is why I tagged it as such. – Joe Shanahan Jun 19 '15 at 23:02

1 Answer

You can do this with DOMDocument, XPath and a simple regex replacement. This first version adds the class to each image itself:

    $parse = new \DOMDocument();
    $parse->loadHTML($html);

    // Select every <img> that sits anywhere inside a <p>.
    $xpath = new \DOMXPath($parse);
    $images = $xpath->query('//p//img');

    // Capture the existing class value and append the new class to it.
    $re = '/(.*)/';
    $subst = '$1 image-content';

    foreach ($images as $image) {
        $class = preg_replace($re, $subst, $image->getAttribute('class'), 1);
        $image->setAttribute('class', $class);
    }

    $htmlFinal = $parse->saveHTML();

EDIT

If you want to attach the class to the containing `p` element instead, you can do it like this:

    $parse = new \DOMDocument();
    $parse->loadHTML($html);

    $xpath = new \DOMXPath($parse);
    $ps = $xpath->query('//p');

    foreach ($ps as $p) {
        // Tag only the paragraphs that contain at least one image.
        // Note: setAttribute() replaces any class the <p> already has.
        if ($p->getElementsByTagName('img')->length > 0) {
            $p->setAttribute('class', 'image-content');
        }
    }

    $htmlFinal = $parse->saveHTML();

If the `p` tags may already have a class set before you parse the DOM, you should combine the two examples so that the new class is appended to the existing value instead of overwriting it.
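One way to combine the two examples is sketched below. The `addClass()` helper and the sample `$html` are hypothetical names introduced here for illustration, not part of the DOM extension, and the XPath expression `//p[.//img]` is used in place of the `getElementsByTagName()` check. Treat it as a minimal sketch rather than a drop-in solution.

```php
<?php
// Hypothetical helper: appends $class to an element's class list
// without overwriting existing classes or adding a duplicate.
function addClass(\DOMElement $el, string $class): void
{
    // Split the current attribute value on whitespace, dropping empty entries.
    $classes = preg_split('/\s+/', $el->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (!in_array($class, $classes, true)) {
        $classes[] = $class;
    }
    $el->setAttribute('class', implode(' ', $classes));
}

// Sample input (an assumption for illustration); the second <p> already has a class.
$html = '<p>Text only.</p><p class="intro"><img src="image.png"></p>';

$parse = new \DOMDocument();
$parse->loadHTML($html);

$xpath = new \DOMXPath($parse);
// p[.//img] selects only the paragraphs that have an <img> descendant.
foreach ($xpath->query('//p[.//img]') as $p) {
    addClass($p, 'image-content');
}

$htmlFinal = $parse->saveHTML();
```

With this input, the paragraph that already had `class="intro"` keeps that class and gains `image-content`, while the text-only paragraph is left untouched.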

baao
  • There won't always be links attached to the images, is that going to be a problem? – Joe Shanahan Jun 19 '15 at 23:11
  • Sorry, I edited the answer. No, it will find all images within a `p` tag – baao Jun 19 '15 at 23:11
  • @Joe I have edited my answer to add a solution which will attach the class to the containing `p` element – baao Jun 19 '15 at 23:31
  • Hey, thanks. This *sort-of* worked. `saveHTML()` seems to wrap the HTML in `<html>` and `<body>` tags, as well as a doctype. I had to do `$parse->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);` to have it not include those, as suggested here: http://stackoverflow.com/questions/4879946/how-to-savehtml-of-domdocument-without-html-wrapper Was that intended in your answer? – Joe Shanahan Jun 20 '15 at 02:44
  • Also, `class="image-content"` is being set on the first `<p>` of every post. Any idea why that's happening? This is the current code: http://pastebin.com/5i2yhiDp – Joe Shanahan Jun 20 '15 at 02:53
  • OK, this is very strange: the first `<p>` element seems to register as containing **all** images in the post! – Joe Shanahan Jun 20 '15 at 03:00