Get images from article in Joomla with php

Question

I'm trying to edit a plugin that I use to add Open Graph meta tags to the header. The problem is that it only lets me choose one picture for the whole site. This is what I've done:

preg_match_all('/<img .*?(?=src)src=\"([^\"]+)\"/si', $hdog_base, $image);

if (strlen($hdog_base) <= 25) 
{
    if (substr($image[0], 0, 4) != 'http') 
    {
        $image[0] = JURI::base().$image[0]; 
    }
    $hdog_image_tmp = $image[0];
}
else
{
    if (substr($image[1], 0, 4) != 'http') 
    {
        $image[1] = JURI::base().$image[1]; 
    }
    $hdog_image_tmp = $image[1];
}
$hdog_image =   '<meta property="og:image" content="'.$hdog_image_tmp.'" />
';

$hdog_base is the current webpage I'm on. The first if-statement should show the very first picture, which is the logo (used e.g. for the homepage), and the else should show the second picture (which would be different on each page). But the result always comes out like this, whether I'm on the homepage or anywhere else on the site:

<meta property="og:image" content="http://mysite.com/Array" />

Any suggestions?

Thanks in advance,

Update: My biggest mistake is that I am trying to find the images in a URL, not in the actual webpage; $hdog_base is just the link. So how would I go about getting the contents of the current page as a string, instead of $hdog_base, which is nothing but a link?

UPDATE, SOLVED:

I used

$buffer = JResponse::getBody();

to get the webpage in HTML

and then DOM for the rest

$doc = new DOMDocument();
@$doc->loadHTML($buffer);

$images = $doc->getElementsByTagName('img');
if (strlen($hdog_base) <= 26) 
{
    $image = $images->item(0)->getAttribute('src');
} 
else 
{
    $image = $images->item(1)->getAttribute('src');
}
if (substr($image, 0, 4) != 'http') $image = JURI::base().$image;
$hdog_image =   '<meta property="og:image" content="'.$image.'" />
';

Thanks a lot cpilko for your help! :)

cpilko · Accepted Answer · 2012-10-18T01:51:51.523

3

preg_match_all fills its third argument with a multidimensional array: $image[0] holds every full match and $image[1] holds every capture of the first subpattern. In your code, $image[n] is therefore an array. If you cast an array to a string in PHP, as you are doing, you get the text Array.
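To make the array shape concrete, here is a minimal sketch that runs the pattern from the question against a small piece of markup. The $html string and its URLs are made up for illustration; they are not taken from the site in question.

```php
<?php
// Hypothetical sample markup standing in for the real page HTML.
$html = '<img src="images/logo.png" alt="Logo"><p>Text</p>'
      . '<img src="http://example.com/photo.jpg">';

// Same pattern as in the question.
preg_match_all('/<img .*?(?=src)src=\"([^\"]+)\"/si', $html, $image);

// $image[0] is an array of full matches,
// $image[1] is an array of the captured src values.
// A single src is therefore reached with two indexes.
$firstSrc  = isset($image[1][0]) ? $image[1][0] : null;
$secondSrc = isset($image[1][1]) ? $image[1][1] : null;
```

Concatenating $image[0] or $image[1] directly into a string is what produces the literal text Array.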

EDIT: Using a regex to parse HTML isn't ideal. You are better off doing it with DOMDocument:

$doc = new DOMDocument();
@$doc->loadHTML($hdog_base);

$images = $doc->getElementsByTagName('img');
if (strlen($hdog_base) <= 25) {
    $image = $images->item(0)->getAttribute('src');
} else {
    $image = $images->item(1)->getAttribute('src');
}
// $image is a string here; prepend the base URL to relative paths.
if (substr($image, 0, 4) != 'http') $image = JURI::base().$image;
$hdog_image =   '<meta property="og:image" content="'.$image.'" />
';
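One thing to watch for with this approach: DOMNodeList::item() returns null when the index is out of range, so calling getAttribute() on the result fails on pages with fewer images than expected. A rough sketch of a guarded version follows. The function name pickImageSrc and its $baseUrl parameter are hypothetical, used here only so the example runs without Joomla; in the plugin the base URL would come from JURI::base().

```php
<?php
// Hypothetical helper, not part of the plugin or of Joomla.
function pickImageSrc($html, $index, $baseUrl)
{
    if ($html === '') {
        return null; // nothing to parse
    }

    $doc = new DOMDocument();
    // Suppress warnings from imperfect real-world markup, as above.
    @$doc->loadHTML($html);

    $images = $doc->getElementsByTagName('img');

    // item() returns null when the index is out of range,
    // so fall back to the first image before giving up.
    $node = $images->item($index);
    if ($node === null) {
        $node = $images->item(0);
    }
    if ($node === null) {
        return null; // the page has no <img> at all
    }

    $src = $node->getAttribute('src');
    if (substr($src, 0, 4) != 'http') {
        $src = $baseUrl . $src; // make relative paths absolute
    }
    return $src;
}
```

The caller can then skip the og:image tag entirely when the function returns null instead of emitting an empty or broken URL.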

edited Oct 18 '12 at 01:51

answered Oct 17 '12 at 17:52

cpilko


The result was this: ` ` – joakim.g Oct 17 '12 at 17:58
Your regex isn't matching anything then. You can troubleshoot this in an online regex tester like this one: http://www.regextester.com/ – cpilko Oct 17 '12 at 18:03
On doing a little more research, a regex is the wrong tool for the job. You should be using `DOMDocument`. See the 2nd and 3rd answers of SO question for details http://stackoverflow.com/questions/138313/how-to-extract-img-src-title-and-alt-from-html-using-php – cpilko Oct 17 '12 at 18:09
Thanks I tried your code, but the result I get is a fatal error: Fatal error: Cannot use object of type DOMNodeList as array on line 45. Which is this: $image = $images[1]->getAttribute('src'); – joakim.g Oct 17 '12 at 18:29
Getting a different fatal error: Fatal error: Call to a member function getAttribute() on a non-object on line 48. This code: $image = $images->item(1)->getAttribute('src'); – joakim.g Oct 18 '12 at 10:15
Are you sure there is HTML in `$hdog_base`? This code works for me when I run this code with `$hdog_base = file_get_contents('local_file.html');` – cpilko Oct 18 '12 at 12:21
The $hdog_base is the link to the current webpage (http://www.mysite.com/...), I think I wasn't clear enough about that in the original post, sorry about that. – joakim.g Oct 18 '12 at 12:33
To do what you're trying to do, you need to have your HTML in that variable. You need to be working with a Joomla function that gives you this information pre-render, rather than scraping your own site to get it. – cpilko Oct 18 '12 at 12:46
The problem is that I am completely new to Joomla API, I have no idea how to do that, any suggestions? – joakim.g Oct 18 '12 at 12:48
Get one of these, http://www.packtpub.com/books/joomla%21 or open a new question where you ask "How can I get these images from Joomla" and explain where they exist in the system. – cpilko Oct 18 '12 at 13:00
