I'm parsing some iTunes links with the Simple HTML DOM parser in PHP. With most of the links it works perfectly, but with others of exactly the same type it doesn't. I need the "img" tag and its "src-swap-high-dpi" attribute. It drives me nuts. This is part of my PHP code:

$url = "https://itunes.apple.com/us/podcast/id278981407";
$htmlContent = str_get_html(file_get_contents($url));

foreach ($htmlContent->find("img") as $element) {
    $value = $element->getAttribute("src-swap-high-dpi");
    echo $value;
}

For example, I can parse the following links:

https://itunes.apple.com/us/podcast/id201671138

https://itunes.apple.com/us/podcast/id523121474

https://itunes.apple.com/us/podcast/id152249110

But not this one, for example:

https://itunes.apple.com/us/podcast/id278981407

I do not get any output.

Edit:

The new code doesn't work either. Very strange. This is my complete code now:

<?php
ini_set("display_errors", 1); error_reporting(E_ALL);
require_once('simple_html_dom.php');

$url = "https://itunes.apple.com/us/podcast/id278981407";

$htmlContent = str_get_html(file_get_contents($url));


foreach ($htmlContent->find("div.artwork") as $div) {
    $value = $div->find("img", 0)->getAttribute("src-swap-high-dpi");
    echo $value . "<br/>";
}

?>

I get this output:

Fatal error: Call to a member function find() on a non-object in /home/www/whatever/delete.php on line 10

Line 10 is the line starting with "foreach". Your code works fine with the links I listed above as working, but as soon as I use one of the links that doesn't work, I get the error message above.
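
The fatal error means `$htmlContent` is not an object when `find()` is called, so the parse step evidently returned something else. A minimal sketch of a guard around the loop follows; `collectArtworkUrls` is a hypothetical helper name, and it only assumes the parsed document offers the same `find()` and `getAttribute()` calls used above.

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM.
// $htmlContent is whatever str_get_html() returned.
function collectArtworkUrls($htmlContent)
{
    // A non-object here is what triggers the fatal error on find().
    if (!is_object($htmlContent)) {
        return null;
    }

    $urls = array();
    foreach ($htmlContent->find('div.artwork') as $div) {
        $img = $div->find('img', 0);
        // Skip artwork blocks that contain no image.
        if (is_object($img)) {
            $urls[] = $img->getAttribute('src-swap-high-dpi');
        }
    }
    return $urls;
}
```

Calling it as `collectArtworkUrls(str_get_html($html))` and checking for `null` turns the crash into a condition the script can report.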

tyler

1 Answer

I think this is one of the cases where Simple DOM gets a bit confused and you need to provide it with a parent element:

$url = "https://itunes.apple.com/us/podcast/id278981407";
$htmlContent = str_get_html(file_get_contents($url));
foreach($htmlContent->find("div.artwork") as $div) {
   $value = $div->find("img",0)->getAttribute("src-swap-high-dpi");
   echo $value."<br/>";
}

UPDATE

Here are the results using the above fragment:

http://a3.mzstatic.com/us/r30/Podcasts/v4/61/cc/7f/61cc7f25-131f-7616-6549-5553e6444b87/mza_7489225285918350214.150x150-75.jpg
http://a2.mzstatic.com/us/r30/Podcasts6/v4/04/a9/64/04a964d7-7c10-72d6-871b-97619cf89066/mza_1416781107029663068.150x150-75.jpg
http://a5.mzstatic.com/us/r30/Podcasts4/v4/bb/a6/f4/bba6f4b6-eeab-d7d9-8591-adb2bd277ccb/mza_5223368352447971673.150x150-75.jpg
http://a1.mzstatic.com/us/r30/Podcasts5/v4/aa/54/16/aa541600-cc8b-772b-9c0a-824efe8fdc42/mza_6772270613386652594.150x150-75.jpg
http://a2.mzstatic.com/us/r30/Podcasts3/v4/95/3d/2f/953d2f75-c2c2-4815-a752-f30fdcc0b9fb/mza_9037746738018570312.150x150-75.jpg
http://a4.mzstatic.com/us/r30/Podcasts4/v4/a2/1c/f5/a21cf5a4-2d8d-1ed7-983f-1c90f2f4f948/mza_7120473049241631392.340x340-75.jpg
http://a2.mzstatic.com/us/r30/Podcasts4/v4/5d/21/8d/5d218d2a-2980-0ac9-0bc7-9321ea6eb334/mza_6358466742996313573.150x150-75.jpg
http://a1.mzstatic.com/us/r30/Podcasts/b2/bb/bf/ps.ykmejwzs.150x150-75.jpg
http://a4.mzstatic.com/us/r30/Podcasts6/v4/17/ea/31/17ea3187-ef8c-4756-e488-0c65adced988/mza_7931750363714403933.150x150-75.jpg
http://a1.mzstatic.com/us/r30/Podcasts2/v4/0b/3c/7d/0b3c7d2b-19bf-f7a2-7c50-ca15338b8316/mza_2792239161425784587.150x150-75.jpg

Can you verify that you're not getting any errors at all? For example, write some garbage characters in your PHP file and see whether PHP shows an error. If it doesn't, try adding this to your .htaccess file:

<IfModule mod_php5.c>
   # display errors (php_flag is the directive for boolean settings)
   php_flag display_errors on
</IfModule>

UPDATE 2

$url = "https://itunes.apple.com/us/podcast/id278981407";

$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,FALSE);
$html = curl_exec($ch);
curl_close($ch);

//$htmlContent = str_get_html(file_get_contents($url));
$htmlContent = str_get_html($html);
foreach($htmlContent->find("div.artwork") as $div) {
   $value = $div->find("img",0)->getAttribute("src-swap-high-dpi");
   echo $value."<br/>";
}

The reason I didn't use Simple DOM's file_get_html is that it simply uses file_get_contents internally.
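
Because `curl_exec()` returns `false` on failure, checking that result before parsing makes it easier to tell a download problem from a parser problem. This is a minimal sketch; `fetchHtml` is a hypothetical helper name, and unlike the snippet above it leaves certificate verification at its default.

```php
<?php
// Hypothetical helper: download a URL and fail loudly instead of
// handing false or an empty string to the parser.
function fetchHtml($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any

    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }
    if ($html === '') {
        throw new RuntimeException('Download returned an empty body.');
    }
    return $html;
}
```

The returned string can then be passed to `str_get_html()` in place of `$html` above.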

  • I don't get any output. – tyler Sep 15 '14 at 05:19
  • Is your error reporting on? Put `ini_set("display_errors",1); error_reporting(E_ALL);` at the top of your script. –  Sep 15 '14 at 05:21
  • I did. No error at all. Have you tried to parse the given ID number? It doesn't work :( The strange thing is also that the script does not continue. It seems to get stuck in the foreach loop. – tyler Sep 16 '14 at 01:30
  • In fact I tested my code exactly as it is and I really got results. See my _updated answer_ for what I got. –  Sep 16 '14 at 02:03
  • This is really weird, since I used your problematic URL to produce the above results. May I ask which version of Simple DOM parser you have? Mine is _1.11_ (not the latest). Open the Simple DOM file and find the `* @version x.xx ($Rev: 184 $)` line; I want the _x.xx_. –  Sep 16 '14 at 03:24
  • My version is 1.5 (`* @version 1.5 ($Rev: 196 $)`). This problem really drives me nuts. I think it has something to do with the "str_get_html" function. When I try to echo its result I don't get anything, but with podcast IDs that normally work it does return the page (it looks as if the site is opened). "file_get_contents" returns the page even for the ID that normally doesn't work, so the culprit must be "str_get_html". The error message mentioned above indicates that "htmlContent" could be empty, so maybe "str_get_html" is not executed?! I even tried it on a different computer. – tyler Sep 16 '14 at 03:40
  • So `file_get_contents($url)` might return empty with the problematic URL? This is a bit weird since it works with other links; otherwise I would say that [allow_url_fopen](http://php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen) is disabled. Nevertheless, check my updated answer for a `cURL` solution that bypasses SSL verification. –  Sep 16 '14 at 03:54
  • _A side note:_ Just out of curiosity, I downloaded the latest version, _1.5_, of Simple DOM and it does indeed produce the above error. It seems to be a bug in the latest version that does not occur in my older version :) You can of course downgrade to _1.11_ and compare. –  Sep 16 '14 at 04:01
  • Awesome. I'll go and check this... YES! AWESOME! I downgraded to 1.11 and now it works with your code, vlzvl! Thank you so much! – tyler Sep 16 '14 at 04:07
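
The comments narrow the failure down to `str_get_html` returning a non-object for this one page under version 1.5. One possible explanation, which the thread does not confirm, is that this version refuses input longer than its `MAX_FILE_SIZE` constant and returns `false`. The sketch below is a diagnostic under that assumption; `describeParserInput` is a hypothetical helper, and the default limit of 600000 bytes is an assumed value that should be checked against the `define('MAX_FILE_SIZE', ...)` line in your copy of the library.

```php
<?php
// Hypothetical diagnostic helper, not part of Simple HTML DOM.
// $limit is an assumed value for the library's MAX_FILE_SIZE constant.
function describeParserInput($html, $limit = 600000)
{
    if (!is_string($html) || $html === '') {
        return 'empty: the download failed or returned nothing';
    }
    $size = strlen($html);
    if ($size > $limit) {
        return "too large: $size bytes exceeds the assumed limit of $limit";
    }
    return "ok: $size bytes is within the assumed limit of $limit";
}

// Example calls with synthetic input rather than a real download.
echo describeParserInput(str_repeat('a', 700000)), "\n";
echo describeParserInput('<html></html>'), "\n";
```

If the problematic page reports as too large while the working ones do not, that would support this explanation.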