
I need to parse this HTML code into an array of strings, so I can then add it into a database. Here is the HTML code I am parsing:

http://gyazo.com/eab3a140264d354060268a97ae8fa6de

The class "market_listing_table_header" at the top seems to define what the rest of the page will display. Each element with the class "market_listing_row_link" is one of 100 rows in a list, and I also get 40 more such lists of 100.

What I need is the item name, for example "Souvenir Desert Eagle | Hand Cannon (Well Worn)", for each of these rows. It sits inside the "market_listing_item_name_block". The ids run from "result_0_name" to "result_100_name" and then start again at 0, for the ~4000 listings on the page.

If possible I would also like to get the src attribute (src="get this link") of the "result_0_image" element to go with the "result_0_name".

This is the code I'm using now:

$str = '$html';
    $DOM = new DOMDocument;
    $DOM->loadHTML($str);

   $items = $DOM->getElementsByTagName('market_listing_item_block');

   //just displaying the items for now, for testing,
   //though I may need help putting the data in an array as well.
   for ($i = 0; $i < $items->length; $i++)
        echo $items->item($i)->nodeValue . "<br/>";

I have tried different values in the "getElementsByTagName('???');" call, but I can't work out what it should be to get the section that I want. Any help would be great, thanks.

Mitch8910

1 Answer


getElementsByTagName is the wrong function here. A tag name is the element name, such as a for an anchor (<a href="xy">...</a>). What you need instead is to select elements by their class, in this case market_listing_item_name.
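To illustrate the difference between a tag name and a class, here is a minimal sketch on a made-up fragment (the markup is hypothetical, not the real page). Passing a class name to getElementsByTagName simply returns an empty list, with no error.

```php
<?php
// Hypothetical fragment for illustration only; not the real page markup.
$html = '<p><a class="market_listing_row_link" href="xy">'
      . '<span class="market_listing_item_name">Example item</span></a></p>';

$dom = new DOMDocument;
$dom->loadHTML($html);

// Matches by element name: finds the single <a> element.
$anchors = $dom->getElementsByTagName('a');
echo $anchors->length, "\n";

// A class name is not a tag name, so nothing matches.
$byClassName = $dom->getElementsByTagName('market_listing_item_name');
echo $byClassName->length, "\n";
```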

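Additionally, you need to use double quotes in the first line: PHP does not interpolate variables inside single quotes, so '$html' is the literal text $html rather than the content of the variable.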
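A quick way to see the quoting difference, as a small sketch with a stand-in value for $html:

```php
<?php
$html = '<p>Hello</p>';   // stand-in for the real page source

$single = '$html';        // no interpolation: the literal five characters $html
$double = "$html";        // interpolation: a copy of the variable's content

var_dump($single === '$html');   // the literal text, not the markup
var_dump($double === $html);     // same content as the variable
```

Since $html is already a string, passing it to loadHTML directly has the same effect as "$html".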

Based on this answer, the right code for you should be:

$str = "$html"; // double quotes, so $str holds the content of $html
$DOM = new DOMDocument;
$DOM->loadHTML($str);

// Select elements by class with XPath instead of by tag name.
$finder = new DOMXPath($DOM);
$items = $finder->query("//*[contains(@class, 'market_listing_item_name')]");

// Just displaying the items for now, for testing.
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "<br/>";
}
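To get the names and image links into an array, one possible approach is sketched below. The sample markup is an assumption based on the description in the question (a row link containing an img and a name element), so the real page may differ. Note that contains(@class, 'market_listing_item_name') also matches market_listing_item_name_block, so the sketch compares the class as a whole token instead.

```php
<?php
// Hypothetical stand-in for the real page; the structure is assumed from the question.
$html = <<<HTML
<p>
  <a class="market_listing_row_link" href="#">
    <img id="result_0_image" src="http://example.com/0.png">
    <span class="market_listing_item_name_block">
      <span id="result_0_name" class="market_listing_item_name">Souvenir Desert Eagle | Hand Cannon (Well Worn)</span>
    </span>
  </a>
  <a class="market_listing_row_link" href="#">
    <img id="result_1_image" src="http://example.com/1.png">
    <span class="market_listing_item_name_block">
      <span id="result_1_name" class="market_listing_item_name">Example item</span>
    </span>
  </a>
</p>
HTML;

libxml_use_internal_errors(true);   // real-world HTML often triggers parser warnings
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match the class as a whole token, so market_listing_item_name_block is not matched.
$nameQuery = ".//*[contains(concat(' ', normalize-space(@class), ' '), ' market_listing_item_name ')]";

$listings = [];
foreach ($xpath->query("//a[contains(@class, 'market_listing_row_link')]") as $row) {
    // Both queries are relative to the current row.
    $nameNode = $xpath->query($nameQuery, $row)->item(0);
    $imgNode  = $xpath->query(".//img", $row)->item(0);
    if ($nameNode === null) {
        continue;   // skip rows without a name element
    }
    $listings[] = [
        'name'  => trim($nameNode->textContent),
        'image' => $imgNode !== null ? $imgNode->getAttribute('src') : null,
    ];
}

print_r($listings);
```

Iterating over the rows, rather than looking up ids such as result_0_name, avoids relying on the ids being unique, since the question says they restart for each set of listings.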
ByteHamster
  • Perhaps it's my loop to echo the results, but I'm not getting anything printed. Can you see an error in the for loop? – Mitch8910 Apr 19 '15 at 18:05
  • Oh, you used the wrong class in your code sample. `market_listing_item_block` should be `market_listing_item_name_block` - this is why I also used the wrong class. – ByteHamster Apr 19 '15 at 18:57
  • Oh yes, that was wrong. I just tried it with a few, I noticed there was actually another sub-class, so I have tried with "market_listing_item_name" and "market_listing_item_name_block", both with no yield. – Mitch8910 Apr 19 '15 at 19:12
  • `market_listing_item_name` is better, just tested the code. It is working fine. – ByteHamster Apr 19 '15 at 19:20
  • **I found your problem!** You are using `$str = '$html';` and it should be `$str = "$html";`. With single quotes, `$str` will simply be the literal text `$html` and not the **content** of the variable `$html`. – ByteHamster Apr 19 '15 at 19:21
  • I feel like a pain, but it's still not working. I'm not getting any errors, but it's also not echoing any of the values. Thanks for the help so far, I may open another question on this. – Mitch8910 Apr 20 '15 at 05:02
  • Are you sure that `$html` really contains the html you expect? – ByteHamster Apr 20 '15 at 09:32
  • Well, when I echo $html, the page shows a list of all the item pictures and names. The screenshot I showed you is from Inspect Element > HTML of that page with the $html echoed. – Mitch8910 Apr 20 '15 at 09:43
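
One way to narrow down where the data gets lost is to check each step separately. This is only a diagnostic sketch with a stand-in value for $html. In the real script, $html would be whatever the page fetch returned.

```php
<?php
// Stand-in value; in the real script $html comes from wherever the page is fetched.
$html = '<span class="market_listing_item_name">Example item</span>';

// Confirm there is actually markup to parse.
var_dump(is_string($html), strlen($html));

// Collect parser problems instead of printing warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$loaded = $dom->loadHTML($html);
$errors = libxml_get_errors();
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$items = $xpath->query("//*[contains(@class, 'market_listing_item_name')]");

echo "loaded: ", var_export($loaded, true), "\n";
echo "parse errors: ", count($errors), "\n";
echo "matches: ", $items->length, "\n";
```

If the match count is 0 while the echoed page still shows the items, comparing the raw string in $html with what Inspect Element shows would be a reasonable next check.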