
Possible Duplicate:
How to parse and process HTML with PHP?

I am brand new to PHP, only a couple of hours in, and I am trying to understand searching and finding. Let's say I want to extract the rank of Diablo 3 from Amazon's top seller list here. There I can search for the string "Diablo III" or similar to find the following block (sorry about the formatting):

 http://www.amazon.com/Diablo-III-Standard-Edition-Pc/dp/B00178630A/ref=zg_bs_4924894011_1
 "><img src="http://ecx.images-amazon.com/images/I/41kXCp%2BUyeL._SL160_SL160_.jpg" alt="Diablo III: Standard Edition" title="Diablo III: Standard Edition" onload="if (typeof      uet == 'function') { uet('af'); }"/></a></div></div><div class="zg_itemRightDiv_normal"><div class="zg_rankLine"><span class="zg_rankNumber">1.</span><span class="zg_rankMeta"></span></div><div class="zg_title"><a  href="

 http://www.amazon.com/Diablo-III-Standard-Edition-Pc/dp/B00178630A/ref=zg_bs_4924894011_1
 ">Diablo III: Standard Edition</a></div><div class="zg_byline">by Blizzard Entertainment

Now, I want to try to extract the rank, which is defined in this part <span class="zg_rankNumber">1.</span> and is currently 1.

Could someone please advise on the best way to extract that number, so that if it falls to second, third or whatever place (up to 20) I will still be able to extract it?

I have looked a bit into preg_match and regex but I couldn't quite understand the use.

Krøllebølle
  • Maybe you could try something like [this](http://www.regextester.com/) to help you with your regex. However more than likely you should be using [DOM](http://www.php.net/dom) for this – Mike Jun 11 '12 at 19:57

2 Answers

1
// Capture the text of every zg_rankNumber span; $matches[1] holds the captured ranks, e.g. "1.".
preg_match_all('/<span class="zg_rankNumber">(.*?)<\/span>/is', $string, $matches);
print_r($matches);

It would take a couple of hours to write the exact code, but I can describe the logic:

  1. Extract all "" from the html and store it in an array.
  2. Loop through the array and check for the title.
  3. If you found the title, extract the rank from that array element
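
The three steps above can be sketched with a single regular expression that pairs each rank with the title that follows it. This is only a rough sketch, assuming the markup looks like the block quoted in the question (a `zg_rankNumber` span followed by a link inside `<div class="zg_title">`). The function name `findRankByTitle` and the second title in the sample data are made up for illustration, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: rank of the first item whose title contains $needle, or null.
function findRankByTitle(string $html, string $needle): ?int
{
    // Step 1: collect every (rank, title) pair. Assumes each rank span is
    // followed by its title link inside <div class="zg_title">, as in the
    // markup quoted in the question.
    $pattern = '~<span class="zg_rankNumber">\s*(\d+)\.?\s*</span>.*?'
             . '<div class="zg_title">\s*<a[^>]*>(.*?)</a>~is';
    preg_match_all($pattern, $html, $items, PREG_SET_ORDER);

    // Steps 2 and 3: loop over the pairs and return the rank of the first match.
    foreach ($items as $item) {
        if (stripos($item[2], $needle) !== false) {
            return (int) $item[1];
        }
    }
    return null;
}

// Sample data modelled on the question's snippet; the second title is invented.
$sample = '<div class="zg_rankLine"><span class="zg_rankNumber">1.</span><span class="zg_rankMeta"></span></div>'
        . '<div class="zg_title"><a href="#">Diablo III: Standard Edition</a></div>'
        . '<div class="zg_rankLine"><span class="zg_rankNumber">2.</span><span class="zg_rankMeta"></span></div>'
        . '<div class="zg_title"><a href="#">Some Other Game</a></div>';

var_dump(findRankByTitle($sample, 'Diablo III')); // int(1)
```

The match on the title is case-insensitive, so "diablo" works as well. A regex like this is fragile if Amazon changes the markup, which is why a DOM-based approach is usually preferred.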
Habeeb
  • This is somewhat what I want. However, I want only the rank of a given game, say Diablo. This code will return every occurrence. I am looking for some way to distinguish between these occurrences. – Krøllebølle Jun 11 '12 at 20:08
  • Strange, it doesn't work inline somehow. Anyway, here it is: http://www.amazon.com/Best-Sellers-Video-Games-PC-compatible/zgbs/videogames/4924894011/ref=zg_bs_4924894011_pg_1?_encoding=UTF8&pg=1 – Krøllebølle Jun 11 '12 at 20:15
  • Do you want to extract all the ranks along with respective game titles or rank of a given game title? – Habeeb Jun 11 '12 at 20:19
  • Only the rank of a given title. The title should be given as a string, e.g. "Diablo III" or "Diablo" or similar. – Krøllebølle Jun 11 '12 at 20:22
  • It would take a couple of hours to write the exact code, but I can describe the logic: 1. Extract all "" from the HTML and store it in an array. 2. Loop through the array and check for the title. 3. If you find the title, extract the rank from that array element. – Habeeb Jun 11 '12 at 20:36
1

You can start by using the Simple HTML DOM Parser. If you want to find this:

<span class="zg_rankNumber">

you can do it like this: ($str contains the html data)

$html = str_get_html($str);
echo $html->find("span[class='zg_rankNumber']",0)->innertext;

EDITED:

If you want to get the rank of a specific game (Diablo III), then based on the formatting you can call:

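// The rank span is not inside the <img>: climb from the image to the left-hand div, then search its right-hand sibling.
echo $html->find("img[title^='Diablo III']",0)->parent()->parent()->parent()->next_sibling()->find("span[class='zg_rankNumber']",0)->innertext;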
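
For comparison, the same lookup can be sketched with PHP's built-in DOM extension, which the comment on the question also points to. This is a minimal sketch, assuming that each title link sits in `<div class="zg_title">` and that the rank is in the `<div class="zg_rankLine">` just before it, as in the quoted markup. The function name `rankFromDom` and the second sample title are invented for illustration.

```php
<?php
// Hypothetical helper: rank of the first item whose title contains $needle, or null.
function rankFromDom(string $html, string $needle): ?int
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid; collect warnings quietly
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Walk every title link and stop at the first one that mentions $needle.
    foreach ($xpath->query('//div[@class="zg_title"]/a') as $link) {
        if (stripos($link->textContent, $needle) === false) {
            continue;
        }
        // From the link, go up to the title div, then back to the nearest
        // preceding zg_rankLine div and read its zg_rankNumber span.
        $rank = $xpath->query(
            '../preceding-sibling::div[@class="zg_rankLine"][1]/span[@class="zg_rankNumber"]',
            $link
        )->item(0);
        return $rank === null ? null : (int) $rank->textContent; // "1." casts to 1
    }
    return null;
}

// Sample data modelled on the question's snippet; the second title is invented.
$sample = '<div class="zg_itemRightDiv_normal">'
        . '<div class="zg_rankLine"><span class="zg_rankNumber">1.</span><span class="zg_rankMeta"></span></div>'
        . '<div class="zg_title"><a href="#">Diablo III: Standard Edition</a></div></div>'
        . '<div class="zg_itemRightDiv_normal">'
        . '<div class="zg_rankLine"><span class="zg_rankNumber">2.</span><span class="zg_rankMeta"></span></div>'
        . '<div class="zg_title"><a href="#">Some Other Game</a></div></div>';

var_dump(rankFromDom($sample, 'Diablo III')); // int(1)
```

Starting from the title and walking back to the rank keeps the search tied to one item, so it returns only the rank of the requested game rather than every `zg_rankNumber` on the page.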
  • Thank you, but this will give me all the occurrences, right? As with Habeeb's answer, I want to narrow it down to a given game (e.g. Diablo 3). So I want to do what you did here ONLY for a given game on that site. So search for game --> find rank. – Krøllebølle Jun 11 '12 at 20:21
  • No - this will give you the FIRST occurrence (=0); if you want all occurrences of 'zg_rankNumber' you can drop the ",0" and handle the resulting array. You can combine your own jQuery-like selectors as you like. –  Jun 11 '12 at 20:28