1

I'm trying to parse the source code of a website that looks like this:

   <html>
    ...
    ...
    ...
    <div class="menu_body">
    <a href="url" onclick="_gaq.push([parameters]);location.href=this.href+'?channelId=287&date='+dateOfMonth;return false;"> <img src="img"></a>
    <a href="url"><img src="img"></a>
    <a href="url"><img src="img"></a>
    </div>
    ...
    ...
    <div class="menu_body">
    <a href="url"><img src="img"></a>
    <a href="url"><img src="img"></a>
    <a href="url"><img src="img"></a>
    </div>
    ...
    ...
    ...
    </html>

What I want to do, if possible, is grep out all the lines that contain channelId. Is that possible?

Gary
  • How are you currently parsing this HTML? Could we see your PHP code? – BoltClock Feb 09 '11 at 02:30
  • It would be a lot easier to use JavaScript – KJYe.Name Feb 09 '11 at 02:34
  • And of course you have permission from this site to use their content? –  Feb 09 '11 at 02:38
  • Currently I'm just saving their source code into a variable using `file_get_contents`. And yes, I have permission. Is there a simpler way for me to grep all the lines that contain `channelId`? I'm sort of new to PHP and not sure what functions there are – Gary Feb 09 '11 at 02:41
  • By lines, do you really mean lines? You want the whole line, e.g. starting at position 0 and ending with a newline character? You don't care about the HTML elements? – Gordon Feb 09 '11 at 08:07
  • Possible duplicate of [Need to get line number in text file matching string](http://stackoverflow.com/questions/4926680/need-to-get-line-number-in-text-file-matching-string) – Gordon Feb 09 '11 at 08:12

2 Answers

0

Sounds like you want some sort of HTML parser:

http://simplehtmldom.sourceforge.net/    <-- PHP
http://htmlparser.sourceforge.net/         <--- Java

See also: Robust and Mature HTML Parser for PHP
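
As a rough illustration of the parser approach, here is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath rather than the libraries linked above (a substitution made only to keep the example dependency-free). The sample markup and the `url1`/`url2` hrefs are hypothetical stand-ins modelled on the question; a real page would be loaded with `file_get_contents`.

```php
<?php
// Hypothetical sample markup modelled on the question.
// In practice this string would come from file_get_contents($url).
$html = <<<'HTML'
<html><body>
<div class="menu_body">
<a href="url1" onclick="location.href=this.href+'?channelId=287&amp;date='+dateOfMonth;return false;"><img src="img"></a>
<a href="url2"><img src="img"></a>
</div>
</body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Select every <a> element whose onclick attribute mentions channelId.
$xpath = new DOMXPath($doc);
$links = $xpath->query('//a[contains(@onclick, "channelId")]');

$hrefs = [];
foreach ($links as $a) {
    $hrefs[] = $a->getAttribute('href');
}
print_r($hrefs);
```

This selects elements rather than text lines, so it keeps working if the site reflows its markup onto different lines.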

Community
Nick
0

Read the HTML as a string using cURL or file_get_contents, and then use preg_match (or preg_match_all, since preg_match stops at the first match). http://php.net/manual/en/function.preg-match.php
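
For the literal "grep the lines" reading of the question, a minimal sketch could split the string into lines and filter them with `preg_grep`, then use `preg_match` on each hit to pull out the id. The sample string is a hypothetical stand-in for the fetched page, and the `channelId=(\d+)` pattern assumes the id is always numeric, as it is in the question's example.

```php
<?php
// Hypothetical sample; in practice: $html = file_get_contents($url);
$html = "<div class=\"menu_body\">\n"
      . "<a href=\"url\" onclick=\"location.href=this.href+'?channelId=287&date='+dateOfMonth;return false;\"><img src=\"img\"></a>\n"
      . "<a href=\"url\"><img src=\"img\"></a>\n"
      . "</div>\n";

// Split on any newline style, then keep only the lines containing "channelId".
$lines = preg_split('/\R/', $html);
$matches = array_values(preg_grep('/channelId/', $lines));

// Pull the numeric id out of each matching line (assumes digits only).
$ids = [];
foreach ($matches as $line) {
    if (preg_match('/channelId=(\d+)/', $line, $m)) {
        $ids[] = $m[1];
    }
}
print_r($matches);
print_r($ids);
```

This works on raw text and is sensitive to how the page breaks its lines.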

DhruvPathak