I'm grabbing data from a published Google spreadsheet, and all I want is the information inside the content div (<div id="content">...</div>).
I know that the content starts with <div id="content">
and ends with </div><div id="footer">.
What's the best / most efficient way to grab the part of the DOM inside that div? I was thinking of a regular expression (see my example below), but it is not working and I'm not sure it is that efficient.
header('Content-type: text/plain');
$foo = file_get_contents('https://docs.google.com/spreadsheet/pub?key=0Ahuij-1M3dgvdG8waTB0UWJDT3NsUEdqNVJTWXJNaFE&single=true&gid=0&output=html&ndplr=1');
$start = '<div id="content">';
$end = '<div id="footer">';
$foo = preg_replace("#$start(.*?)$end#",'$1',$foo);
echo $foo;
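Two things in the snippet above likely explain why it does not work: without the `s` modifier, `.` does not match newlines, and `preg_replace` only rewrites the matched span, so everything before and after it stays in the output. A minimal sketch of a capture-based variant follows, assuming both markers appear literally in the page; the helper name `extract_between` and the inline sample string are hypothetical and stand in for the live spreadsheet.

```php
<?php
// Hypothetical helper: returns the text between two literal markers,
// or null when the markers are not found in that order.
function extract_between(string $html, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters in the markers;
    // the "s" modifier lets "." match newlines as well.
    $pattern = '#' . preg_quote($start, '#') . '(.*?)' . preg_quote($end, '#') . '#s';

    // preg_match() captures only the wanted part instead of rewriting the page.
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Inline sample document (an assumption, not the real spreadsheet output).
$sample = "<div id=\"header\">h</div><div id=\"content\">\n<p>rows</p>\n</div><div id=\"footer\">f</div>";
echo extract_between($sample, '<div id="content">', '</div><div id="footer">');
```

The end marker here includes the closing `</div>`, so the captured text stops before the content div is closed.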
UPDATE
I guess my other question is whether it is simpler and easier to use a regex with fixed start and end points than to parse a DOM that might contain errors and then extract the piece I need. Regex seems like the way to go, but I would love to hear your opinions.
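For comparison, the DOM route can be sketched with PHP's bundled `DOMDocument`, which tends to tolerate malformed markup when libxml errors are collected internally instead of being emitted as warnings. This is only a sketch under assumptions: the helper name `inner_html_by_id` is hypothetical, the input is assumed to be a non-empty string, and the id is assumed to contain no double quotes.

```php
<?php
// Hypothetical helper: returns the inner HTML of the element with the given id,
// or null when the document cannot be parsed or the element is missing.
function inner_html_by_id(string $html, string $id): ?string
{
    // Collect parse errors internally so broken markup does not emit warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    // Assumes $id contains no double quotes (it is placed into the XPath as-is).
    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//*[@id="' . $id . '"]')->item(0);
    if ($node === null) {
        return null;
    }

    // Serialize each child node to get the inner HTML without the wrapper div.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

// Inline sample document (an assumption, not the real spreadsheet output).
$sample = '<div id="header">h</div><div id="content"><p>rows</p></div><div id="footer">f</div>';
echo inner_html_by_id($sample, 'content');
```

The trade-off is roughly this: the regex version depends on the exact marker strings staying the same, while the DOM version depends only on the `id` attribute but pays for parsing the whole page.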