0

Is there a way to dynamically get the text from a particular <tr> tag in a page?

For example, I have a page with a <tr> whose value is "a1". I'd like to get only the text from this <tr> tag and echo it into the page. Is this possible?

Here is the HTML:

<html><tr  id='ieconn2' >
  <td><table width='100%'><tr><td valign='top'><table width='100%'><tr><td><script type="text/javascript"><!--
google_ad_client = "pub-4503439170693445";
/* 300x250, created 7/21/10 */
google_ad_slot = "7608120147";
google_ad_width = 300;
google_ad_height = 250;
//-->
</script>
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script><br>When Marshall and Lily fear they will never get pregnant, they see a specialist who can hopefully help move the process along. Meanwhile, Robin starts her new job.<br><br><b>Source: </b>CBS

<br>&nbsp;</td></tr><tr><td><b>There are no foreign summaries for this episode:</b> <a href='/edit/shows/3918/episode_foreign_summary/?eid=1065002553&season=6'>Contribute</a></td></tr><tr><td><b>English Recap Available: </b> <a href='/How_I_Met_Your_Mother/episodes/1065002553?show_recap=1'>View Here</a></td></tr></table></td><td valign='top' width='250'><div align='left'>
<img  alt='How I Met Your Mother season 6 episode 13' src="http://images.tvrage.com/screencaps/20/3918/1065002553.jpg" width="248"  border='0' >
</div><div align='center'><a href='/How_I_Met_Your_Mother/episodes/1065002553?gallery=1'>6 gallery images</a></div></td></tr></table></td></tr><tr>
  <td background='/_layout_v3/buttons/title.jpg' height='39' width='631' align='center'>
<table width='100%' cellpadding='0' cellspacing='0' style='margin: 1px 1px 1px 1px;'>
<tr>
<td align='left'  style='cursor: pointer;' onclick="SwitchHeader('ieconn3','iehide3','26')"  width='90'>&nbsp;<span style='font-size: 15px;   font-weight: bold; color: black; padding-left: 8px;' id='iehide3'><img src='/_layout_v3/misc/minus.gif' width='26'></span></td>
<td align='center'  style='cursor: pointer;' onclick="SwitchHeader('ieconn3','iehide3','26')" ><h5 class='nospace'>Sponsored Links</h5><a name=''></a></td>

<td align='left' width='90' >&nbsp;</td></tr></table></td>
</tr></html>

All I want to get is this text: "When Marshall and Lily fear they will never get pregnant, they see a specialist who can hopefully help move the process along. Meanwhile, Robin starts her new job. "

BenV
Tom Granot

3 Answers

3

How about this?

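// Collect libxml warnings about malformed HTML instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTMLFile(...); // path or URL of the page
libxml_clear_errors();

// Select the cells of the nested table that contains the summary.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('/html/body/tr/td/table/tr/td/table/tr/td');
foreach ($nodes as $node) {
    echo $node->nodeValue, "\n";
}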
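
If only the summary sentence is wanted, the query can be narrowed to text nodes. The following is a minimal sketch, not part of the original answer: the helper name extract_row_text and the shortened sample markup are assumptions made for illustration, and the row id is assumed to be a plain identifier without quotes.

```php
<?php
// Hypothetical helper: returns the first non-blank text node that is a
// direct child of a <td> inside the row with the given id, or null.
function extract_row_text(string $html, string $rowId): ?string
{
    libxml_use_internal_errors(true); // tolerate sloppy markup
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // text() matches only text sitting directly inside a <td>, so the
    // contents of <script> and <b> elements are skipped.
    $query = sprintf('(//tr[@id="%s"]//td/text()[normalize-space()])[1]', $rowId);
    $nodes = $xpath->query($query);

    return $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
}

// Shortened stand-in for the markup in the question.
$sampleHtml = '<html><body><table><tr id="ieconn2"><td>'
    . '<script type="text/javascript">google_ad_width = 300;</script>'
    . '<br>When Marshall and Lily fear they will never get pregnant, they see a specialist.'
    . '<br><br><b>Source: </b>CBS</td></tr></table></body></html>';

echo extract_row_text($sampleHtml, 'ieconn2'), "\n";
```

Selecting text nodes rather than whole cells avoids echoing the inline ad script, which nodeValue on the <td> would otherwise include.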
Gordon
ajreal
2

If I understand correctly what you want to do, you could do the following:

$url = "http://url.tld";
$str = file_get_contents($url);

From there on, just use PHP's string functions to cut away the parts you do not want (possibly writing a regular expression to speed up the process).

If the above method does not work (for example, because allow_url_fopen is disabled), you can try a more complex function like this:

function get_url_contents($url) {
    $timeout = 5; // seconds to wait while connecting
    $crl = curl_init();
    curl_setopt($crl, CURLOPT_URL, $url);
    curl_setopt($crl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($crl, CURLOPT_CONNECTTIMEOUT, $timeout);
    $ret = curl_exec($crl); // false on failure
    curl_close($crl);
    return $ret;
}
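
Once the page is in a string, the "cut away" step might look like the sketch below. The helper name cut_between and the two marker strings are assumptions chosen to match the sample HTML in the question. This kind of matching depends on the exact markup and breaks as soon as the page layout changes.

```php
<?php
// Hypothetical helper: returns the trimmed text between two literal
// markers, or null if either marker is missing.
function cut_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    // Look for the end marker only after the start marker.
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }

    return trim(substr($haystack, $from, $to - $from));
}

// Shortened stand-in for the fetched page.
$page = '</script><br>Meanwhile, Robin starts her new job.<br><br><b>Source: </b>CBS';

echo cut_between($page, '</script><br>', '<br><br><b>Source:'), "\n";
```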
Cadoc
  • @Cadoc - thanks! I'm not familiar with how to strip specific tags out of the string. I can strip all the tags from the output, but that gives me the entire document rather than only the part I want. – Tom Granot Jan 07 '11 at 00:01
  • +0 Wrong tools for the job. This will make the OP's life more complicated than necessary – Gordon Jan 07 '11 at 08:47
  • @Gordon - He gave an answer. Did you? – Tom Granot Jan 07 '11 at 09:09
  • @WideBlade What does it matter if I did? – Gordon Jan 07 '11 at 09:22
  • @Gordon - Leave the critique aside, and let's get down to business. If you don't have anything concrete to help with, keep it to yourself. – Tom Granot Jan 08 '11 at 02:44
  • @WideBlade I think you misunderstood the concept of this site. We are allowed, even encouraged, to critique solutions by voting and commenting. – Gordon Jan 08 '11 at 09:48
  • @Gordon - Actually I do understand this site. But what you did was provide a vague critique without even solving the problem. That's annoying. – Tom Granot Jan 08 '11 at 19:53
  • @WideBlade If you feel my comment on this answer is vague, why not just ask me to explain it, instead of calling me annoying and unhelpful? Open your eyes. I gave a link to a list of DOM parsers right below your question. I also suggested various improvements to the accepted answer and even edited it, because it contained subpar code. The only reason I did not provide an answer of my own is that your question is a duplicate that gets asked daily and to which you could have easily found an answer if you had just bothered to use the search function. But you didn't. And that is annoying. – Gordon Jan 08 '11 at 20:23
  • @Gordon - Emm... Where did you provide a link to a list of DOM parsers? – Tom Granot Jan 08 '11 at 20:40
  • @WideBlade Yesterday. Last comment below your question. Click add/show 6 more comments. Also, both edits ajreal made to his answer were based on comments I made (and removed after ajreal applied the edits) before you came back checking for answers. – Gordon Jan 08 '11 at 20:50
  • @WideBlade No problem. And to explain why I think the above solution will make your life more complicated: strings know nothing about a Document Object Model or HTML elements. Neither do regular expressions. You can teach them, but this is extremely complicated if you want to do it right. So why do it if there are parsers readily available that can do so out of the box? You also won't need cURL, because at least DOM can be supplied a custom stream context that can do most of what cURL can be configured to do. – Gordon Jan 08 '11 at 21:12
1

Use QueryPath (http://querypath.org/). It's like jQuery for PHP.

egis