-1

How can I get the "85 mph" from this HTML code with PHP + regex? I couldn't come up with the right regex. This is the code: http://pastebin.com/ffRH9K9Q

    <td align="left">Los Angeles</td>
</tr>
<tr>
    <td align="left">Wind Speed:</td>
    <td align="left">85 mph</td>
</tr>
<tr>
    <td align="left">Snow Load:</td>
    <td align="left">0 psf</td>

(simplified example)

hakre
SNaRe
  • 3
    I think this is relevant: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – tskuzzy Jun 19 '12 at 21:25
  • It doesn't give any results as array. I will share everything that I've tried. – SNaRe Jun 19 '12 at 21:26
  • 1
    Your example (malformed) markup is a key reason why you can't reliably parse HTML with a Regex. – Jason McCreary Jun 19 '12 at 21:26
  • 3
    @tskuzzy The zalgo thing is funny, but totally unhelpful to an inexperienced user who may never have heard of DOM parsing and doesn't understand why they're getting made fun of. That link really needs a companion link to a primer on better parser options. – octern Jun 19 '12 at 21:27
  • possible duplicate of [How to extract img src, title and alt from html using php?](http://stackoverflow.com/questions/138313/how-to-extract-img-src-title-and-alt-from-html-using-php) – Gordon Jun 19 '12 at 21:29
  • 1
    @octern that would probably be http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662 – Gordon Jun 19 '12 at 21:30

3 Answers

2

You've already heard about not using regex for the job, so I won't talk about that. Let's try something here. It is perhaps not the ideal solution, but it could work for you.

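    <?php
    $data = 'your table'; // the HTML to search
    // [^<]* keeps the match inside a single cell; with (.*) and the "s"
    // modifier the match would start at the first <td> in the table.
    preg_match('|<td align="left">([^<]*)mph</td>|i', $data, $result);
    print_r($result); // the captured value should be in $result[1]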

You may need to trim the result or take whitespace into account in the regex.
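As a rough sketch of what taking whitespace into account might look like: the pattern below anchors on the "Wind Speed:" label from the question and allows optional whitespace between the tags. The variable names `$pattern` and `$windSpeed` are made up for illustration, and this still assumes the markup keeps the label cell directly before the value cell.

```php
<?php
// Sample markup taken from the question.
$data = '<tr>
    <td align="left">Wind Speed:</td>
    <td align="left">85 mph</td>
</tr>';

// \s* tolerates whitespace between tags and around the label;
// the capture group stops at the next "<", so it stays inside one cell.
$pattern = '|<td[^>]*>\s*Wind Speed:\s*</td>\s*<td[^>]*>([^<]*)</td>|i';

$windSpeed = null;
if (preg_match($pattern, $data, $result)) {
    // trim() removes any whitespace left around the value.
    $windSpeed = trim($result[1]);
    echo $windSpeed, "\n";
}
```

Anchoring on the label rather than on "mph" means the unit can change without breaking the match, but any change to the table layout can still break it.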

Robert
0

The first comment that links to the post about NOT PARSING HTML WITH REGEX is important. That said, try something like DOMDocument::loadHTML instead. That should get you started traversing the DOM with PHP.
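A minimal sketch of what that traversal could look like, assuming the snippet from the question is wrapped in a `<table>` so it is well formed. The variable names are illustrative only; `getElementsByTagName`, `item` and `nodeValue` are part of PHP's DOM extension.

```php
<?php
// The question's markup, wrapped in <table> (an assumption) to make it well formed.
$html = '<table><tr>
    <td align="left">Wind Speed:</td>
    <td align="left">85 mph</td>
</tr></table>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);

$windSpeed = null;
$cells = $dom->getElementsByTagName('td');
// Walk the cells; the value sits in the cell right after the label cell.
for ($i = 0; $i < $cells->length - 1; $i++) {
    if (trim($cells->item($i)->nodeValue) === 'Wind Speed:') {
        $windSpeed = trim($cells->item($i + 1)->nodeValue);
        break;
    }
}
echo $windSpeed, "\n";
```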

DorkRawk
0

To expand on DorkRawk's suggestion (in the hope of providing a relatively succinct answer that isn't overwhelming for a beginner), try this:

    <?php

    $yourhtml = '<td align="left">Los Angeles</td>
    </tr>
    <tr>
        <td align="left">Wind Speed:</td>
        <td align="left">85 mph</td>
    </tr>
    <tr>
        <td align="left">Snow Load:</td>
        <td align="left">0 psf</td>';

    // The fragment is malformed, so collect libxml warnings instead of printing them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($yourhtml);

    // Select the cell(s) that follow the cell whose text is exactly "Wind Speed:".
    $xpath = new DOMXPath($dom);
    $matches = $xpath->query('//td[.="Wind Speed:"]/following-sibling::td');

    foreach ($matches as $match) {
        echo $match->nodeValue . "\n\n";
    }

lucideer