Regex to scrape a price

Question

I am trying to scrape a price, but I am struggling to write a regex that grabs the specific text:

<option value="1">
                        1 


                                (£&nbsp;70)


                    </option>

Prices are pretty much all displayed like the above in the source code, with lots of whitespace. Ideally I would like to grab the 70 from the string.

This is what I have so far:

preg_match("/<option value=\"1\">(.+)<\/option>/siU", $html, $matches);

I half expected this to grab 1(£ 70), but it did not work. Any help?

You need to read the first answer to this infamous SO question: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags — Mark Thomas, Dec 16 '11 at 22:34

score 0 · Answer 1 · answered Jun 27 '11 at 11:35


/<option value=\"1\">.*?\((.*?)\).*?<\/option>/

Also note that . does not match newlines by default, so either test against a string without newlines or add the s pattern modifier.
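A minimal sketch of this approach in PHP, assuming the markup looks exactly like the sample in the question. The variable names and the extra \D* / \d+ parts (which skip the currency symbol and the &nbsp; entity so that only the number is captured) are illustrative additions, not part of the pattern given in this answer:

```php
<?php
// Sample markup copied from the question.
$html = "<option value=\"1\">\n                        1 \n\n\n"
      . "                                (£&nbsp;70)\n\n\n"
      . "                    </option>";

// s modifier: let . match newlines, since the option spans several lines.
// \D* skips the non-digit prefix inside the parentheses ("£&nbsp;").
$pattern = '/<option value="1">.*?\(\D*(\d+(?:\.\d+)?)\).*?<\/option>/s';

$price = null;
if (preg_match($pattern, $html, $matches)) {
    $price = $matches[1]; // the numeric part only
}

echo $price, "\n"; // prints 70
```

Single quotes around the pattern avoid having to escape the double quotes in the attribute.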

You may also want to consider using an XML parser.
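One way the parser suggestion might look, sketched with PHP's bundled DOMDocument and DOMXPath classes. The surrounding select element and its name are hypothetical, since the question only shows the option itself, and the encoding prefix is a common workaround that assumes the page is UTF-8:

```php
<?php
// Hypothetical wrapper markup; only the <option> comes from the question.
$html = '<select name="qty"><option value="1">
                        1

                                (£&nbsp;70)

                    </option></select>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
// The prefix hints to libxml that the input is UTF-8 (assumption).
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();

// Let the parser locate the element instead of matching tags with a regex.
$xpath  = new DOMXPath($doc);
$option = $xpath->query('//option[@value="1"]')->item(0);

// A small regex is still handy for the text inside the element.
$price = null;
if ($option !== null
    && preg_match('/\(\D*(\d+(?:\.\d+)?)\)/', $option->textContent, $m)) {
    $price = $m[1];
}

echo $price, "\n"; // prints 70
```

The parser copes with the whitespace and the entity, so the regex only has to handle the short text inside the parentheses.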

answered Jun 27 '11 at 11:35

duedl0r

score 0 · Accepted Answer · answered Jun 27 '11 at 11:36


Well, it does match. The problem is (probably) that the match contains a bunch of whitespace characters:

string(97) "
                        1 


                                (£&nbsp;70)


                    "

Edit
You can do a little sanitizing:

$matches[1] = preg_replace('/\s+/s', ' ', trim($matches[1]));

Gives:

string(14) "1 (£&nbsp;70)"
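To get from the sanitized string to the bare number, a follow-up step along these lines could work. This is a sketch that assumes UTF-8 input and a whole-number price; the variable names are illustrative:

```php
<?php
// Assumed input: the sanitized capture shown above.
$text = "1 (£&nbsp;70)";

// Turn &nbsp; and other entities into real characters.
$decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Capture the digits inside the parentheses, skipping the currency
// symbol and the non-breaking space in front of them.
$price = null;
if (preg_match('/\(\D*(\d+)\)/', $decoded, $m)) {
    $price = (int) $m[1];
}

var_dump($price); // int(70)
```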

answered Jun 27 '11 at 11:36

jensgram

is this an answer or a comment to my answer? :) – duedl0r Jun 27 '11 at 11:38
@duedl0r This is an answer to the question ... well, sort of. I was trying to show that the regex *did* actually match something, and subsequently tried to make the result match OP's expectations. – jensgram Jun 27 '11 at 11:43
