Indexing with preg_match

Question

I am trying to index a website, but my preg_match_all call returns an empty array.

This is what I have so far:

$content = get_content("www.something.com");
preg_match_all('#<span class="box_cod">Cod: ([0-9\.]*)</span><span class="box_pret">PRET: (.*)</span>#',$content,$Produs);

Here get_content is a cURL-based function that retrieves the page.

Thank you!

It's very difficult to parse HTML with regular expressions. Have you considered using a real DOM parser? — Álvaro González, Mar 04 '13 at 12:38
Excellent... another opportunity to tell someone about [Tony The Pony](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)! I'll never tire of this. — SDC, Mar 04 '13 at 12:49

Max Muller · Accepted Answer · 2013-03-04T12:53:57.310

3

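You can use PHP Simple HTML DOM Parser to fetch the site content and parse it into a variable.
First include the library's PHP file, then load the page:

// Path is an assumption: adjust it to where simple_html_dom.php lives
include 'simple_html_dom.php';
// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');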

It's easier than parsing HTML with regular expressions.
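To make the DOM approach concrete, here is a minimal sketch that extracts the Cod and PRET values. It uses PHP's bundled DOMDocument and DOMXPath classes rather than Simple HTML DOM, so it runs without extra files. The sample markup in $content is hypothetical: the real page structure is not shown in the question, so the class names are taken from the asker's pattern and everything else is an assumption.

```php
<?php
// Hypothetical sample markup; the real page structure is an assumption
// based on the class names in the question's pattern.
$content = '<div><span class="box_cod">Cod: 12.345</span>'
         . '<span class="box_pret">PRET: 99 RON</span></div>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$Produs = [];
foreach ($xpath->query('//span[@class="box_cod"]') as $codNode) {
    // Assumes the price span follows the code span under the same parent.
    $pretNode = $xpath
        ->query('following-sibling::span[@class="box_pret"][1]', $codNode)
        ->item(0);
    $Produs[] = [
        'cod'  => trim(str_replace('Cod:', '', $codNode->textContent)),
        'pret' => $pretNode ? trim(str_replace('PRET:', '', $pretNode->textContent)) : null,
    ];
}

print_r($Produs);
```

In the asker's setup, $content would come from get_content() instead of the inline string.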

edited Mar 04 '13 at 12:53

answered Mar 04 '13 at 12:42


I never used simple html dom parser before and I am not familiar with it, that is why I tried using regular expressions. – Cristian Badea Mar 04 '13 at 12:45
OK, but I need to do it with preg_match_all for now. If I leave only the first part with Cod, the code works perfectly; if I add PRET, it doesn't work. – Cristian Badea Mar 04 '13 at 12:51
Thank you for nothing, I made it work with 2 preg_match calls. – Cristian Badea Mar 04 '13 at 12:57
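On the preg_match_all problem raised in the comments: the page's HTML is not shown, so the cause can only be guessed. One plausible reason is whitespace or a line break between the two spans, which the original pattern does not allow, and the greedy (.*) can also run past the first closing tag. The sketch below, with hypothetical sample markup, tolerates both in a single pattern.

```php
<?php
// Hypothetical sample markup with a line break between the spans (an assumption).
$content = "<span class=\"box_cod\">Cod: 12.345</span>\n"
         . "  <span class=\"box_pret\">PRET: 99 RON</span>";

// \s* tolerates whitespace between the spans; (.*?) stops at the first </span>;
// the s modifier lets . match newlines inside the price text.
$pattern = '#<span class="box_cod">Cod: ([0-9.]*)</span>\s*'
         . '<span class="box_pret">PRET: (.*?)</span>#s';

// PREG_SET_ORDER groups the captures per product: [full match, cod, pret].
preg_match_all($pattern, $content, $Produs, PREG_SET_ORDER);

foreach ($Produs as $match) {
    echo "Cod: {$match[1]}, PRET: {$match[2]}\n";
}
```

If the real markup has other tags or attributes between the two spans, this pattern will still fail, which is the usual argument for a DOM parser.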
