preg_match_all("/\(.*?)\</a>/",$this->page["Title"],$matches);

Guys, $this->page["Title"] holds the contents of a page like http://uk.imdb.com/title/tt1285016/. I need to get the list of genres associated with the movie, e.g. [Action | Drama | Sci-Fi].

I don't know any PHP or anything about regular expressions. I have always hated pattern matching.

Help here will be really appreciated. Thx.

Note: This is existing code that I need to modify. It is in PHP.

Umashankar Das
  • related : http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Lix Feb 03 '12 at 09:15
  • I guess this is another 'write my regex' question. First try to extract the exact block you need with the genres. At least try! The regex you provided looks broken, and even if it were correct it would match any string that ends with </a>. – Evert Feb 03 '12 at 09:16
  • I can't post the exact regular expression. The parser is not accepting it. – Umashankar Das Feb 03 '12 at 09:38

3 Answers

This should work better:

preg_match_all('@<a href="/genre/[\w-]+"[^>]*>(.*?)</a>@', $this->page["Title"], $matches);
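
To see what ends up in $matches, here is a minimal sketch run against a made-up HTML fragment. The $html sample and its markup are assumptions for illustration, not the real IMDb page source; in the original code the subject would be $this->page["Title"]. With a single capture group, the link texts should land in $matches[1].

```php
<?php
// Hypothetical fragment; the real IMDb markup may differ.
$html = '<a href="/genre/Action">Action</a> | '
      . '<a href="/genre/Drama">Drama</a> | '
      . '<a href="/genre/Sci-Fi">Sci-Fi</a>';

// Equivalent to the pattern above: "@" is the delimiter, so "/" needs no escaping.
preg_match_all('@<a href="/genre/[\w-]+"[^>]*>(.*?)</a>@', $html, $matches);

// $matches[0] holds the full matches, $matches[1] the text of the first capture group.
$genres = $matches[1];

echo implode(' | ', $genres), "\n"; // prints: Action | Drama | Sci-Fi
```

If the genre links on the real page carry query strings or other characters in the href, the [\w-]+ part would need to be loosened accordingly.
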
Imtiaz

Try this:

preg_match_all('#/genre/[^>]+>([^<]+)<#',$this->page["Title"],$matches);
Rezigned

You should try using one of the many PHP HTML parsers.

In particular, you should look at the documentation for PHP's native DOMDocument class.
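
As a rough sketch of the DOMDocument approach, the snippet below loads a made-up HTML fragment and collects the text of every link whose href starts with /genre/. The $html sample and the assumption that genre links use a /genre/ prefix are illustrative only; in the original code the input would be $this->page["Title"].

```php
<?php
// Hypothetical fragment; the real IMDb markup may differ.
$html = '<div><h4>Genres:</h4> '
      . '<a href="/genre/Action">Action</a> | '
      . '<a href="/genre/Drama">Drama</a> | '
      . '<a href="/genre/Sci-Fi">Sci-Fi</a> '
      . '<a href="/title/tt0000000/">Not a genre</a></div>';

// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$genres = [];

// Select every <a> element whose href attribute starts with "/genre/".
foreach ($xpath->query('//a[starts-with(@href, "/genre/")]') as $link) {
    $genres[] = trim($link->textContent);
}

echo implode(' | ', $genres), "\n"; // prints: Action | Drama | Sci-Fi
```

Unlike the regex versions, this does not depend on attribute order or quoting style inside the tag, only on the href prefix.
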


Finally, as I posted in the comment above, parsing HTML with regular expressions is a touchy subject. Follow the link to learn more :)

RegEx match open tags except XHTML self-contained tags: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Lix