I found that the syntax of preg_match() and the deprecated ereg() is different.
For example:

I thought that

preg_match('/^<div>(.*)</div>$/', $content);

means the same as

ereg('^<div>(.*)</div>$', $content);

but I was wrong. In preg_match(), the dot doesn't match special characters such as a newline (Enter) the way it does in ereg().

So I started to use this syntax:

preg_match('/^<div>([^<]*)</div>$/', $content);

but that isn't exactly what I need.

Can anyone suggest how to solve this problem without using deprecated functions?

CSᵠ
igor
  • 1
    My suggestion is to use an XML parser to work with HTML code instead of regex. – hsz Jan 10 '13 at 10:43
  • 1
    The reason is that preg is the Perl Compatible Regex library, while ereg is the POSIX-compliant regex library. What exactly does not work? – axel.michel Jan 10 '13 at 10:44
  • 3
    You use / as the delimiter, so you have to escape every / in the pattern with \. Ex: `/^<div>(.*)<\/div>$/` – Fabien Sa Jan 10 '13 at 10:44
  • You can use something other than / as the delimiter. Ex: `@^<div>(.*)</div>@` – cleong Jan 10 '13 at 10:48
  • my problem is that ereg('^anything.*anything$', 'anything123412345anything'); returns TRUE..... and preg_match('/^anything.*anything$/', 'anything123412345anything'); returns FALSE; ("" means special character for enter pressing...) i need functionality of ereg make with non deprecated php functions... – igor Jan 10 '13 at 10:54
  • sorry but preg_match('/^anything.*anything$/', 'anything123412345anything'); return true. – Fabien Sa Jan 10 '13 at 11:03
  • it doesn't :)... in my case represents special enter character.. can you understand me? I can't make enter char in this bloody comment window :) – igor Jan 10 '13 at 11:10
  • OK, but the dot should take all characters; maybe the subject spans multiple lines? Try adding the "m" modifier like this: preg_match('/^anything.*anything$/m', 'anything123412345anything'), or try the "s" modifier. – Fabien Sa Jan 10 '13 at 11:25
  • Oh great! "s" modifier is exactly what i need. My fault i didn't check modifiers list before I wrote a question. Thanks Fab Sa for help! – igor Jan 10 '13 at 11:40

1 Answer

For parsing HTML I'd suggest reading this question and choosing a built-in PHP extension.
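
As a rough illustration of the parser route, here is a minimal sketch using the bundled DOM extension. The sample string in $content is a made-up stand-in for the asker's input, and the sketch assumes the extension is enabled in your PHP build.

```php
<?php
// Hypothetical input: a <div> whose text spans two lines.
$content = "<div>first line\nsecond line</div>";

$doc = new DOMDocument();
// loadHTML() wraps the fragment in <html><body> for us;
// @ silences the warnings libxml may emit on sloppy markup.
@$doc->loadHTML($content);

// Take the first <div> in the document, if there is one.
$div = $doc->getElementsByTagName('div')->item(0);

// textContent keeps the newline, no regex modifiers involved.
$inner = $div !== null ? $div->textContent : null;
```

Nested tags inside the <div> are not a problem here, which is the main reason to prefer a parser over a pattern.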

If for some reason you need or want to use RegEx to do it you should know that:

  • preg_match() is a greedy little bugger: (.*) will eat as much of the subject as it can and gives characters back only as far as needed for the rest of the pattern to match. You can invert this with the U modifier [1], or make a single quantifier lazy by appending ?, as in (.*?).

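  • the engine expects to be fed a single line: by default . does not match a newline, and ^ and $ anchor at the start and end of the whole subject. The s modifier makes . match newlines too, and the m modifier makes ^ and $ match at every line boundary [1].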

  • your 'not a < character' ([^<]*) hack does a good job, as it forces the engine to stop at the first < character, but it works only if the <div> doesn't contain other tags inside!

ref: [1] PCRE Pattern Modifiers
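
The points above can be sketched in a few lines. The sample strings are made up for illustration, with "\n" standing in for the Enter character from the question, and # is used as the delimiter so that the / in </div> needs no escaping.

```php
<?php
// Hypothetical sample input: a <div> whose content contains a newline.
$content = "<div>line one\nline two</div>";

// Default: . stops at the newline, so the anchored pattern fails.
$default = preg_match('#^<div>(.*)</div>$#', $content);        // 0

// s modifier: . also matches newlines, as the asker expected from ereg().
$dotall = preg_match('#^<div>(.*)</div>$#s', $content, $m);    // 1
// $m[1] is "line one\nline two"

// m modifier: ^ and $ also match at line boundaries inside the subject.
$multiline = preg_match('#^line two</div>$#m', $content);      // 1

// Greedy versus lazy matching on a subject with two <div> elements.
$two = "<div>a</div><div>b</div>";
preg_match('#<div>(.*)</div>#s', $two, $greedy);    // $greedy[1] is 'a</div><div>b'
preg_match('#<div>(.*?)</div>#s', $two, $lazy);     // $lazy[1] is 'a'
preg_match('#<div>(.*)</div>#sU', $two, $ungreedy); // $ungreedy[1] is 'a'
```

For the original question, the second call is the relevant one: adding s is enough to make the preg_match() pattern behave the way the ereg() one did on multi-line input.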

Community
CSᵠ