I'm using preg_match_all to get the quoted users from a post on a forum like so:

    preg_match_all('/quote author=(.*) link=/', $post, $quotedUsers);

The $post string will typically be something like:

    [quote author=John link=topic=1234.msg123456#msg123456 date=1234567890]Lorem ipsum dolor sit amet[/quote]
    Lorem ipsum dolor sit amet consectetur elit...

The preg_match_all function works fine when only one user is quoted, and returns something like:

    Array
    (
        [0] => Array
            (
                [0] => quote author=John link=
            )

        [1] => Array
            (
                [0] => John
            )

    )

My code loops through $quotedUsers[1] to get the usernames, and I thought everything was fine. However, when two users are quoted, the result looks more like this:

    Array
    (
        [0] => Array
            (
                [0] => quote author=Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]

    [quote author=John link=
            )

        [1] => Array
            (
                [0] => Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]

    [quote author=John
            )

    )

What is going on and how do I fix this? I thought preg_match_all would just put all of the usernames into the $quotedUsers[1] array.

katoth
  • That's why you should [use a BBCode parser](http://stackoverflow.com/questions/488963/best-way-to-parse-bbcode "Best way to parse BBCode"), as suggested in [your previous question](http://stackoverflow.com/questions/3967228/php-and-regex-problem "PHP and Regex Problem"). [Regular expressions cannot parse BBCode](http://kore-nordmann.de/blog/do_NOT_parse_using_regexp.html "Kore Nordmanns Blog: Do not parse HTML Using Regular Expressions") – Gordon Oct 19 '10 at 13:12

3 Answers

In your regex, you have to make the `*` non-greedy:

    '/quote author=(.*?) link=/'

Just add a `?` after the `*`.
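
As a rough sketch of that fix in use, here is the lazy pattern run against a post with two quotes. The `$post` string is made up for illustration and only modelled on the example in the question:

```php
<?php
// Hypothetical sample post with two quotes, modelled on the question's example.
$post = "[quote author=Bob link=topic=1234.msg123456#msg13456 date=1234567890]Lorem ipsum dolor sit amet[/quote]\n\n"
      . "[quote author=John link=topic=1234.msg123456#msg123456 date=1234567890]Lorem ipsum dolor sit amet[/quote]\n"
      . "Lorem ipsum dolor sit amet consectetur elit...";

// (.*?) is lazy: after each "quote author=" it stops at the first " link=".
preg_match_all('/quote author=(.*?) link=/', $post, $quotedUsers);

// $quotedUsers[1] holds one captured username per quote.
print_r($quotedUsers[1]);
```

With this input, `$quotedUsers[1]` should contain `Bob` and `John` as separate entries, which is what the loop in the question expects.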

Viper_Sb

Make the * non-greedy:

    /quote author=(.*?) link=/

With `.*?`, the group matches as few characters as possible, so it stops at the first ` link=` that follows. With a plain `.*`, it matches as many characters as possible, meaning it runs up to the last ` link=` it can find.

More about this at Repetition with Star and Plus
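
To make the difference concrete, here is a small side-by-side comparison of the two quantifiers. The shortened `$post` string is an assumption for illustration; both quotes are on one line because `.` does not match a newline unless the `s` modifier is set:

```php
<?php
// Hypothetical one-line input with two quotes.
$post = '[quote author=Bob link=a][/quote][quote author=John link=b][/quote]';

// Greedy: .* runs to the last " link=", swallowing both quotes in one match.
preg_match_all('/quote author=(.*) link=/', $post, $greedy);

// Lazy: .*? stops at the first " link=", giving one match per quote.
preg_match_all('/quote author=(.*?) link=/', $post, $lazy);

var_dump(count($greedy[1])); // int(1)
var_dump(count($lazy[1]));   // int(2)
print_r($lazy[1]);
```

The single greedy capture is `Bob link=a][/quote][quote author=John`, which mirrors the oversized match shown in the question.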

Felix Kling

The problem is that your current regex, with the `.*`, is greedy and grabs too much content.

    preg_match_all('/\[quote author=([^\]]+) link=/', $post, $quotedUsers);

That should do it.

Amended: Hopefully, usernames will not contain a square bracket...
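
A quick sketch of how the negated character class behaves with usernames that contain spaces and symbols. The usernames and the `$post` string are invented for illustration:

```php
<?php
// Hypothetical post; usernames include spaces, dots and double quotes.
$post = '[quote author=Mary Ann link=topic=1.msg2#msg2 date=1234567890]Hi[/quote]'
      . '[quote author=J.R. "Bob" Dobbs link=topic=1.msg3#msg3 date=1234567890]Hello[/quote]';

// [^\]]+ cannot cross a "]", so each match stays inside its own quote tag;
// it then backtracks until " link=" follows the captured name.
preg_match_all('/\[quote author=([^\]]+) link=/', $post, $quotedUsers);

print_r($quotedUsers[1]);
```

Because the class only excludes `]`, spaces and other symbols in a username are captured as-is; a username containing `]` would still break the match, as noted above.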

Luke Stevenson
  • Won't that stop at spaces? Usernames can have spaces and symbols. – katoth Oct 19 '10 at 13:38
  • That's a detail which would have been handy in the original question... Anyway, my code has been amended. I can say that Viper & Felix's answers are also functional. – Luke Stevenson Oct 19 '10 at 13:43
  • If you want a `]` in a regex character class, the proper format is to place it immediately after the `[` or immediately after the `[^`. You should amend it to `([^]]+)` – stevendesu Oct 19 '10 at 13:44
  • @steven_desu: Testing in RegexBuddy shows that, with or without the backslash, it performs the same. The reason I included it was to escape the square bracket, as I presumed it was a special character. – Luke Stevenson Oct 19 '10 at 13:59