Once again I'm stuck on regular expressions. I can't find any good material for learning the more advanced usage.

I'm trying to convert [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] into a call to $cache->image_tag("$4", $1, $2, "$3").

Everything works great if all the [image] parameters are there, but I need it to match even if something is missing, for example [image width="740"]51lca7dn56.jpg[/image].

Current code is:

$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\" parameters=\"(.*?)\"\](.*?)\[/image\]#e', '$cache->image_tag("$4", $1, $2, "$3")', $text);

Regular expressions are the one thing that always gets me stuck, so if anybody could also point me to a good resource so that I can handle these kinds of issues myself, it would be much appreciated.

Here is a naive version of what I'm trying to do:

// match only [image]
$text = preg_replace('#\[image\](.*?)\[/image\]#si', '$cache->image_tag("$1", 0, 0, "")', $text);
// match only width
$text = preg_replace('#\[image width=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$2", $1, 0, "")', $text);
// match only width and height
$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$3", $1, $2, "")', $text);
// match width, height and parameters together
$text = preg_replace('#\[image width=\"(.*?)\" height=\"(.*?)\" parameters=\"(.*?)\"\](.*?)\[/image\]#si', '$cache->image_tag("$4", $1, $2, $3)', $text);

(This code doesn't actually work as expected, but it should make my point clearer.) Basically, I hope to fold all of this horrible mess into a single regex call.

Final code tested and working based on Ωmega's answer:

// Match: [image width="740" height="249" parameters="bw"]51lca7dn56.jpg[/image]
$text = preg_replace('#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]#si', '$cache->image_tag("$4", $1, $2, "$3")', $text); // the modifiers here are #si to make debugging easier; in the real code it is #e

However, if width or height is missing, the corresponding group comes back as an empty string rather than NULL. So I adopted drew's idea of using preg_replace_callback():

$text = preg_replace_callback('#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]#', create_function(
'$matches',
'global $cache; return $cache->image_tag($matches[4], ($matches[1] ? $matches[1] : 0), ($matches[2] ? $matches[2] : 0), $matches[3]);'), $text);
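
A note for newer PHP versions: the `e` modifier was removed in PHP 7.0 and `create_function()` in PHP 8.0, so the callback above would need to become an anonymous function. Below is a minimal sketch of that, not a drop-in replacement. `StubCache` and `render_image_tags()` are hypothetical names made up for illustration, since the real `$cache` class is not shown in this thread.

```php
<?php
// Hypothetical stub: stands in for the real $cache object, which is not shown here.
class StubCache
{
    public function image_tag($file, $width, $height, $parameters)
    {
        return sprintf('IMG(%s, %d, %d, "%s")', $file, $width, $height, $parameters);
    }
}

// Hypothetical helper wrapping the same regex in a closure-based callback.
function render_image_tags($text, $cache)
{
    $pattern = '#\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]#i';

    return preg_replace_callback($pattern, function ($matches) use ($cache) {
        // Groups that did not take part in the match arrive as '',
        // and casting '' to int gives 0, which is the wanted default.
        return $cache->image_tag(
            $matches[4],
            (int) $matches[1],
            (int) $matches[2],
            $matches[3]
        );
    }, $text);
}

echo render_image_tags('before [image height="249"]51lca7dn56.jpg[/image] after', new StubCache()), "\n";
```

Passing `$cache` in through `use` also avoids the `global $cache` statement that the string-based callback needed.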
Kalle H. Väravas
  • See also [Open source RegexBuddy alternatives](http://stackoverflow.com/questions/89718/is-there) and [Online regex testing](http://stackoverflow.com/questions/32282/regex-testing) for some helpful tools, or [RegExp.info](http://regular-expressions.info/) for a nicer tutorial. – mario Jul 27 '12 at 00:07
  • @Mario, thank you so much for the useful links! I'm not surprised, if I can answer my own question soon :) – Kalle H. Väravas Jul 27 '12 at 00:13
  • Your question is confusing - do you want to parse `[image]` or `[uploads]`..? – Ωmega Jul 27 '12 at 00:16
  • @Ωmega, sorry, while testing what works and what doesn't I had to undo a million times, so the code got mixed up. I'm trying to parse [image], specifically the parameters that may or may not be there. – Kalle H. Väravas Jul 27 '12 at 00:20

2 Answers

Maybe try a regex like this instead, which grabs any extra parameters in the image tag as a single string. This way, the parameters can be in any order, with any combination of included and omitted parameters:

$string = 'this is some code and it has bbcode in it like [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] for example.';

if (preg_match('/\[image([^\]]*)\](.*?)\[\/image\]/i', $string, $match)) {
    var_dump($match);
}

Resulting match:

array(3) {
  [0]=>
  string(68) "[image width="740" height="249" parameters=""]51lca7dn56.jpg[/image]"
  [1]=>
  string(39) " width="740" height="249" parameters="""
  [2]=>
  string(14) "51lca7dn56.jpg"
}

So you can then examine $match[1] and parse out the parameters. You will probably need preg_replace_callback so that this parsing logic can live inside the callback.
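
As a rough sketch of that idea (one possible way to do it, not tested against the asker's real code): the callback below runs a second, simpler regex over the captured attribute string and fills in defaults for anything missing. `FakeCache` and `replace_image_tags()` are hypothetical names used only for this illustration.

```php
<?php
// Hypothetical stand-in for the asker's $cache object; the real image_tag() is not shown.
class FakeCache
{
    public function image_tag($file, $width, $height, $parameters)
    {
        return sprintf('IMG(%s, %d, %d, "%s")', $file, $width, $height, $parameters);
    }
}

// Hypothetical helper: replaces every [image ...]file[/image] tag in $text.
function replace_image_tags($text, $cache)
{
    return preg_replace_callback(
        '/\[image([^\]]*)\](.*?)\[\/image\]/i',
        function ($match) use ($cache) {
            // Defaults used when an attribute is absent.
            $attrs = array('width' => 0, 'height' => 0, 'parameters' => '');

            // Collect every name="value" pair from the attribute string, in any order.
            preg_match_all('/\b(\w+)="([^"]*)"/', $match[1], $pairs, PREG_SET_ORDER);
            foreach ($pairs as $pair) {
                $name = strtolower($pair[1]);
                if (array_key_exists($name, $attrs)) {
                    $attrs[$name] = $pair[2];
                }
            }

            return $cache->image_tag(
                $match[2],
                (int) $attrs['width'],
                (int) $attrs['height'],
                $attrs['parameters']
            );
        },
        $text
    );
}

echo replace_image_tags('[image width="740"]51lca7dn56.jpg[/image]', new FakeCache()), "\n";
```

Unknown attribute names are simply ignored here; whether that is the right behaviour depends on how strict the BBCode should be.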

Hope that helps.

drew010
  • Drew, thank you very much for the answer and help, I accepted Ωmega's answer because of the simplified regex. Still, I will check the rest of your answers if I see something I can learn from, if you understand my drift. Thank you again mate! – Kalle H. Väravas Jul 27 '12 at 01:00
  • @KalleH.Väravas No worries his is better anyway since it doesn't require an extra step to parse the parameters like mine does since his regex matches in any order. Glad to help though. – drew010 Jul 27 '12 at 01:02

I would suggest using this regex:

\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]

Edit:

$string = 'this is some code and it has bbcode in it like [image width="740" height="249" parameters=""]51lca7dn56.jpg[/image] for example and [image parameters="" height="123" width="456"]12345.jpg[/image].';

if (preg_match_all('/\[image\b(?=(?:[^\]]*\bwidth="(\d+)"|))(?=(?:[^\]]*\bheight="(\d+)"|))(?=(?:[^\]]*\bparameters="([^"]+)"|))[^\]]*\]([^\[]*)\[\/image\]/i', $string, $match) > 0) {
    var_dump($match);
}

Output:

array(5) {
  [0]=>
  array(2) {
    [0]=>
    string(68) "[image width="740" height="249" parameters=""]51lca7dn56.jpg[/image]"
    [1]=>
    string(63) "[image parameters="" height="123" width="456"]12345.jpg[/image]"
  }
  [1]=>
  array(2) {
    [0]=>
    string(3) "740"
    [1]=>
    string(3) "456"
  }
  [2]=>
  array(2) {
    [0]=>
    string(3) "249"
    [1]=>
    string(3) "123"
  }
  [3]=>
  array(2) {
    [0]=>
    string(0) ""
    [1]=>
    string(0) ""
  }
  [4]=>
  array(2) {
    [0]=>
    string(14) "51lca7dn56.jpg"
    [1]=>
    string(9) "12345.jpg"
  }
}
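
To make the structure of that pattern easier to see, here is a small sketch that builds it from its repeated piece. Each attribute gets a lookahead of the form `(?=(?:[^\]]*\bNAME="(VALUE)"|))`: it scans ahead inside the tag for the attribute, and the empty alternative after `|` lets the lookahead succeed even when the attribute is absent. `optional_attr()` is a hypothetical helper written for this illustration only.

```php
<?php
// Hypothetical helper: builds one "optional attribute" lookahead.
// The trailing empty alternative makes the lookahead succeed when NAME is missing,
// in which case its capture group is left empty.
function optional_attr($name, $valuePattern)
{
    return '(?=(?:[^\]]*\b' . preg_quote($name, '/') . '="(' . $valuePattern . ')"|))';
}

$pattern = '/\[image\b'
    . optional_attr('width', '\d+')         // group 1
    . optional_attr('height', '\d+')        // group 2
    . optional_attr('parameters', '[^"]+')  // group 3
    . '[^\]]*\]'                            // consume the rest of the opening tag
    . '([^\[]*)'                            // group 4: the file name
    . '\[\/image\]/i';

// Only height is given, so groups 1 and 3 should come back as empty strings.
preg_match($pattern, '[image height="249"]a.jpg[/image]', $m);
var_dump($m);
```

Because the lookaheads do not consume any input, all three start from the same position right after `[image`, which is why the attribute order in the tag does not matter.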
Ωmega
  • Looks good my friend. Added test code w/ output to your post. – drew010 Jul 27 '12 at 00:45
  • @KalleH.Väravas - replace `\d+` with `[^"]*` - see http://ideone.com/A081J - answer has been updated – Ωmega Jul 27 '12 at 00:54
  • @Ωmega, yes perfect! Tested with all possible variations, works like charm! I have no idea how a human can compose regexes so to me, you guys are geniuses. Much thanks to both of you!! :) – Kalle H. Väravas Jul 27 '12 at 00:57
  • @Ωmega, sorry for one more question, but you might know the answer, while I'm going to spend hours to get to there. Is there a possibility to send `0` for width or height, if those parameters are not matched? Since `$cache->image_tag("51lca7dn56.jpg", , , "")` generates a syntax error. And I cannot use `($1 ? $1 : 0)` inside **preg_replace()** either :( – Kalle H. Väravas Jul 27 '12 at 01:14
  • @KalleH.Väravas - See http://ideone.com/6g3xc - modify the `regex_replacement` function there - inside this function you can modify all 4 variables and then return the replacement string you want... – Ωmega Jul 27 '12 at 01:33
  • @Ωmega, thanks for that nifty code. While you were typing it, I adopted drew's idea of using preg_replace_callback. I added my working solution to my post as well. Though I wonder which method is faster, using preg_replace_callback or your regex_replacement? – Kalle H. Väravas Jul 27 '12 at 01:40
  • @KalleH.Väravas - Do your own testing and decide for yourself which one fits your needs... Good luck! – Ωmega Jul 27 '12 at 01:42