
I have a text in which the same tag occurs two or more times, and I would like to match only its first occurrence.

Text:

&lt;prod##123456_test_12345##shirt&gt; some more text &lt;prod##123456_test_12345##shirt&gt;

regex:

&lt;prod##(\d*)_(.*?)##(.*?)##(.*?)&gt;

This matches the whole string, but I would like to get only "&lt;prod##123456_test_12345##shirt&gt;" (the first match).

I found this one:

(&lt;)(.*?\w+.*?)(&gt;)

It will match the first string, but I would like to keep my groups for parsing later on.

I've created a test here: http://regexr.com/v1?38pmq

I also tried the approach from "Regular expression to stop at first match", but I don't fully understand how it works.

(it's for PHP)

What I really want is to parse this list:

&lt;prod##12345678##Some text here&gt;

&lt;prod##12345678##Some text here##Extra text&gt;

&lt;prod##12345678##Some text here##Extra text&gt;

&lt;prod##12345678_TEEXT##Some text here&gt;

&lt;prod##12345678_TEEXT##Some text here##Extra text&gt;

&lt;prod##12345678_TEEXT##Some text here##Extra text&gt;

Is it possible to create one regex with groups for this list? 4 different ones would also be cool.

In PHP and output:

$product_reg = array ('/&lt;prod##(\d*)_(.*?)##(.*?)##(.*?)&gt;/',
                      '/&lt;prod##(\d*)_(.*?)##(.*?)&gt;/',
                      '/&lt;prod##(\d*)##(.*?)##(.*?)&gt;/',
                      '/&lt;prod##(\d*)##(.*?)&gt;/');
$product_rep = array ('<a href="domain.com/$1?test=$1&test2=$1_$2&$4">$3</a>',
                      '<a href="domain.com/$1?test=$1&test2=$1_$2">$3</a>',
                      '<a href="domain.com/$1?test=$3">$2</a>',
                      '<a href="domain.com/$1">$2</a>');
$string = preg_replace($product_reg, $product_rep, $string);
Morten OC

2 Answers


It looks to me like you have an extra (.*?)## in your pattern. Try this:

&lt;prod##(\d*)_(.*?)##(.*?)&gt;
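
To see the difference, here is a minimal sketch that runs both patterns against the sample text from the question. It assumes the text contains the literal entities &lt; and &gt;, as the patterns here do; the variable names are only for illustration. With the extra (.*?)## the regex needs a third ## that the first tag does not have, so the lazy groups stretch into the second tag:

```php
<?php
$text = '&lt;prod##123456_test_12345##shirt&gt; some more text &lt;prod##123456_test_12345##shirt&gt;';

// Original pattern: four groups, so three "##" separators are required.
$original = '/&lt;prod##(\d*)_(.*?)##(.*?)##(.*?)&gt;/';
// Corrected pattern: three groups, two "##" separators.
$fixed = '/&lt;prod##(\d*)_(.*?)##(.*?)&gt;/';

// The original only finds its third "##" inside the second tag,
// so the match spans both tags.
preg_match($original, $text, $whole);

// preg_match() stops after the first match; groups 1-3 hold the parsed parts.
preg_match($fixed, $text, $first);

// preg_match_all() finds each tag separately.
$count = preg_match_all($fixed, $text, $all, PREG_SET_ORDER);

var_dump($whole[0] === $text); // the original pattern swallowed the whole text
var_dump($first);              // only the first tag and its three groups
var_dump($count);              // number of tags found by the corrected pattern
```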

For the list of strings in your edit, you could do this:

&lt;prod##(\d*)(_(.*?))?##(.*?)&gt;

For example:

# Using the first string in your list:

preg_match("/&lt;prod##(\d*)(_(.*?))?##(.*?)&gt;/", "&lt;prod##12345678##Some text here&gt;", $matches);

var_dump($matches);

# array(5) {
#   [0] =>
#   string(38) "&lt;prod##12345678##Some text here&gt;"
#   [1] =>
#   string(8) "12345678"
#   [2] =>
#   string(0) ""
#   [3] =>
#   string(0) ""
#   [4] =>
#   string(14) "Some text here"
# }

And:

# Using the fifth string in your list:

preg_match("/&lt;prod##(\d*)(_(.*?))?##(.*?)&gt;/", "&lt;prod##12345678_TEEXT##Some text here##Extra text&gt;", $matches);

var_dump($matches);

# array(5) {
#   [0] =>
#   string(56) "&lt;prod##12345678_TEEXT##Some text here##Extra text&gt;"
#   [1] =>
#   string(8) "12345678"
#   [2] =>
#   string(6) "_TEEXT"
#   [3] =>
#   string(5) "TEEXT"
#   [4] =>
#   string(26) "Some text here##Extra text"
# }
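
Note that in the output above the last group still contains "Some text here##Extra text". If you would rather have the extra text in its own group, one option is to add an optional trailing group. This is only a sketch: parse_prod_tag and the array keys are names I made up, and it assumes the individual fields never contain "##" or "&gt;" themselves.

```php
<?php
// Hypothetical helper: parse the first escaped prod tag in $text into named fields.
// Returns null when no tag is found.
function parse_prod_tag(string $text): ?array
{
    // (\d+)            numeric id
    // (?:_(.*?))?      optional "_SUFFIX" after the id
    // (.*?)            label
    // (?:##(.*?))?     optional "##Extra text" at the end
    $pattern = '/&lt;prod##(\d+)(?:_(.*?))?##(.*?)(?:##(.*?))?&gt;/';

    if (preg_match($pattern, $text, $m) !== 1) {
        return null;
    }

    // Trailing groups that did not take part in the match are missing
    // from $m, so fall back to an empty string.
    return [
        'id'     => $m[1],
        'suffix' => $m[2] ?? '',
        'label'  => $m[3],
        'extra'  => $m[4] ?? '',
    ];
}

var_dump(parse_prod_tag('&lt;prod##12345678##Some text here&gt;'));
var_dump(parse_prod_tag('&lt;prod##12345678_TEEXT##Some text here##Extra text&gt;'));
```

The same pattern should also work with preg_replace_callback if you want to build the links in one pass instead of running four separate patterns.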
Derek Kurth
  • Thanks! Doh I see that now! My question is actually a bit more complex after I see this one.. Will update my question in a sec. – Morten OC Oct 03 '14 at 11:47

You have a superfluous group in your regex. Try:

&lt;prod##(\d*)_(.*?)##(.*?)&gt;
Toto