I have the regex string
"product_default_shipping_cost:\['(.*)'"

and the input string

"product_sale_price:['19.99'], product_default_shipping_cost:['1.99'], product_type:['Newegg']"

and I want to get only 1.99. My code:

Pattern pattern = Pattern.compile(regex_string);
Matcher m = pattern.matcher(html);

while (m.find()) {
   for (int i = 1; i <= m.groupCount(); i++) {
      Log.w(TAG,m.group(i));
   }
}

But I get 1.99'], product_type:['Newegg'] instead. Strangely, this regular expression works perfectly in Python and Swift but not in Java. I cannot change the regex. What could be the issue, and how can I fix it?

P.S. I really can't change this regex; it is supplied dynamically.

Tany
  • What do you mean you can't change the regex? – Tunaki Oct 23 '15 at 18:54
  • The regex string is supplied dynamically. – Tany Oct 23 '15 at 19:00
  • "*Strange that it regular expression works perfectly in python and SWIFT*" Are you sure? Based on this test -> [(click)](http://pythex.org/?regex=product_default_shipping_cost%3A%5C%5B%27(.*)%27&test_string=product_sale_price%3A%5B%2719.99%27%5D%2C%20product_default_shipping_cost%3A%5B%271.99%27%5D%2C%20product_type%3A%5B%27Newegg%27%5D&ignorecase=0&multiline=0&dotall=0&verbose=0) it doesn't seem to work as you described. Anyway, if you can't change your regex, you can't fix your problem, because `.*` is greedy by default. – Pshemo Oct 23 '15 at 19:00
  • I'd consider writing a trivial parser instead of using regex. – Dave Newton Oct 27 '15 at 16:39

3 Answers

Try changing it to this:

product_default_shipping_cost:\['(.*?)'

.*? is lazy: it matches as few characters as possible, so the group stops at the first ' that follows it.
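
To see the difference, here is a minimal Java sketch that runs both quantifiers against the string from the question. The class and helper names (`GreedyVsLazy`, `firstGroup`) are made up for illustration, and the backslashes are doubled only because the pattern is written as a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "product_sale_price:['19.99'], "
                + "product_default_shipping_cost:['1.99'], product_type:['Newegg']";

        // Greedy: .* runs to the end of the input, then backtracks to the LAST quote.
        System.out.println(firstGroup("product_default_shipping_cost:\\['(.*)'", input));

        // Lazy: .*? grows one character at a time and stops at the FIRST quote.
        System.out.println(firstGroup("product_default_shipping_cost:\\['(.*?)'", input));
    }
}
```

The greedy call captures everything up to the last quote in the input, while the lazy call captures only 1.99.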

d0nut
  • I can't change it; this string is supplied dynamically. – Tany Oct 23 '15 at 18:55
  • Then what do you expect us to do for you? The regex is wrong. Your best bet is to fake it 'til you make it and split the string on `,` – d0nut Oct 23 '15 at 19:00
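
If the pattern really has to stay as it is, the split idea from the comment above could be sketched roughly like this. It assumes that fields are separated by commas and that no value contains a comma itself; the class and method names (`SplitThenMatch`, `firstGroupPerField`) are hypothetical. Because each field holds only one quoted value, the greedy `.*` has nothing extra to swallow.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitThenMatch {
    // Applies the unchanged (greedy) regex to each comma-separated field in turn
    // and returns group 1 of the first field that matches, or null if none does.
    // Assumption: values never contain a comma.
    static String firstGroupPerField(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        for (String field : input.split(",")) {
            Matcher m = pattern.matcher(field);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String regex = "product_default_shipping_cost:\\['(.*)'"; // the original, greedy pattern
        String input = "product_sale_price:['19.99'], "
                + "product_default_shipping_cost:['1.99'], product_type:['Newegg']";
        System.out.println(firstGroupPerField(regex, input));
    }
}
```

This is a workaround, not a fix: it breaks as soon as a value contains a comma, so changing the quantifier is still the more reliable option where that is possible.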

.* matches as many characters as possible (it is "greedy").

You can either use the non-greedy .*?, or restrict what can be matched: [^']*.
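
As a rough illustration of the second option, the sketch below uses the negated character class in Java. The names (`NegatedClassExample`, `shippingCost`) are invented for this example; only the pattern itself comes from the suggestion above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassExample {
    // [^']* matches any run of characters except a quote, so the group
    // can never extend past the first closing quote.
    private static final Pattern SHIPPING =
            Pattern.compile("product_default_shipping_cost:\\['([^']*)'");

    // Returns the captured shipping cost, or null if the field is absent.
    static String shippingCost(String input) {
        Matcher m = SHIPPING.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "product_sale_price:['19.99'], "
                + "product_default_shipping_cost:['1.99'], product_type:['Newegg']";
        System.out.println(shippingCost(input));
    }
}
```

Unlike .*?, the negated class does not depend on backtracking, since the quote is excluded from the group by construction.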

Jiri Tousek

You're using the greedy quantifier .*. Try a non-greedy one, also known as lazy, by appending a ?, i.e.:

product_default_shipping_cost:\['(.*?)'\]

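// Backslashes are doubled because the pattern is a Java string literal.
Pattern pattern = Pattern.compile("product_default_shipping_cost:\\['(.*?)'\\]");
Matcher m = pattern.matcher(html);

while (m.find()) {
   // groupCount() is the number of capturing groups in the pattern (1 here).
   for (int i = 1; i <= m.groupCount(); i++) {
      Log.w(TAG, m.group(i));
   }
}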

DEMO

https://regex101.com/r/aF1hI6/1


Nice explanation about greedy and lazy regex

Pedro Lobito