
I am trying to match the following two String objects with a regular expression. I capture the values 3.00, 4.00, and 100.00 from the first String in groups, and that works fine. However, I have a lot of String objects, and not all of them contain the third value, so I want that group to be optional. I added a `?` after the group, yet it makes no difference: any String without the third group portion is still not matched by my regular expression.

How can I make the third group optional? And is there any advice on whether my regular expression could be written better? Thank you!

Regular Expression

cost{"([0-9]+\.[0-9]+)[0-9"{]},+[a-zA-Z0-9]+{"+([0-9]+\.[0-9]+)[0-9A-Za-z"},{[\]\._-]+:price{"([0-9]+\.[0-9]+)?

String 1 which matches since it has third group values

cost{"3.00{},asdjdfhjkf23hawoutcome{"4.00"},79234gh3k2bdfsfgs2323g23jkg23{[]._-,bonus:price{"100.00"}jksdfjksdf222sdcfszfSDAWFD;

String 2 which doesn't match because it doesn't have third group values

cost{"5.00{},asdjdfhjkf23hawoutcome{"36.00"},79234gh3k2bdfsfgs2323g23jkg23{[]._-,jksdfjksdf222sdcfszfSDAWFD;
astrogeek14
JasSy
  • @WiktorStribiżew The Strings are exactly like these, varying only in the figures, and may or may not have group 3. The file is over 1 TB in size, if that matters. – JasSy Jun 22 '16 at 22:54
  • See https://regex101.com/r/cY3oH8/1. The POI is `(?::price{"([0-9]+\.[0-9]+))?`, this non-capturing group is totally optional. – Wiktor Stribiżew Jun 22 '16 at 22:55
  • What, exactly, is "group 3"? – Nic Jun 22 '16 at 22:55
  • @QPaysTaxes The value 100.00 in String 1 – JasSy Jun 22 '16 at 22:57
  • @JasSy Are you sure that's what you want to be optional? In your second string, that's not there, but neither is `:price`. – Nic Jun 22 '16 at 22:59
  • Also, [related and possibly duplicate](http://stackoverflow.com/q/22937618/1863564), though I'm not sure. – Nic Jun 22 '16 at 22:59

1 Answer


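You'll have to wrap everything that belongs to group 3, including the literal `:price{"` in front of the number, in parentheses and make that whole group optional. I'm not sure exactly what you mean by "group 3", but going by what your strings contain, this is how your regex should look: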

cost{"([0-9]+\.[0-9]+)[0-9"{]},+[a-zA-Z0-9]+{"+([0-9]+\.[0-9]+)[0-9A-Za-z"},{[\]\._-]+(:price{"([0-9]+\.[0-9]+))?

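Notice the `()` around the last part; that's what makes the regex treat the whole thing as a single unit. A `?` applies only to the token immediately before it, so in your original pattern it made just the final capture group optional, while the literal `:price{"` in front of it was still required. Be aware that the added parentheses are themselves a capturing group, so the price now ends up in group 4. Writing the wrapper as `(?:...)`, as in the regex101 link in the comments, keeps it in group 3.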
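
For a quick check of the optional tail, here is a minimal sketch in Python (picked only for illustration; the question doesn't name a language). It assumes the non-capturing form `(?:...)?` from the comments so the price stays in group 3, and it escapes the literal braces because some regex engines reject a bare `{`. The names `ORIGINAL`, `PATTERN`, and `extract_values` are hypothetical.

```python
import re

STRING_1 = ('cost{"3.00{},asdjdfhjkf23hawoutcome{"4.00"},'
            '79234gh3k2bdfsfgs2323g23jkg23{[]._-,bonus:price{"100.00"}'
            'jksdfjksdf222sdcfszfSDAWFD;')
STRING_2 = ('cost{"5.00{},asdjdfhjkf23hawoutcome{"36.00"},'
            '79234gh3k2bdfsfgs2323g23jkg23{[]._-,'
            'jksdfjksdf222sdcfszfSDAWFD;')

# Shared prefix of both patterns: captures groups 1 and 2, then skips filler.
PREFIX = (r'cost\{"([0-9]+\.[0-9]+)[0-9"{]\},+[a-zA-Z0-9]+\{"+([0-9]+\.[0-9]+)'
          r'[0-9A-Za-z"},{\[\]._-]+')

# Question's pattern: only the number is optional, ':price{"' is still required.
ORIGINAL = re.compile(PREFIX + r':price\{"([0-9]+\.[0-9]+)?')

# Sketch of the fix: the label and the number are optional together.
# (?:...) does not capture, so the price remains group 3.
PATTERN = re.compile(PREFIX + r'(?::price\{"([0-9]+\.[0-9]+))?')


def extract_values(line):
    """Return (group1, group2, group3); group3 is None if the tail is absent."""
    match = PATTERN.search(line)
    return match.groups() if match else None


if __name__ == "__main__":
    print(ORIGINAL.search(STRING_2))  # None: the mandatory ':price{"' is missing
    print(extract_values(STRING_1))   # ('3.00', '4.00', '100.00')
    print(extract_values(STRING_2))   # ('5.00', '36.00', None)
```

The filler character class contains no `:`, so its greedy match stops right before `:price` and the optional group gets a chance to match whenever the tail is present.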

Nic