I have a String like this:

 [potato=carrot,test=12b,apple=peer,tree={oak=1,birch={value=3}},foo=bar]

and I would like to use a Java regex to get a String array:

potato=carrot
test=12b
apple=peer
tree={oak=1,birch={value=3}}
foo=bar

I have tried several patterns, but none gave a conclusive result. If anyone has an idea... :)

Thank you in advance!

Yeti
Oromis
  • This might be a little too broad, what have you tried already? – Yassin Hajaj May 12 '18 at 19:51
  • This isn't too broad. It is a valid question which simply can't be solved with the traditional String.split method and a regex, since there are nested brackets within the String. It can however be done with a small parser method. – DevilsHnd - 退職した May 12 '18 at 20:31
  • @DevilsHnd these questions are borderline too broad. Regex is almost a programming language itself. If someone were to ask "Write a function [which does the above] without using regex", that would be considered too broad. The real issue I have with questions like these is that they promote poor coding choices. Using a lengthy regex which the OP wouldn't understand will just cause headaches for other developers working on the project in the future. Problems like this generally have an existing library that will do the job better than whatever solution is posted here. – johnny 5 May 13 '18 at 01:37

2 Answers

Instead of inventing/parsing your own data-format, have you considered using JSON instead? It looks very similar to what you have above. Example.

If you decide to use JSON, parsing the input into structured data can easily be done with one of the many Java JSON libraries. Example.

If you have no control over the input and absolutely have to parse the format given above, here is one approach you can take, although it is cumbersome:

  1. Find all {...} blocks. Do this by iterating over the input, character-by-character, until you find an opening {. And when you do, continue iterating character-by-character until you find its matching }. Note that while doing this, you have to keep track of, and ignore, any nested {...} blocks.

  2. Once you've found a {...} block, replace it with 15 random alphanumeric characters, e.g. of9823ghownkd71

  3. For every randomly generated value above, use a HashMap to keep track of the {...} block it has replaced.

  4. Use string.split(",") to convert the modified input into a string array.

  5. Iterate through every entry in the string array and check whether it contains any of the keys in your HashMap (from step 3). If so, replace that key with the matching {...} block.

The above algorithm is much more complex and error-prone than parsing JSON with a library. Use JSON inputs instead, if at all possible.
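For illustration, here is a minimal sketch of a simpler variant of the same idea. It is not the placeholder algorithm from the steps above: instead of substituting random tokens, it makes a single pass over the input, counts the current {...} nesting depth (as in step 1), and splits only at commas found at depth zero. The class name TopLevelSplitter and the method split are made up for this example, and the sketch assumes the input is wrapped in [ ] and that braces never appear inside keys or values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class TopLevelSplitter {

    // Splits "[a=b,c={d=e,f=g}]" into its top-level entries,
    // ignoring commas that sit inside {...} blocks.
    static List<String> split(String input) {
        String s = input.trim();
        // Assumption from the sample input: the whole list is wrapped in [ ].
        if (s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1);
        }
        List<String> parts = new ArrayList<>();
        int depth = 0;  // current {...} nesting depth
        int start = 0;  // index where the current entry begins
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("Unmatched '}'");
                }
            } else if (c == ',' && depth == 0) {
                // A comma outside every block ends the current entry.
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unmatched '{'");
        }
        if (start < s.length()) {
            parts.add(s.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "[potato=carrot,test=12b,apple=peer,tree={oak=1,birch={value=3}},foo=bar]";
        split(input).forEach(System.out::println);
    }
}
```

Calling split(input).toArray(new String[0]) would give the String array asked for in the question. Because the depth counter handles arbitrary nesting, no placeholder bookkeeping is needed, but quoted values containing commas or braces would still need extra handling.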

RvPr
You can try this regex:

Raw: [^\[\],{}]+=(?:[^\[\],{}]+|(?=\{)(?:(?=.*?\{(?!.*?\1)(.*\}(?!.*\2).*))(?=.*?\}(?!.*?\2)(.*)).)+?.*?(?=\1)[^{]*(?=\2$))

Stringed: "[^\\[\\],{}]+=(?:[^\\[\\],{}]+|(?=\\{)(?:(?=.*?\\{(?!.*?\\1)(.*\\}(?!.*\\2).*))(?=.*?\\}(?!.*?\\2)(.*)).)+?.*?(?=\\1)[^{]*(?=\\2$))"

Demo: http://java-regex-tester.appspot.com/regex/14b6e038-b683-44cd-b46e-c161b0cd9001

Note that for this sample input you can replace [^\[\],{}]+ with \w+ and you'd get the same result.

Readable version:

 [^\[\],{}]+ = 
 (?:
      [^\[\],{}]+ 
   |  
      (?= \{ )
      (?:
           (?=
                .*? \{
                (?! .*? \1 )
                (                             # (1 start)
                     .* \}
                     (?! .* \2 )
                     .* 
                )                             # (1 end)
           )
           (?=
                .*? \}
                (?! .*? \2 )
                ( .* )                        # (2)
           )
           . 
      )+?
      .*? 
      (?= \1 )
      [^{]* 
      (?= \2 $ )
 )
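As a rough usage sketch, the "Stringed" pattern above can be applied with java.util.regex by calling find() repeatedly and collecting each whole match. The class name NestedEntryRegex and the method entries are invented for this example. The pattern relies on Java treating a backreference to a group that has not matched yet as a failed match, so it should not be assumed to behave the same way in other regex engines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the regex from this answer.
class NestedEntryRegex {

    // The "Stringed" pattern from the answer, copied unchanged.
    private static final Pattern ENTRY = Pattern.compile(
        "[^\\[\\],{}]+=(?:[^\\[\\],{}]+|(?=\\{)(?:(?=.*?\\{(?!.*?\\1)(.*\\}(?!.*\\2).*))(?=.*?\\}(?!.*?\\2)(.*)).)+?.*?(?=\\1)[^{]*(?=\\2$))");

    // Collects every key=value match, where value is either a plain token
    // or a balanced {...} block.
    static List<String> entries(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "[potato=carrot,test=12b,apple=peer,tree={oak=1,birch={value=3}},foo=bar]";
        String[] parts = entries(input).toArray(new String[0]);
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

For the sample string this should yield the five entries listed in the question, with the nested tree={...} block kept as a single element. The pattern ends in (?=\2$), so it expects the whole input on one line.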