I have the following JSON string:

{"dependencies":["xz","pkg-config","glib","gobject-introspection"],"conflicts_with":[],"caveats":null,"options":[{"option":"--universal","description":"Build a universal binary"}]}

I wrote a regular expression to find the array that follows "dependencies":

(?<=\"dependencies\":).*[^:](?=,)

As a Java string literal:

"(?<=\\\"dependencies\\\":).*[^:](?=,)"

However, the match turns out to be:

["xz","pkg-config","glib","gobject-introspection"],"conflicts_with":[],"caveats":null,"options":[{"option":"--universal"

Only the last colon was excluded.

Please help me out.

Alan Moore
spacegoing
  • Did you try `"(?<=\\\"dependencies\\\":).*?[^:](?=,)"` ? – Wiktor Stribiżew Apr 08 '15 at 12:00
  • @stribizhew Yes, and the result is `["xz"`. It seems like `[^:]` does not take any effect. – spacegoing Apr 08 '15 at 12:03
  • And what if you use `(?<=\\\"dependencies\\\":).*?(?=(?<=\\]),)` ? – Wiktor Stribiżew Apr 08 '15 at 12:06
  • `(?<=\"dependencies\":)[^\]]*\]` – Alex Salauyou Apr 08 '15 at 12:08
  • The question is " [^:] " doesn't exclude " ] ". Even though that expression returns the "right" result, it cannot be generalized. I want to find the value behind any given key, and not every value is an array. – spacegoing Apr 08 '15 at 12:10
  • Please post the expected result. If you want to exclude `[` and `]`, you can use `(?<=\\\"dependencies\\\":\\[).*?(?=\\],)` regex. – Wiktor Stribiżew Apr 08 '15 at 12:11
  • @SashaSalauyou Yes, this also gets the right result. But could you please explain why the negation on `:` doesn't work in my expression? – spacegoing Apr 08 '15 at 12:13
  • @stribizhev The thing is, I want to get the value for any given key. The regex I came up with is meant to find the longest sequence that ends before a `,` and contains no `:`. This is what `[^:](?=,)` is supposed to mean. But why doesn't the negation on `:` work? Many thanks! – spacegoing Apr 08 '15 at 12:18
  • @ChangLi Because the `.*` part takes as many characters as possible; quantifiers are greedy by default. `.*[^:]` means "any number of any characters, followed by one character that is not a colon". For example, `abc:bca` satisfies it. – Alex Salauyou Apr 08 '15 at 12:18
  • I guess you wanted `[^:]*` to match any character but `:` any number of times. `.*` is greedy. – Wiktor Stribiżew Apr 08 '15 at 12:20
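
The point made in the comments, that a greedy `.*` can swallow colons while `[^:]` only constrains a single character, can be checked with a minimal Java sketch. The helper `firstMatch` and the short sample input are made up for illustration and are not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsNegatedClass {
    // Hypothetical helper: returns the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "x:[1,2],y:3,z"; // made-up sample, not the question's JSON

        // Greedy .* may swallow colons; [^:] only constrains the final character.
        System.out.println(firstMatch("(?<=x:).*[^:](?=,)", input)); // [1,2],y:3

        // [^:]+ forbids a colon anywhere in the match, so it stops at the first value.
        System.out.println(firstMatch("(?<=x:)[^:]+(?=,)", input));  // [1,2]
    }
}
```

In the first pattern the engine backtracks from the end of the input only until the last character is not a colon and a comma follows, which is why the match runs past the first value.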

3 Answers

I suggest using this regex:

(?<=\"dependencies\":).[^:]*(?=,)

Or, almost equivalently:

(?<=\"dependencies\":)[^:]+(?=,)
Wiktor Stribiżew
  • However, I think the `.` is unnecessary; it is already covered by `[^:]*`. `(?<=\"dependencies\":)[^:]+(?=,)` is enough. `+` requires at least 1 character in the match. – Wiktor Stribiżew Apr 08 '15 at 12:24
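
As a rough sketch of how the `[^:]+` pattern might be used from Java for an arbitrary key, which is what the question ultimately asks for: the class name, the `valueAfter` helper, and the use of `Pattern.quote` for the key are assumptions made for this example, not something taken from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValueAfterKey {
    // The JSON string from the question.
    static final String SAMPLE = "{\"dependencies\":[\"xz\",\"pkg-config\",\"glib\",\"gobject-introspection\"],"
            + "\"conflicts_with\":[],\"caveats\":null,"
            + "\"options\":[{\"option\":\"--universal\",\"description\":\"Build a universal binary\"}]}";

    // Hypothetical helper: the colon-free text after "key": that is followed by a comma.
    static String valueAfter(String json, String key) {
        Pattern p = Pattern.compile("(?<=\"" + Pattern.quote(key) + "\":)[^:]+(?=,)");
        Matcher m = p.matcher(json);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(valueAfter(SAMPLE, "dependencies")); // ["xz","pkg-config","glib","gobject-introspection"]
        System.out.println(valueAfter(SAMPLE, "caveats"));      // null (the JSON literal, matched as text)
        System.out.println(valueAfter(SAMPLE, "options"));      // null (no match: the value itself contains a colon)
    }
}
```

The last call shows the limit of this approach: any value that contains a colon, such as a nested object or a URL, is not matched, and the last value in the object is missed because no comma follows it.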

You could use a non-greedy zero-or-more quantifier:

(?<=\"dependencies\":)\[(.*?)\]

This would match ["xz","pkg-config","glib","gobject-introspection"] in the provided JSON.
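
A minimal Java sketch of that pattern, assuming the array is flat and its items are plain strings without commas or `]` inside them. The class and method names are made up for illustration; `group(1)` is the text captured between the brackets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyBracketMatch {
    // The JSON string from the question.
    static final String SAMPLE = "{\"dependencies\":[\"xz\",\"pkg-config\",\"glib\",\"gobject-introspection\"],"
            + "\"conflicts_with\":[],\"caveats\":null,"
            + "\"options\":[{\"option\":\"--universal\",\"description\":\"Build a universal binary\"}]}";

    // Lazy .*? stops at the first closing bracket after "dependencies":[
    static final Pattern DEPS = Pattern.compile("(?<=\"dependencies\":)\\[(.*?)\\]");

    // Hypothetical helper: splits the captured group naively on commas and strips the quotes.
    static List<String> dependencies(String json) {
        List<String> result = new ArrayList<>();
        Matcher m = DEPS.matcher(json);
        if (m.find() && !m.group(1).isEmpty()) {
            for (String item : m.group(1).split(",")) {
                result.add(item.replace("\"", "").trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dependencies(SAMPLE)); // [xz, pkg-config, glib, gobject-introspection]
    }
}
```

Because the lazy quantifier ends at the first `]`, a nested array inside the value would be cut short.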

Marchev

This brings to mind parsing HTML with a regular expression. Have you tried using a JSON parser instead?

import java.io.StringReader;
import javax.json.Json;
import javax.json.JsonObject;
import javax.json.JsonReader;
// Parse the whole document, then look up the "dependencies" array by key.
String jsonString = "{\"dependencies\":[\"xz\",\"pkg-config\",\"glib\",\"gobject-introspection\"],\"conflicts_with\":[],\"caveats\":null,\"options\":[{\"option\":\"--universal\",\"description\":\"Build a universal binary\"}]}";
JsonReader jsonReader = Json.createReader(new StringReader(jsonString));
JsonObject obj = jsonReader.readObject();
jsonReader.close();
String dependencies = obj.getJsonArray("dependencies").toString();
Community
MT0