Hi, I've been trying to create a regex in Java to match JSON data that I have converted to a string using json_encode. I've been reading through examples on Stack Overflow, but I'm not sure whether they apply only to pure JSON or also to strings that contain a JSON representation. I've tried applying a few but can't get them to work.

Looking here:

Regex to validate JSON

And here:

Regex to match a JSON String

This is the string I am trying to match with my regex:

[{"resourceUri":"file:\/home\/admin\/test-modeling\/apache-tomcat-7.0.70\/temp\/IOS\/filetest-file-files.txt#\/\/@statements.12\/@typeList.0\/@enumLiterals.11","severity":"WARNING","lineNumber":333,"column":9,"offset":7780,"length":24,"message":"Enum name should be less than 20 characters"}]

I've tried using this answer and it matches fine when I use regex101 for testing.

https://stackoverflow.com/a/6249375/5476612

I'm using this regex from here:

/\A("([^"\\]*|\\["\\bfnrt\/]|\\u[0-9a-f]{4})*"|-?(?=[1-9]|0(?!\d))\d+(\.\d+)?([eE][+-]?\d+)?|true|false|null|\[(?:(?1)(?:,(?1))*)?\s*\]|\{(?:\s*"([^"\\]*|\\["\\bfnrt\/]|\\u[0-9a-f]{4})*"\s*:(?1)(?:,\s*"([^"\\]*|\\["\\bfnrt\/]|\\u[0-9a-f]{4})*"\s*:(?1))*)?\s*\})\Z/is

However, when I try to use it as a String in Java I get escaped-character issues.

Can anyone help me fix the regex so that it works as a Java String, or help me create one that will work?

EDIT 1: Here's the full String in which I am trying to find the JSON string above:

../../tool/model/toolingValidationReport.php?fileName=test-testing-types.txt&fileSize=18380&validationReport=[{"resourceUri":"file:\/home\/admin\/test-modeling\/apache-tomcat-7.0.70\/temp\/IOS\/filetest-file-files.txt#\/\/@statements.12\/@typeList.0\/@enumLiterals.11","severity":"WARNING","lineNumber":333,"column":9,"offset":7780,"length":24,"message":"Enum name should be less than 20 characters"}] target=

EDIT 2: Here's the Java I am using to perform the regex check. The href variable contains the String content shown in edit 1.

Pattern validationReportPattern = Pattern.compile(getValidationReportPattern());
Matcher validationReportMatcher = validationReportPattern.matcher(href);

public String getValidationReportPattern(){
   return "(\\[\\{.*\\}])"; 
}

String validationReport  = validationReportMatcher.group(1);
Karol Dowbecki
olliejjc16
  • What are you trying to do , are you trying to see if an attribute in the JSON is present ? Trying to parse this into a JSON Object ? What are you trying to match ? – Ramachandran.A.G Jun 30 '17 at 09:29
  • I'm unsure what exactly you want to match. For example with `\[{.*}]` you match everything inside `[{` and `}]`, see in [regex101](https://regex101.com/r/aKwHhK/1). In java you would have to use: `\\[{.*}]` to escape the backslash. – Andre Kampling Jun 30 '17 at 09:32
  • Hi Andre, thank you for the answer. I'm trying to match to the json string I listed above, I'll update the question to make that clearer. Your answer again works fine in regex101 but when I try and use it in Java and escape it it's still not matching the string. This is the regex I'm now trying to use in Java: `"(\\[\\{.*\\}])"` – olliejjc16 Jun 30 '17 at 10:16
  • @olliejjc16 I don't think you want to escape `{` or `}`. To use your regex in Java just escape every `\ ` and `"` from your original regex. – kalsowerus Jun 30 '17 at 10:31
  • @kalsowerus I had to use the escapes for them as I would get an illegal repetition error if I didn't – olliejjc16 Jun 30 '17 at 10:33
  • It works fine for me (with the escaped `{}`). Are you sure your JSON does not contain any leading or trailing whitespace characters? – kalsowerus Jun 30 '17 at 10:43
  • Ya I can't figure it out, I've trimmed the string so that should prevent the whitespace characters issue right? Its really annoying because it matches fine on online regex matchers but I keep getting no match found error – olliejjc16 Jun 30 '17 at 11:08
  • [`trim()`](https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#trim()) just removes leading and trailing whitespaces. Maybe there is whitespace between `[` and `{` or `]` and `}`. Why not just print the string out? – Andre Kampling Jun 30 '17 at 11:46
  • @AndreKampling hi printed out the entire string I'm searching through like you said its in my main post – olliejjc16 Jun 30 '17 at 12:17
  • Post the calls in your code where you trying to match... – Andre Kampling Jun 30 '17 at 12:19
  • @AndreKampling Added in edit 2 – olliejjc16 Jun 30 '17 at 12:27
  • If you copy the regex from regex101 and paste it in IntelliJ / Android studio, it should automagically escape the necessary characters and work. – RobCo Jun 30 '17 at 12:29
  • Looks like an XY Problem to me. Regex is not a good tool for consuming JSON. https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem – slim Jun 30 '17 at 13:47
  • @slim: Generally I would agree when talking about parsing that string. But he just want to extract the Json string. After this he can parse it as you said. – Andre Kampling Jul 01 '17 at 07:12
  • It's a URL path, so the smart thing to do is to parse it as a URL and take the relevant query parameter. Again, URL libraries are easy to find. – slim Jul 01 '17 at 08:57

1 Answer

The regex pattern can be written as "(\\[\\{.*}])" in Java (your version with the escaped } also compiles), but the real problem is that no match has been attempted. You have to call find() before calling group().

If you do that it works, see here online at ideone.

Output:

Match: [{"resourceUri":"file:\/home\/admin\/test-modeling\/apache-tomcat-7.0.70\/temp\/IOS\/filetest-file-files.txt#\/\/@statements.12\/@typeList.0\/@enumLiterals.11","severity":"WARNING","lineNumber":333,"column":9,"offset":7780,"length":24,"message":"Enum name should be less than 20 characters"}]

The find() method returns a boolean so that you can check whether there is an occurrence. If you call group() without a successful find() first, you get a java.lang.IllegalStateException: No match found.

if (validationReportMatcher.find())
{
   String validationReport  = validationReportMatcher.group(1);
   System.out.println ("Match: " + validationReport);
}
else
{
   System.out.println ("No match");
}
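
For reference, here is a self-contained sketch of the check above that can be compiled and run on its own. The class name ValidationReportDemo, the helper extractReport and the shortened href value are assumptions made for illustration; only the pattern and the find()/group() calls come from the code in this post.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not taken from the original code.
final class ValidationReportDemo {

    // Same pattern as above: a literal "[{", anything (greedy), then "}]".
    private static final Pattern REPORT_PATTERN = Pattern.compile("(\\[\\{.*}])");

    // Returns the captured JSON array text, or an empty Optional if there is none.
    static Optional<String> extractReport(String href) {
        Matcher matcher = REPORT_PATTERN.matcher(href);
        // find() has to succeed before group() may be called.
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the href shown in the question.
        String href = "report.php?fileSize=18380&validationReport="
                + "[{\"severity\":\"WARNING\",\"lineNumber\":333}] target=";
        System.out.println(extractReport(href).orElse("No match"));
    }
}
```

Wrapping the result in an Optional is only one way to make the "no match" case explicit; the if/else shown above does the same job.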

If you have to search for multiple matches you call find() in a while loop:

while (validationReportMatcher.find())
{
   String validationReport  = validationReportMatcher.group(1);
   System.out.println ("Match: " + validationReport);
}

But this does not seem to be necessary, as you are only looking for one occurrence.
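
Should several reports ever appear in one string, keep in mind that the greedy .* in the pattern above runs from the first [{ to the last }], so the loop would see one big match. A minimal sketch with a reluctant quantifier (.*?) follows; the class MultiReportDemo, the method findAll and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not taken from the original code.
final class MultiReportDemo {

    // Reluctant ".*?" stops at the first "}]" instead of the last one.
    private static final Pattern REPORT_PATTERN = Pattern.compile("(\\[\\{.*?}])");

    // Collects every "[{ ... }]" block found in the input, in order.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = REPORT_PATTERN.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up input containing two separate JSON arrays.
        String input = "a=[{\"x\":1}]&b=[{\"y\":2}]";
        for (String match : findAll(input)) {
            System.out.println("Match: " + match);
        }
    }
}
```

The reluctant version cuts a match short if the JSON itself contains }] (for example a nested array of objects), so for anything beyond simple extraction a JSON parser is the safer choice.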

Andre Kampling