-1

I need to find a regex to assert the presence of certain keys in a JSON object.

Example, let's say I have a JSON object like this

{"key1": {...}, "key2": [...], "key3": "some id", "key4": "irrelevant"}

I need a regex that asserts that, for example, key1, key2 and key3 are there.

Note that in a JSON the order of elements is irrelevant.

I've been searching in the web, including here on stackoverflow, and the only solution that seemed it would solve my problem was this

^(?=.*\bkey1\b)(?=.*\bkey2\b)(?=.*\bkey3\b).*$

provided here, but it's not working for me. It doesn't match anything in my JSON object.

Does anyone know why? Is there a better solution?

Thanks

Marco Castanho
  • 395
  • 1
  • 7
  • 24

4 Answers4

0

Regex isn't a good choice for this kind of task but as you mentioned in your comments that you only need a regex solution, you need to correct your regex to be something like this,

^(?=.*"key1":)(?=.*"key2":)(?=.*"key3":).*$

Your lookaheads (?=.*\bkey1\b) will allow matching of key1 anywhere in the text including value due to which it won't enforce them as keys.

But as the keys are surrounded by doublequotes and followed by a colon, hence I've used (?=.*"key1":). Also if you think there can be space between " and : then modify the above regex to take care of optional spaces as well and write it as,

^(?=.*"key1"\s*:)(?=.*"key2"\s*:)(?=.*"key3"\s*:).*$

Check this demo

Pushpesh Kumar Rajwanshi
  • 18,127
  • 2
  • 19
  • 36
  • Thanks. But it's not working either. I suspect it might be something related with non-escaped special characters. I'm doing some experiments on your solution. – Marco Castanho Mar 28 '19 at 13:57
  • @MarcoCastanho: I've added a demo and there it is working as expected. Looks like there is some other issue. – Pushpesh Kumar Rajwanshi Mar 28 '19 at 13:58
  • Yes, I also tested your solution on a website, but as I told you, I suspect my software does (or expects) some kind of special characters escaping. – Marco Castanho Mar 28 '19 at 14:05
  • 1
    @MarcoCastanho: Although none of the characters needed escaping but if you doubt that escaping may help, then I suggest you to play with your tool to see what regex work as literals and what needs to be escaped. Go with the characters in my regex one by one. – Pushpesh Kumar Rajwanshi Mar 28 '19 at 14:08
0

You can just check if the keys exist in the dictionary or not:

import json

json_string = '{ "key1": "some id", "key2": "some id", "key3": "some id", "key4": "irrelevant" }'

# Deserialize the JSON string into a Python dictionary  
deserialized_dict = json.loads(json_string)

# Check if Key1, Key2, Key3 keys exist in the dictionary or not
if "key1" and "key2" and "key3" in deserialized_dict:
    print ("All keys are present")
else:
    print ("Keys are absent")
Kunal Mukherjee
  • 5,775
  • 3
  • 25
  • 53
  • The software is developed in python but I cannot add python code to it. It receives a message containing a JSON string. I have to write an expected message that the software will compare with the received one(s). The only way I can check if the keys exist is by adding regex on the expected message. – Marco Castanho Mar 28 '19 at 13:52
0

EDIT: Oh I see that you removed the Python Tag Now

.

EDIT:

This should grab them even if there are spaces in the keys

(\"[^,]+?\")[\s]*:

.

Still either way try these to see if they output the keys (but remember, these are for keys that don't contain spaces)

(\"[\S]+\")[\s]*:

.

(?:(?<=\")([\S]+?)\")[\s]*:

.

I think @"Kunal Mukherjee" has the best solution.

If you want to find the present keys without knowing the key names in advance though, this might help PROVIDED THAT YOU DON'T EXPECT THE KEYS TO HAVE SPACES

>>> import re

>>> string = '''{"key1": {...}, "key2": [...], "key3": "some id", "key4": "irrelevant"}'''



#OUTPUT
>>> re.findall('(\"[^,]+?\")[\s]*:', string)
['"key1"', '"key2"', '"key3"', '"key4"']



#OUTPUT
>>> re.findall('(\"[\S]+?\")[\s]*:', string)
['"key1"', '"key2"', '"key3"', '"key4"']



#OUTPUT
>>> re.findall('(?:(?<=\")([\S]+?)\")[\s]*:', string)
['key1', 'key2', 'key3', 'key4']
FailSafe
  • 482
  • 4
  • 12
0

I know that you don't want Python code, but I have included a working regex statement and used Python to test it. Assuming that you only care about whether or not key1, key2, and key3 are in your JSON, you could use the following pattern:

'"key1":.+"key2":.+"key3":.+'

The specifics will depend on your use case (for example, if you had {"my_key": {"key3": [...]}}, you might want to tweak the pattern depending on whether or not you consider a nested key to be valid). However, it works with the example you have given.

As Python code:

import re

pattern = re.compile(r'"key1":.+"key2":.+"key3":.+')

my_dict_str = r'{"key1": {...}, "key2": [...], "key3": "some id", "key4": "irrelevant"}'

print(pattern.search(my_dict_str))

Output

<re.Match object; span=(1, 71), match='"key1": {...}, "key2": [...], "key3": "some id", >
Jack Moody
  • 1,590
  • 3
  • 21
  • 38