
I have to process a JSON file in order to flatten it. General-purpose flatten libraries won't work on my file, so I am trying to capture a text pattern with a regex instead. I need to capture the "," inside text of the form [{"sometext","more_text"}] so that I can replace it with \",\". I have tried a lot but can't capture just the ","; my current regex, which captures the entire bracketed text, is

\[\{".*?\}\]

Sample json

{"action":"add","actions":[{"country","IND"}],"ctx":"{\"charset\":\"UTF-8\",\"client_id\":\"G9.2.2045341156.1629864410\",\"cookie\":\"1\",\"domain_user\":\"28f51979-bef8-4731-b242-e3a00496d9ba\",\"fp\":\"659810303\",\"id\":\"d368f505-6226-4650-aeba-caf9fb18fc53\",\"page_res\":\"478x4952\",\"refr\"

Example: https://regex101.com/r/vcjPQx/1

The regex should select only the single occurrence of "," that sits inside both the square brackets and the curly braces.

palamuGuy
  • Does this answer your question? [How to flatten a nested JSON recursively, with flatten\_json](https://stackoverflow.com/questions/58442723/how-to-flatten-a-nested-json-recursively-with-flatten-json) – anotherGatsby Mar 27 '22 at 08:28
  • ... both files are NOT valid JSON, they are concatenations of JSON strings ... there are even multiple JSON objects per line ... – azro Mar 27 '22 at 08:29
  • So I won't help because I don't want to deal with such bad content, but don't use regex: use json.loads and json.dumps to convert between Python structures and strings, please – azro Mar 27 '22 at 08:32
  • @anotherGatsby unfortunately my file is not in a format expected by flatten_json library – palamuGuy Mar 27 '22 at 09:31

2 Answers


You can use pattern

(\[\{[^\{\[\]\}]*)(\",\")([^\[\]\{\}]*\}\])

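which contains three capturing groups: the text from [{ up to the separator, the "," separator itself, and the text from the separator up to }]. The first and third groups are kept as matched, using the replacement pattern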

\1\\\",\\\"\3

which replaces the "," inside the brackets with \",\" .
Here \1 and \3 reinsert the first and third capturing groups unchanged.

https://regex101.com/r/aNHZ04/2
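The same substitution can be sketched in Python with re.sub. This is only a sketch under assumptions: the function name escape_inner_separator and the shortened sample string are made up for illustration, and a replacement function is used instead of the \1...\3 template so that the backslashes do not have to be escaped a second time for Python's replacement syntax.

```python
import re

# Pattern from the answer above: "[{" + text, the "," separator, text + "}]".
PATTERN = re.compile(r'(\[\{[^\{\[\]\}]*)(",")([^\[\]\{\}]*\}\])')


def escape_inner_separator(text: str) -> str:
    """Replace the "," found inside [{...}] with \\",\\" (hypothetical helper)."""
    # Groups 1 and 3 are copied back unchanged; only group 2 is rewritten.
    return PATTERN.sub(
        lambda m: m.group(1) + '\\",\\"' + m.group(3),
        text,
    )


# Shortened, made-up variant of the sample line from the question.
sample = r'{"action":"add","actions":[{"country","IND"}],"ctx":"{\"charset\":\"UTF-8\"}"}'
print(escape_inner_separator(sample))
```

Because the first group is greedy, a bracket that contains several "," separators would have only its last one rewritten per match, which fits the single-separator case described in the question.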

  • This will not work, as I don't want to replace every occurrence of "," but only those falling within square and curly braces, as matched by the regex \[\{".*?\}\]. – palamuGuy Mar 27 '22 at 09:33
  • This looks amazing, can you please point me to some material for learning this level of pattern matching? – palamuGuy Mar 27 '22 at 16:14

Hi, if I understood your problem correctly, I think this regex is your solution:

/{\s*\"[^"]+\"\s*(,)\s*\"[^"]+\"\s*}/gm

See Test results: https://regex101.com/r/y4ocUT/1
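As a rough Python sketch of how this pattern could be used, the snippet below locates the captured comma with finditer. The name comma_spans and the sample string are assumptions for illustration only; the /gm flags from the regex101 form are not needed here, since finditer already returns every match and the pattern contains no ^ or $ anchors.

```python
import re

# Same pattern as above, written as a Python raw string; group 1 is the comma.
PAIR = re.compile(r'\{\s*"[^"]+"\s*(,)\s*"[^"]+"\s*\}')


def comma_spans(text: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of each captured comma (hypothetical helper)."""
    return [m.span(1) for m in PAIR.finditer(text)]


# Shortened, made-up variant of the sample line from the question.
sample = '{"action":"add","actions":[{"country","IND"}]}'
for start, end in comma_spans(sample):
    # Show the comma together with the quotes on either side of it.
    print(start, end, sample[start - 1:end + 1])
```

The offsets can then be used to splice in the escaped separator, or the match can be rewritten directly with re.sub and a replacement function.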

Faizan AlHassan