-1

I have a list of dictionaries that are coming out unquoted, such as below:

'[{reason: NA, employeeName: bob smith}, {reason: NA, employeName: tom jones}]'

I need to convert this to proper JSON, like this:

[{"reason": "NA", "employeeName": "bob smith"}, {"reason": "NA", "employeName": "tom jones"}]

How can I accomplish this?

Tom Smith

3 Answers

1

The idea: take the string, split it into the individual dictionary representations, and handle each one separately as a string. Each dictionary is then rebuilt by splitting it into key: value pairs, using the text before the colon as the key and the text after it as the value. For example:

list_dict1 = '[{reason: NA, employeeName: bob smith}, {reason: NA, employeName: tom jones}]'
result = []
# Strip the outer "[{" and "}]", then split between the dictionaries.
# Splitting on ', ' alone would also cut each dictionary apart at its own commas.
list_dict2 = list_dict1.strip('[]{}').split('}, {')
for s in list_dict2:
    # s is the inside of one "dictionary", e.g. "reason: NA, employeeName: bob smith"
    d = dict(
        # split each "key: value" pair on the first ':' only, so values may contain colons
        tuple(part.strip() for part in x.split(':', 1))
        for x in s.split(',')
    )
    # append each rebuilt dictionary to the result list
    result.append(d)
print(result)
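
To get the JSON string the question asks for, the steps above could be packaged roughly as follows. This is only a sketch: the function name `to_json` is my own (not from any library), and it assumes that keys and values contain no commas or braces and that the first colon in each pair separates the key from the value.

```python
import json

def to_json(text):
    """Convert '[{k: v, ...}, ...]' with unquoted keys/values to a JSON string.

    Hypothetical helper; assumes values contain no ',' and no '}, {'.
    """
    body = text.strip().strip('[]').strip()
    if not body:
        return '[]'
    records = []
    # Drop the outermost braces, then split between the dictionaries.
    for chunk in body.strip('{}').split('}, {'):
        # Split each pair on the first ':' only, so values may contain colons.
        pairs = (item.split(':', 1) for item in chunk.split(','))
        records.append({k.strip(): v.strip() for k, v in pairs})
    return json.dumps(records)

print(to_json('[{reason: NA, employeeName: bob smith}, {reason: NA, employeName: tom jones}]'))
```

Because every value is kept as a string, `NA` comes out as `"NA"`, which matches the format requested in the question.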
Ran A
1

You can use regular expressions for that. On this site, you can test and learn how to use them.

Here is how I would solve your problem:

import json
import re

l = '[{reason: NA, employeeName: bob smith}, {reason: NA, employeName: tom jones, createdAt: 2021-04-28 17:04:52.684064+00:00}]'
res = []
objects = re.findall(r'{[^}]*}', l)
for o in objects:
    attr = re.findall(r'([^:,{]*):([^,}]*)', o)
    res.append({}) 
    for a in attr:
        res[-1][a[0].strip()] = a[1].strip()
print(json.dumps(res))
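
As a quick check of the timestamp case, here is a sketch that wraps the same two regexps in a function. The name `parse_records` is hypothetical, and the approach still assumes that values contain no commas or closing braces.

```python
import json
import re

def parse_records(text):
    # Hypothetical helper: the same two regexps as above, wrapped in a function.
    records = []
    for obj in re.findall(r'{[^}]*}', text):
        # The key stops at the first ':'; the value runs up to the next ',' or '}',
        # so colons inside a value (as in a timestamp) are kept.
        pairs = re.findall(r'([^:,{]*):([^,}]*)', obj)
        records.append({key.strip(): value.strip() for key, value in pairs})
    return records

sample = '[{reason: NA, createdAt: 2021-04-28 17:04:52.684064+00:00}]'
print(json.dumps(parse_records(sample)))
```

The value pattern only excludes `,` and `}`, which is why the colons in the time and UTC offset stay part of the `createdAt` value.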
Dj0ulo
  • So this works for the most part, except I have some fields that are a date timestamp: createdAt: 2021-04-28 17:04:52.684064+00:00 Turns into "createdAt": "2021", "27 14": "44", "": "08", "00": "00" Any workaround for that? @Dj0ulo – Tom Smith Apr 07 '22 at 13:38
  • No problem I edited the second regexp ! – Dj0ulo Apr 07 '22 at 14:30
-1

Here is a small Python example using the json library.

import json

data = [{"reason": "NA", "employeeName": "bob smith"}, {"reason": "NA", "employeName": "tom jones"}]

json.dumps(data)
# output:
# '[{"reason": "NA", "employeeName": "bob smith"}, {"reason": "NA", "employeName": "tom jones"}]'
Exciter