How can I convert the string below to a dictionary?

code.py

message = event['Records'][0]['Sns']['Message']
print(message) 
# this gives the below and the type is <class 'str'>

 {
   "created_at":"Sat Jun 26 12:25:21 +0000 2021",
   "id":1408763311479345152,
   "text":"@test I\'m planning to buy the car today \ud83d\udd25\n\n",
   "language":"en",
   "author_details":{
      "author_id":1384883875822907397,
      "author_name":"\u1d04\u0280\u028f\u1d18\u1d1b\u1d0f\u1d04\u1d1c\u0299 x NFTs \ud83d\udc8e",
      "author_username":"cryptocurrency_x009",
      "author_profile_url":"https://xxxx.com",
      "author_created_at":"Wed Apr 21 14:57:11 +0000 2021"
   },
   "id_displayed":"1",
   "counter_emoji":{
      
   }
}

I need to add an additional field, "status": 1, so that it looks like this:

{
   "created_at":"Sat Jun 26 12:25:21 +0000 2021",
   "id":1408763311479345152,
   "text":"@test I\'m planning to buy the car today \ud83d\udd25\n\n",
   "language":"en",
   "author_details":{
      "author_id":1384883875822907397,
      "author_name":"\u1d04\u0280\u028f\u1d18\u1d1b\u1d0f\u1d04\u1d1c\u0299 x NFTs \ud83d\udc8e",
      "author_username":"cryptocurrency_x009",
      "author_profile_url":"https://xxxx.com",
      "author_created_at":"Wed Apr 21 14:57:11 +0000 2021"
   },
   "id_displayed":"1",
   "counter_emoji":{
      
   },
   "status": 1
}

What is the best way of doing this?

Update: I managed to get it working, though I'm not sure why it works.

I used ast.literal_eval(message), as shown below.

import ast

D2 = ast.literal_eval(message)  # parse the string as a Python literal
D2["status"] = 1
print(D2)
# This gives the below
    {
   "created_at":"Sat Jun 26 12:25:21 +0000 2021",
   "id":1408763311479345152,
   "text":"@test I\'m planning to buy the car today \ud83d\udd25\n\n",
   "language":"en",
   "author_details":{
      "author_id":1384883875822907397,
      "author_name":"\u1d04\u0280\u028f\u1d18\u1d1b\u1d0f\u1d04\u1d1c\u0299 x NFTs \ud83d\udc8e",
      "author_username":"cryptocurrency_x009",
      "author_profile_url":"https://xxxx.com",
      "author_created_at":"Wed Apr 21 14:57:11 +0000 2021"
   },
   "id_displayed":"1",
   "counter_emoji":{
      
   },
   "status": 1
}

Is there a better way to do this? I'm not sure, so I wanted to check.

snakecharmerb
Adam
  • That already *is* a dictionary. Unless you mean you have a string that *looks* like that, in which case `json.loads` is the answer. Don't touch the `ast` module unless you *really* know your way around Python. How is this data coming into your program? – Silvio Mayolo Jun 26 '21 at 16:13
  • "I also tried using json.loads(data) but it raised an error" *What* error? – enzo Jun 26 '21 at 16:15
  • The contents of the `text` value seem to be invalid - how was this json-like string created? – snakecharmerb Jun 26 '21 at 16:19
  • This was created using Simple Notification Service in AWS (SNS), which sends the message to Lambda. The message placed in SNS is a JSON object converted to a JSON string (using JSON.dumps). The data is being streamed from Twitter. – Adam Jun 26 '21 at 16:35

2 Answers

How can I convert the string below to a dictionary?

As far as I can tell, data = { } assigns a dictionary with content to the variable data.

I would need to add an additional field called "status" : 1 such that it looks like this

A simple update should do the trick.

data.update({"status": 1})
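
A minimal sketch of that update, assuming data really is a dict already. The trimmed-down payload below is made up for illustration; if data is a str, as in the question, it has to be parsed before either line will work.

```python
# Hypothetical, trimmed-down version of the payload from the question.
data = {"id": 1408763311479345152, "language": "en", "counter_emoji": {}}

# update() merges the given key/value pairs into the existing dict in place.
data.update({"status": 1})

# For a single key, plain item assignment has the same effect.
data["status"] = 1

print(data)
```

Calling either line on a str raises an error instead (AttributeError for update, TypeError for item assignment), which is a quick way to tell the two cases apart.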
Seth

I found two issues when trying to deserialise the string as JSON

  • invalid escape I\\'m
  • unescaped newlines

These can be worked around with

import json
import re
data = data.replace("\\'", "'")  # \' is not a valid JSON escape
data = re.sub('\n\n"', '\\\\n\\\\n"', data)  # the fourth positional argument is count, not flags
d = json.loads(data)

There are also surrogate pairs in the data which may cause problems down the line. These can be fixed by doing

data = data.encode('utf-16', 'surrogatepass').decode('utf-16')

before calling json.loads.
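
As a small illustration of what that line does, here is a sketch that assumes the string holds actual lone surrogate code points (rather than the literal text \ud83d). The sample string is made up.

```python
# Two lone surrogate code points, as produced when "\ud83d\udd25" is
# evaluated as a Python string literal rather than decoded as JSON.
broken = "fire \ud83d\udd25"

# Encoding this string as UTF-8 fails because lone surrogates are not
# valid in UTF-8.
try:
    broken.encode("utf-8")
except UnicodeEncodeError:
    print("cannot encode lone surrogates as UTF-8")

# surrogatepass lets the surrogates through to UTF-16, and decoding
# UTF-16 joins the pair into the single code point U+1F525.
fixed = broken.encode("utf-16", "surrogatepass").decode("utf-16")
print(len(broken), len(fixed))
```

The repaired string is one character shorter, because the two surrogates have been joined into a single code point.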

Once the data has been deserialised to a dict you can insert the new key/value pair.

d['status'] = 1
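
Put together, the steps above might look like the sketch below. The function name parse_message and the shortened sample message are hypothetical; the sample only reproduces the two defects described in this answer, so treat it as a starting point rather than a general JSON repair tool.

```python
import json
import re


def parse_message(raw):
    """Repair the two known defects, then parse the result as JSON."""
    raw = raw.replace("\\'", "'")  # \' is not a valid JSON escape
    # Turn the raw newlines before a closing quote into escaped ones.
    raw = re.sub('\n\n"', '\\\\n\\\\n"', raw)
    return json.loads(raw)


# Hypothetical, trimmed-down message with the same defects as the question's:
# a backslash-escaped apostrophe and two raw newlines inside a string value.
message = '{"id": 1, "text": "I\\\'m planning to buy the car today \n\n", "counter_emoji": {}}'

d = parse_message(message)
d["status"] = 1

# json.dumps writes the newlines back out as proper \n escapes.
print(json.dumps(d))
```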
snakecharmerb
  • Hi @snakecharmerb, thanks for this! Which is the better way: doing this, or using data = ast.literal_eval(data)? – Adam Jun 27 '21 at 01:56
  • For me, using `ast.literal_eval` results in a `UnicodeEncodeError` because of the surrogate pairs (the emoji at the end of `text` and `author_name`), so I prefer my answer. You don't seem to get the error (perhaps you are executing the code on a Windows machine that uses UTF-16 natively?) so if `ast.literal_eval` works then that's as good as anything. In the end, both approaches are workarounds; what needs to be fixed is whatever upstream code is generating invalid JSON. – snakecharmerb Jun 27 '21 at 07:47