
I have some JSON data like:

{
  "status": "200",
  "msg": "",
  "data": {
    "time": "1515580011",
    "video_info": [
      {
          "announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
          "announcement_shop": "",

etc.

How do I grab the content "FOLLOW ME PLEASE"? I tried using

replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']

But now announcement is a string representing more JSON data. I can't continue indexing: announcement['content'] results in TypeError: string indices must be integers.

How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?

Karl Knechtel
aquatic7
  • I reopened the question after some fixes; it is not a proper duplicate, since it specifically deals with the question of interpreting nested JSON data *as a string that needs to be re-parsed*. Most of the questions that were originally closed as a duplicate of this, should be duplicates of https://stackoverflow.com/questions/12788217/how-to-extract-a-single-value-from-json-response instead; I will edit them. – Karl Knechtel Jul 02 '22 at 01:18

3 Answers


In a single line -

>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'

To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.

First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
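A minimal sketch of the first option. The dictionary below is a trimmed-down, hypothetical stand-in for the parsed response, not the full payload:

```python
import json

# Hypothetical, shortened stand-in for the already-parsed response.
data = {
    "status": "200",
    "data": {
        "video_info": [
            {"announcement": "{\"content\":\"FOLLOW ME PLEASE\"}", "announcement_shop": ""}
        ]
    },
}

# indent=4 puts each key on its own line, so the nesting becomes visible.
pretty = json.dumps(data, indent=4)
print(pretty)
```

In the printed result the announcement value still appears as one escaped string rather than as nested keys, which is the hint that it needs a second json.loads.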

{
    'data': {
        'time': '1515580011',
        'video_info': [{
            'announcement': (    # ***
            """{
                "announcement_id": "6",
                "name": "INS\\u8d26\\u53f7",
                "icon": "http:\\/\\/liveme.cms.ksmobile.net\\/live\\/announcement\\/2017-08-18_19:44:54\\/ins.png",
                "icon_new": "http:\\/\\/liveme.cms.ksmobile.net\\/live\\/announcement\\/2017-10-20_22:24:38\\/4.png",
                "videoid": "15154610218328614178",
                "content": "FOLLOW ME PLEASE",
                "x_coordinate": "0.22",
                "y_coordinate": "0.23"
            }"""),
            'announcement_shop': ''
        }]
    },
    'msg': '',
    'status': '200'
} 

*** Note that the value under the announcement key is itself a string containing more JSON data, which I've laid out on separate lines.

First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.

So, in summary, "descend" the ladder that is "data" using the following "rungs" -

  1. data, a dictionary
  2. video_info, a list of dicts
  3. announcement, a JSON string in the first dict of the list of dicts
  4. content, a key in the dict you get by parsing that JSON string.

First,

i = data['data']

Next,

j = i['video_info']

Next,

k = j[0] # since this is a list

If you only want the first element, this suffices. Otherwise, you'd need to iterate:

for k in j:
    ...

Next,

l = k['announcement']

Now, l is a string containing JSON data. Load it -

import json
m = json.loads(l)

Lastly,

content = m['content']

print(content)
FOLLOW ME PLEASE

This should hopefully serve as a guide should you have future queries of this nature.
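The rungs above, including the "iterate over the list" case, can be folded into one small helper. This is only a sketch: the function name announcement_contents and the second entry in the sample data are made up for illustration.

```python
import json


def announcement_contents(data):
    """Yield the 'content' value of every entry in data['data']['video_info']."""
    for entry in data["data"]["video_info"]:
        # 'announcement' holds a JSON document encoded as a string,
        # so it needs its own json.loads call.
        announcement = json.loads(entry["announcement"])
        yield announcement["content"]


# Hypothetical, shortened stand-in for the parsed response.
data = {
    "data": {
        "video_info": [
            {"announcement": json.dumps({"content": "FOLLOW ME PLEASE"})},
            {"announcement": json.dumps({"content": "SECOND ENTRY"})},
        ]
    }
}

contents = list(announcement_contents(data))
print(contents)
```

This version assumes every entry has a non-empty announcement; a missing key or an empty string would raise an exception.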

Martijn Pieters
cs95
  • How can we handle null cases here efficiently? Suppose no m['content'] is present in the JSON. – Sebastian Apr 18 '18 at 09:23
  • @JibinMathew I'd imagine something along the lines of a try-except KeyError or an if block should be more than enough. – cs95 Apr 20 '18 at 15:57
  • I created this method for helper. https://stackoverflow.com/questions/16129652/accessing-json-elements/66275899#66275899 – Elinaldo Monteiro Sep 18 '21 at 21:28

You have nested JSON data; the string associated with the 'announcement' key is itself another, separate, embedded JSON document.

You'll have to decode that string first:

import json

replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])

then handle the resulting dictionary from there.
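Part of handling the result is deciding what happens when the embedded document is absent. The sample data shows at least one such field that is an empty string ("announcement_shop": ""), and json.loads rejects an empty string. A hedged sketch follows; the helper name safe_content and its default parameter are hypothetical, and it assumes the field is a string whenever it is present:

```python
import json


def safe_content(entry, default=None):
    """Return the announcement's 'content', or default if it is unavailable."""
    raw = entry.get("announcement")
    if not raw:
        # Key missing, or an empty string such as "announcement_shop": "".
        return default
    try:
        announcement = json.loads(raw)
    except json.JSONDecodeError:
        # The string was not valid JSON.
        return default
    if not isinstance(announcement, dict):
        # Valid JSON, but not an object with keys.
        return default
    return announcement.get("content", default)


first = safe_content({"announcement": '{"content": "FOLLOW ME PLEASE"}'})
missing = safe_content({"announcement": ""})
print(first, missing)
```

Returning a default keeps the calling code simple; raising an exception instead may be the better choice if a missing announcement indicates a real problem.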

Martijn Pieters

The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.
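To see what decoding changes, it can help to compare the types before and after. A small sketch, using a shortened, hypothetical stand-in for one element of video_info:

```python
import json

# Hypothetical, shortened stand-in for one element of video_info.
replay_data = {"announcement": '{"content": "FOLLOW ME PLEASE", "x_coordinate": "0.22"}'}

before = replay_data["announcement"]
after = json.loads(before)

print(type(before).__name__)  # str: indexing with 'content' raises TypeError
print(type(after).__name__)   # dict: normal key access works
print(after["content"])
```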

Ignacio Vazquez-Abrams