
There are many similar paths in my full JSON data result, but I want to pull out specific instances. I thought the best way to do that would be to simply indicate the index from which I want to pull the data.

Here's a snippet of my JSON:

"data":[
      {
         "id":"xxxx",
         "account_type":"None",
         "description":"Lorem Ipsum",
         "score":xx,
         "jumpers":xxxxx,
         "friends":xxx,
         "global":xxxxxxx,
         "hidden":true,
         "location":"xxxx, xx",
         "name":"xxxxx",

Here's what I ran:

import requests

url = 'my_url'
access_token = 'my_token'

result = requests.get(url,headers={'Content-Type':'application/json','Authorization': 'Bearer {}'.format(access_token)})
json = result.json()
company = json['data'][0,9]  # this line raises the TypeError
print(company)

Here [0] is meant to select the first object under "data", and [9] the position of "name" within it. Clearly this isn't the right way to do it, given the output: TypeError: list indices must be integers or slices, not tuple

How do I access the first instance of 'name' by index? How does this process work for pulling other information by index?

Thanks in advance!

VRapport

3 Answers


In Python, a parsed JSON object becomes a dictionary and a JSON array becomes a list, so the easiest way to reach a value is by its dictionary key. Here is an example that accesses the "name" field in your JSON data:

import json

data = """
{
  "data": [
    {
      "id": "xxxx",
      "account_type": "None",
      "description": "Lorem Ipsum",
      "engagement_score": "xx",
      "jumpers": "xxxxx",
      "friends": "xxx",
      "global": "xxxxxxx",
      "hidden": true,
      "location": "xxxx, xx",
      "name": "your_name"
    }
  ]
}

"""

json_data = json.loads(data)

print(json_data['data'][0]['name'])

Your code raised an error because json['data'][0,9] passes the tuple (0, 9) as a single index, and a Python list cannot be indexed with a tuple. To index nested structures, chain one subscript per level, like this:

data = [
    [1, 2, 3],
    [4, 5, 6]
]

print(data[0][0])
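
As a small sketch of the difference between the failing tuple index and chained subscripts, the snippet below uses a made-up `records` list (hypothetical sample data, not the asker's real response) shaped like the list under "data". It catches the TypeError instead of crashing, then reads the same field by chaining a list position and a dictionary key.

```python
# Hypothetical sample data shaped like the list under "data".
records = [
    {"id": "a1", "name": "first"},
    {"id": "b2", "name": "second"},
]

# records[0, 1] is parsed as records[(0, 1)]: a single tuple index,
# which lists reject with a TypeError.
try:
    records[0, 1]
except TypeError as exc:
    error_message = str(exc)

# Chain one subscript per nesting level: list position first, then dict key.
first_name = records[0]["name"]

# The same pattern extends to every record in the list.
all_names = [record["name"] for record in records]

print(error_message)
print(first_name)
print(all_names)
```

The comprehension at the end is one way to pull the same field from every entry rather than only the first.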
Leonardo Sirino
  • I appreciate this answer @Leonardo, and it certainly does the job. I failed to mention I was looking to do this at scale over hundreds of different JSON results, so bringing in the dataset for each scrape would be unfeasible. I did come up with another way (answer below) using your insight for direction. Thanks! – VRapport Oct 10 '21 at 15:38

As of Python 3.7, dictionaries are guaranteed to preserve the insertion order of their keys (in CPython 3.6 this was an implementation detail). So it is also possible to access dictionary values by their position in insertion order (i.e. by integer index), provided every record lists its keys in the same order. The example below accesses the value of 'name' using its index of 3, which retrieves the correct value of 'xxxxx'.

Code:

import json

data_json='''[{"id":"xxxx","account_type":"None","attrib":true,"name":"xxxxx"}]'''

data_py = json.loads(data_json)
print("data_json",data_json)
print("data_py:",data_py)
print("name:",list(data_py[0].values())[3])

Output:

data_json [{"id":"xxxx","account_type":"None","attrib":true,"name":"xxxxx"}]
data_py: [{'id': 'xxxx', 'account_type': 'None', 'attrib': True, 'name': 'xxxxx'}]
name: xxxxx
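
Positional access depends on every record listing its keys in the same order, which a JSON API does not necessarily promise. The sketch below illustrates that caveat with two invented records; `value_at_position` is a hypothetical helper, not part of the json module.

```python
import json
from itertools import islice


def value_at_position(record, position):
    """Return the value at an insertion-order position without building a full list."""
    return next(islice(record.values(), position, None))


# Two invented records with the same keys in a different order.
a = json.loads('{"id": "xxxx", "account_type": "None", "attrib": true, "name": "xxxxx"}')
b = json.loads('{"id": "yyyy", "name": "yyyyy", "account_type": "None", "attrib": true}')

by_position_a = value_at_position(a, 3)  # position 3 is "name" in record a
by_position_b = value_at_position(b, 3)  # position 3 is "attrib" in record b
by_key_b = b["name"]                     # key access does not depend on order

print(by_position_a, by_position_b, by_key_b)
```

When the key is known, looking it up by name is usually the more robust choice; the positional form is mainly useful when only the layout is known.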
grov

Without bringing in the JSON result for each of the URLs I want to scrape, this is what I ran:

import json
import requests

url = 'my_url'
access_token = 'my_token'

result = requests.get(url, headers={'Content-Type': 'application/json', 'Authorization': 'Bearer {}'.format(access_token)})
# Parse the JSON body; this already returns Python dicts and lists
response = result.json()
# Serialize the parsed data back to a JSON string
json_dump = json.dumps(response)
# Parse the string again (this round trip reproduces response and is not strictly needed)
dict_json = json.loads(json_dump)
print(dict_json['data'][0]['name'])

Output: Fido

This solution may not be the most elegant one (I plead ignorance), but it was based on the guidance in the original answer and this one: https://stackoverflow.com/a/59980773/16880647
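
For running this over many URLs, one possible shape is sketched below. It assumes each response follows the same "data" list layout shown in the question; `first_name`, `collect_names`, and the `fake_responses` stand-in are hypothetical names introduced only for illustration, and the example URLs are placeholders. The fetching step is passed in as a callable so the sketch runs offline.

```python
def first_name(payload, default=None):
    """Return payload['data'][0]['name'], or default if any level is missing."""
    try:
        return payload["data"][0]["name"]
    except (KeyError, IndexError, TypeError):
        return default


def collect_names(urls, fetch_json):
    """Map each URL to its first 'name'; fetch_json is any callable returning parsed JSON."""
    return {url: first_name(fetch_json(url)) for url in urls}


# Stand-in for the network call (assumption: real code would call the API here).
fake_responses = {
    "https://example.invalid/a": {"data": [{"id": "1", "name": "Fido"}]},
    "https://example.invalid/b": {"data": []},
}

names = collect_names(fake_responses, fake_responses.get)
print(names)
```

With requests, `fetch_json` could be a small function that calls `requests.get(url, headers=...)` and returns `.json()` on the result, since that method already yields dictionaries and lists.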

VRapport