I have a program that looks like this:
import json
import requests
article_name = "BT Centre"
article_api_url = "https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles={}".format(article_name)
called_data = requests.get(article_api_url)
formatted_data = called_data.json()
print(formatted_data)
pages = formatted_data["query"]["pages"]
print(pages)
first_page = pages[0]["extract"]
print(first_page)
The first print statement, which prints the whole JSON response, outputs this:
{
    'batchcomplete': '',
    'query': {
        'pages': {
            '18107207': {
                'pageid': 18107207,
                'ns': 0,
                'title': 'BT Centre',
                'extract': "The BT Centre is the global headquarters and registered office of BT Group..."
            }
        }
    }
}
When I try to access the "extract" data with the first_page variable, it raises:
Traceback (most recent call last):
File "wiki_json_printer.py", line 15, in <module>
first_page = pages[0]["extract"]
KeyError: 0
The problem is, I can't set first_page to pages["18107207"]["extract"] because the page ID changes for every article.
Edit: Solution from Ann Zen works:
You can use a for loop to loop through the keys of the pages dictionary, and detect which one is the ID via the str.isdigit() method:
for key in pages:
    if key.isdigit():
        print(pages[key]["extract"])
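For anyone who only ever requests a single title, here is a minimal sketch of an alternative that skips the loop. Since pages is a dict keyed by page ID strings rather than a list, it takes the first value of the dict instead of indexing with 0. The helper name first_extract and the sample_response variable are my own, not part of the API or of the original answer, and the sample data is just the output shown above, so the snippet runs without a network call.

```python
# Sample response copied from the output shown in the question
# (the extract is truncated there, so it stays truncated here).
sample_response = {
    "batchcomplete": "",
    "query": {
        "pages": {
            "18107207": {
                "pageid": 18107207,
                "ns": 0,
                "title": "BT Centre",
                "extract": "The BT Centre is the global headquarters and registered office of BT Group...",
            }
        }
    },
}


def first_extract(response):
    """Hypothetical helper: return the extract of the first page in the response."""
    pages = response["query"]["pages"]
    # pages is a dict keyed by page ID strings, so pages[0] raises KeyError.
    # Take the first value of the dict instead, whatever its key is.
    first_page = next(iter(pages.values()))
    # .get returns None instead of raising if the entry has no "extract" key.
    return first_page.get("extract")


print(first_extract(sample_response))
```

In the original program this would be called as first_extract(formatted_data). It assumes the response contains at least one page entry; with an empty pages dict, next() would raise StopIteration.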