Getting Page Content via Json

Question

Link: http://creepypasta.wikia.com/api.php?%20action=query&prop=revisions&titles=Main_Page&rvprop=content&indexpageids=1&format=jsonfm

From the JSON response above I want to get the value of "*". I am using Python and have the request set up. If I already knew the page id, I could index straight into the page content. But I only know the page title, not its id, so I have run into a bit of trouble and need some help.

If I understand you well, it is not really related to MediaWiki. It's rather "how to get some subtree of JSON". If so, please remove confusing MediaWiki tags. — skalee, Dec 24 '13 at 02:14

score 0 · Accepted Answer · edited May 23 '17 at 12:23

That page isn't actually JSON; it is an HTML representation of the JSON. To request the JSON itself, remove the 'fm' at the end of the URL, so that it ends in format=json.

In this code, I will load the json into a dictionary using the urllib2 and json packages, and then access the * item.

import json
import urllib2  # Python 2 module; Python 3 moved urlopen to urllib.request

url = "http://creepypasta.wikia.com/api.php?%20action=query&prop=revisions&titles=Main_Page&rvprop=content&indexpageids=1&format=json"
j = json.load(urllib2.urlopen(url))
value = j['query']['pages']['22491']['revisions'][0]['*']  # '22491' is the page id

If you do not know which page id to look under, consider the method found here (replicated below), which searches the nested dictionaries for a given key:

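def _finditem(obj, key):
    # Depth-first search through nested dicts; returns None if key is absent.
    if key in obj:
        return obj[key]
    for k, v in obj.items():
        if isinstance(v, dict):
            item = _finditem(v, key)
            if item is not None:
                return item
    return None

# 'revisions' is a list; its first entry holds the page content under '*'.
value = _finditem(j, 'revisions')[0]['*']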
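
Since the request URL already includes indexpageids=1, the response should also carry a query.pageids list, so the id can be read from the response rather than searched for. This is only a minimal sketch, assuming that list is present. The function name and the sample dictionary are made up for illustration, and the sample stands in for the parsed response j:

```python
def page_content_by_pageids(response):
    """Return the content of the first page listed in query.pageids.

    Returns None when the page has no 'revisions' entry
    (for example, if the title does not exist).
    """
    query = response["query"]
    page_id = query["pageids"][0]      # ids are strings, e.g. "22491"
    page = query["pages"][page_id]
    revisions = page.get("revisions")
    if not revisions:
        return None
    return revisions[0]["*"]


# Hypothetical, trimmed-down response used only for illustration.
sample = {
    "query": {
        "pageids": ["22491"],
        "pages": {
            "22491": {
                "pageid": 22491,
                "title": "Main Page",
                "revisions": [{"*": "example wikitext"}],
            }
        },
    }
}

content = page_content_by_pageids(sample)
```

With the real response, calling page_content_by_pageids(j) would take the place of hard-coding '22491'.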

I added the fm to give users here a structured JSON view. My issue is that I do not know the id of the page whose content I am trying to get, only its name. — cataclysmicpinkiepie, Dec 24 '13 at 02:06
I've updated my answer to account for that. Please let me know if this helps. — nfazzio, Dec 24 '13 at 02:13
