Extract JSON data from a RESTful web service using Python

Question

We use Bugsplatsoftware.com to collect all our crashes. They have a RESTful web service that returns JSON data. I would like to get the data for individual crashes. The data is behind a login/password...

I tried the following but the results are not as expected.

import requests
from requests.auth import HTTPBasicAuth

args={'id':11111,'data':0}

response=requests.get("https://www.bugsplatsoftware.com/individualCrash",params=args,auth=HTTPBasicAuth("username","password"))

data=response.json()

response.headers returns the following:
{'content-length': '10549', 'connection': 'close', 'server': 'Apache/2.4.12 (Win32) OpenSSL/1.0.1l PHP/5.5.21', 'x-frame-options': 'SAMEORIGIN', 'x-pingback': 'https://www.bugsplatsoftware.com/xmlrpc.php', 'expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'x-xss-protection': '1; mode=block', 'date': 'Wed, 22 Apr 2015 17:43:37 GMT', 'content-encoding': 'gzip', 'link': '<https://www.bugsplatsoftware.com/?p=363>; rel=shortlink', 'vary': 'Accept-Encoding,User-Agent', 'x-content-type-options': 'nosniff', 'x-powered-by': 'PHP/5.5.21', 'content-type': 'text/html; charset=UTF-8', 'pragma': 'no-cache'}

What do I need to do to get the json data? Thanks in advance.

When I print response.url it shows https://www.bugsplatsoftware.com/login/ instead of https://www.bugsplatsoftware.com/individualCrash/?id=11111&data=0....

marmeladze, "bugsplatsoftware.com/individualCrash?id=11111&data=0" returns JSON data (at least in the browser), and this is what I need.

pygeek, when I call response.content, the data appears to be an HTML page.....
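The two symptoms described here (response.url ending in /login/ and an HTML body) can be checked in code before calling response.json(). Below is a minimal sketch that relies only on the `headers` and `history` attributes that a requests response exposes. The helper names `is_json_response` and `was_redirected` are hypothetical and are not part of requests.

```python
def is_json_response(response):
    """Return True if the response declares a JSON content type."""
    content_type = response.headers.get("content-type", "")
    # Drop parameters such as "; charset=UTF-8" and compare case-insensitively.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


def was_redirected(response):
    """Return True if at least one redirect was followed to get this response."""
    return len(response.history) > 0
```

With the headers shown in the question ('content-type': 'text/html; charset=UTF-8'), is_json_response would return False, which suggests the body is the login page rather than the crash data.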

Ivan, how do I specify the "content-type" to requests.get?

It seems I need to do something like what is described in "Using Python Requests: Sessions, Cookies, and POST". I tried the following:

import requests
s=requests.Session()
data={"login":'tester',"password":'testing'}
url="https://www.bugsplatsoftware.com/login"
r=s.post(url,data=data)

and I get an "unauthorized" error message.

Or, if I simply call

s.get(url) I get a "too many redirects" error.
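For reference, the session-based flow attempted above can be sketched as follows. This is only an outline under stated assumptions: it assumes the site accepts a plain form POST for login, and the form field names ("login", "password") are taken from the attempt in this question, not from any BugSplat documentation. The function name `fetch_json_after_login` is hypothetical.

```python
def fetch_json_after_login(session, login_url, credentials, data_url, params):
    """Log in with a form POST, then request JSON using the same session.

    `session` is expected to behave like requests.Session, so cookies set by
    the login response are sent automatically with the follow-up GET.
    """
    login_response = session.post(login_url, data=credentials)
    login_response.raise_for_status()  # stop early on 401/403 and similar

    response = session.get(data_url, params=params)
    response.raise_for_status()

    # If we ended up back on the login page, the login did not take effect.
    if response.url.rstrip("/").endswith("/login"):
        raise RuntimeError("still redirected to the login page: " + response.url)

    return response.json()


# Possible usage (assumed field names; requires the requests package):
#   import requests
#   with requests.Session() as s:
#       crash = fetch_json_after_login(
#           s,
#           "https://www.bugsplatsoftware.com/login",
#           {"login": "username", "password": "password"},
#           "https://www.bugsplatsoftware.com/individualCrash",
#           {"id": 11111, "data": 0},
#       )
```

If the login form uses different field names or a hidden token, the POST data would need to match what the browser actually submits.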

Have you tried changing the content-type from text/html to text/json? — Ivan, Apr 22 '15 at 18:05

score 0 · Answer 1 · answered Apr 22 '15 at 19:56

This is pretty straightforward, actually. JSON is close in structure to Python lists and dictionaries. So all you need to do is convert the JSON string you receive from the web service call into the corresponding Python objects; then you can use a list comprehension on the result to extract anything you want.

Here is some sample code I created to call a simple web service of mine.

import urllib, json, collections    

def getURLasString(url):
    s = urllib.urlopen(url).read()
    return s

def convertJSONStringToSequence(source):
    j = json.JSONDecoder(object_pairs_hook=collections.OrderedDict).decode(source)
    return j

def getURLasJSONSequence(url):
    s = getURLasString(url)
    return convertJSONStringToSequence(s)

url = "http://127.0.0.1:8081/lookupnames" # my local web service
s = getURLasString(url)
print s
j = getURLasJSONSequence(url)
print j

sampleJSON = [{"LastName": "DaVinci", "FirstName": "Leonardo"}, {"LastName": "Newton", "FirstName": "Isaac"}]

filteredList = [e["FirstName"] for e in sampleJSON if e["LastName"].startswith("D")]
print filteredList

Much of this is just helper functions I created to make things easier, but the main point is this: you need to import the json package. JSONDecoder then converts the string you get back from the web service call into native Python objects (typically a list of dictionaries). The helper functions either convert a string to such an object or call a URL directly and return the decoded result.

The program I wrote would give the following output:

[{"LastName": "DaVinci", "FirstName": "Leonardo"}]
[OrderedDict([(u'LastName', u'DaVinci'), (u'FirstName', u'Leonardo')])]
['Leonardo']
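The code in this answer is written for Python 2 (urllib.urlopen and the print statement). For readers on Python 3, the decoding and filtering step could look roughly like the sketch below. It uses json.loads with the same object_pairs_hook and works on a literal JSON string instead of a live web service, so the sample data and the name `decode_json` are illustrative assumptions.

```python
import json
from collections import OrderedDict


def decode_json(text):
    """Decode a JSON string, keeping object keys in document order."""
    return json.loads(text, object_pairs_hook=OrderedDict)


# Sample payload standing in for the body returned by a web service.
sample = (
    '[{"LastName": "DaVinci", "FirstName": "Leonardo"}, '
    '{"LastName": "Newton", "FirstName": "Isaac"}]'
)
people = decode_json(sample)

# Keep the first names of everyone whose last name starts with "D".
first_names = [p["FirstName"] for p in people if p["LastName"].startswith("D")]
print(first_names)  # ['Leonardo']
```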

Hello Joseph, thanks for your comments. I believe my problem is that requests.get is not returning JSON data, and I think that is because I am probably not passing the login/password correctly....When I access the URL from the browser everything works as expected, but I am logged in to the site..... — ybi, Apr 22 '15 at 20:36
It seems like requests.get is returning the source code of the landing page for https://bugsplatsoftware.com — ybi, Apr 22 '15 at 20:37
When I print response.url it shows https://www.bugsplatsoftware.com/login/ instead of https://www.bugsplatsoftware.com/individualCrash/?id=11111&data=0....So looks like I am unable to unlock the site.... — ybi, Apr 22 '15 at 20:51
