98

I am trying to GET a URL using Python and the response is JSON. However, when I run

import urllib2
response = urllib2.urlopen('https://api.instagram.com/v1/tags/pizza/media/XXXXXX')
html=response.read()
print html

The html is of type str, but I am expecting JSON. Is there any way I can capture the response as JSON or a Python dictionary instead of a str?

Deepak B
  • 2,245
  • 5
  • 26
  • 28
  • 1
    Is `response.read()` returning a valid JSON string? – Martijn Pieters Dec 17 '12 at 20:46
  • Yes, it's a valid JSON string; it's just of type str and not dict – Deepak B Dec 17 '12 at 20:47
  • If it's a JSON representation of a string, rather than a JSON representation of an object (dict), you can't force the server to return you different data; you probably need to make a different request. If it's just that you don't know how to parse a JSON representation into the equivalent Python object, Martijn Pieters' answer is correct. – abarnert Dec 17 '12 at 20:50

10 Answers

191

If the URL is returning valid JSON-encoded data, use the json library to decode that:

import urllib2
import json

response = urllib2.urlopen('https://api.instagram.com/v1/tags/pizza/media/XXXXXX')
data = json.load(response)   
print data
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • 1
    @ManuelSchneid3r: The answer here is for Python 2, where reading from `response` gives you bytestrings, and `json.load()` expects to read a bytestring. JSON *must* be encoded using a UTF codec, and the above works for UTF-8, UTF-16 and UTF-32, provided a BOM codepoint is included for the latter two codecs. The answer you link to presumes UTF-8 was used, which is *usually* correct because that's the default. As of Python 3.6, the `json` library auto-decodes bytestrings with JSON data provided a UTF encoding is used. – Martijn Pieters Jun 05 '18 at 09:41
  • @ManuelSchneid3r: I'd otherwise recommend you use the `requests` library, which also automatically detects the correct UTF codec to use in cases where the BOM is missing and no character set was specified in the response header. Just use the `response.json()` method. – Martijn Pieters Jun 05 '18 at 09:42
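As a rough illustration of the comment above, here is a minimal Python 3.6+ sketch showing `json.loads()` accepting bytes in each of the UTF encodings mentioned. The `payload` dictionary is made-up sample data, and no network request is made.

```python
import json

# Hypothetical sample data standing in for an API response body.
payload = {"tag": "pizza", "count": 3}
text = json.dumps(payload)

decoded = {}
for codec in ("utf-8", "utf-16", "utf-32"):
    raw = text.encode(codec)          # bytes; utf-16 and utf-32 include a BOM
    decoded[codec] = json.loads(raw)  # Python 3.6+ detects the UTF encoding

print(decoded)
```

Each entry in `decoded` should be an ordinary dict equal to `payload`, whichever of the three codecs produced the bytes.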
43
import json
import urllib.request

url = 'http://example.com/file.json'
r = urllib.request.urlopen(url)
data = json.loads(r.read().decode(r.info().get_param('charset') or 'utf-8'))
print(data)

References: the urllib documentation for Python 3.4, and HTTPMessage, the type returned by r.info().
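The charset fallback in this answer can be tried without a network connection. The sketch below is only an illustration: `parse_json_body` is a hypothetical helper name, and an `email.message.Message` stands in for the headers object, since `HTTPMessage` is a subclass of it and inherits `get_param()`.

```python
import json
from email.message import Message

def parse_json_body(raw, headers):
    """Decode raw bytes using the declared charset, falling back to UTF-8."""
    charset = headers.get_param('charset') or 'utf-8'
    return json.loads(raw.decode(charset))

# Simulated response that declares a non-UTF-8 charset (made-up data).
headers = Message()
headers['Content-Type'] = 'application/json; charset=iso-8859-1'
body = json.dumps({"name": "café"}, ensure_ascii=False).encode('iso-8859-1')
declared = parse_json_body(body, headers)

# Simulated response with no Content-Type header: falls back to UTF-8.
fallback = parse_json_body(b'{"a": 1}', Message())
```

Decoding `body` as UTF-8 would fail here, which is the case the `get_param('charset')` lookup is meant to cover.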

Azat Ibrakov
  • 9,998
  • 9
  • 38
  • 50
SanalBathery
  • 628
  • 5
  • 10
5
"""
Return JSON to webpage
Adding to wonderful answer by @Sanal
For Django 3.4
Adding a working url that returns a json (Source: http://www.jsontest.com/#echo)
"""

import json
import urllib.request

from django.http import HttpResponse

url = 'http://echo.jsontest.com/insert-key-here/insert-value-here/key/value'

def echo_view(request):  # hypothetical view name; the original snippet omitted the enclosing view
    respons = urllib.request.urlopen(url)
    data = json.loads(respons.read().decode(respons.info().get_param('charset') or 'utf-8'))
    return HttpResponse(json.dumps(data), content_type="application/json")
Raccoon
  • 400
  • 4
  • 13
  • In case of Django 1.7+, you could use JsonResponse directly, as follows: `from django.http import JsonResponse` and then `return JsonResponse({'key':'value'})` – Raccoon Feb 23 '18 at 13:34
  • 1
    I was doing json.dump() instead of json.dumps(), feeling dumb, Thanks for the save! – Hashir Baig Jul 06 '18 at 09:16
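The `json.dump()` versus `json.dumps()` mix-up mentioned in the last comment comes down to where the output goes. A small standard-library sketch, using made-up data and an in-memory buffer in place of a real file:

```python
import io
import json

data = {"key": "value"}

# dumps() returns the JSON text as a str.
as_text = json.dumps(data)

# dump() writes the JSON text to a file-like object and returns None.
buffer = io.StringIO()
result = json.dump(data, buffer)
written = buffer.getvalue()
```

Passing the return value of `json.dump()` to a response would therefore send `None`, whereas `json.dumps()` provides the string that was intended.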
4

Be careful about validation and so on, but the straightforward solution is this:

import json
# response is the file-like object returned by urlopen() in the question
the_dict = json.load(response)
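One possible reading of the validation caveat is to guard against bodies that are not JSON at all, or that are JSON but not an object. The sketch below is an assumption about what such a check might look like; `parse_json_object` is a hypothetical helper, not part of the `json` module.

```python
import json

def parse_json_object(raw):
    """Return the decoded dict, or None if raw is not a JSON object."""
    try:
        value = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return value if isinstance(value, dict) else None

good = parse_json_object('{"id": 1}')
not_json = parse_json_object('<html>error page</html>')
not_object = parse_json_object('"just a string"')
```

Returning `None` is only one design choice; re-raising with a clearer message may suit callers that need to know why parsing failed.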
MostafaR
  • 3,547
  • 1
  • 17
  • 24
2
import json
import urllib2
resource_url = 'http://localhost:8080/service/'
response = json.loads(urllib2.urlopen(resource_url).read())
Jossef Harush Kadouri
  • 32,361
  • 10
  • 130
  • 129
1

Python 3 standard library one-liner (the last line, after its imports):

from json import load
from urllib.request import urlopen

url = 'https://jsonplaceholder.typicode.com/todos/1'
data = load(urlopen(url))
Adam
  • 2,873
  • 3
  • 18
  • 17
1

You can also get JSON by using the requests library, as below:

import requests

r = requests.get('http://yoursite.com/your-json-pfile.json')
json_response = r.json()
Haritsinh Gohil
  • 5,818
  • 48
  • 50
0

Though I guess this has already been answered, I would like to add my little bit:

import json
import urllib2
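class Website(object):
    def __init__(self, name):
        self.name = name

    def dump(self):
        # Open the URL and keep the file-like response object
        self.data = urllib2.urlopen(self.name)
        return self.data

    def convJSON(self):
        # json.load() reads directly from the file-like response
        data = json.load(self.dump())
        print data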

domain = Website("https://example.com")
domain.convJSON()

Note: the object passed to json.load() must support .read(), so passing urllib2.urlopen(self.name).read() to json.load() would not work. The domain passed in must include the protocol, such as http.
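The distinction in the note can be checked offline. This sketch uses Python 3 and an `io.BytesIO` object as a stand-in for the response returned by `urlopen()`; the sample bytes are made up.

```python
import io
import json

raw = b'{"ok": true}'

# json.load() expects a file-like object with a .read() method.
from_stream = json.load(io.BytesIO(raw))

# json.loads() expects the content itself (str, or bytes on Python 3.6+).
from_string = json.loads(raw)

# Handing the already-read content to json.load() fails.
try:
    json.load(raw)
    load_on_bytes_failed = False
except AttributeError:
    load_on_bytes_failed = True
```

So the result of `.read()` belongs with `json.loads()`, while the response object itself belongs with `json.load()`.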

0

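This is another, simpler solution to your question:

pd.read_json(json_data)

where json_data is the str output of the following code (pd is pandas, imported with `import pandas as pd`; urlopen is assumed to come from Python 3's urllib.request):

response = urlopen("https://data.nasa.gov/resource/y77d-th95.json")
json_data = response.read().decode('utf-8', 'replace')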
Himanshu Aggarwal
  • 163
  • 1
  • 1
  • 5
-1

None of the examples provided here worked for me. They were either for Python 2 (urllib2), or those for Python 3 returned the error "ImportError: No module named request". I googled the error message and it apparently requires me to install a module, which is obviously unacceptable for such a simple task.

This code worked for me:

import json,urllib
data = urllib.urlopen("https://api.github.com/users?since=0").read()
d = json.loads(data)
print (d)
Uxbridge
  • 429
  • 4
  • 7
  • 2
    You are evidently using Python 2. In Python 3, there is no `urllib.urlopen`; `urlopen` is in the `urllib.request` module. – Nick Matteo Mar 13 '17 at 19:47
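For code that has to run on both versions, one common pattern is to try the Python 3 import first and fall back to the Python 2 one. The sketch below is only an illustration: `fetch_json` and `fake_opener` are hypothetical names, UTF-8 is assumed for the body, and the usage example swaps in a stand-in opener so that no real request is made.

```python
import io
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen         # Python 2

def fetch_json(url, opener=urlopen):
    """Open url and decode the body as JSON (assumes a UTF-8 body)."""
    response = opener(url)
    return json.loads(response.read().decode('utf-8'))

# Offline usage example: a stand-in opener returning made-up bytes.
def fake_opener(url):
    return io.BytesIO(b'{"login": "example"}')

user = fetch_json('https://api.github.com/users?since=0', opener=fake_opener)
```

Calling `fetch_json(url)` without the `opener` argument would use the real `urlopen` and perform the network request.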