
I'm trying to scrape data from Verizon's buyback pricing site. I found the source of the data while going through the "Net" requests in my browser. The response is JSON, but nothing I've tried lets me download it: https://www.verizonwireless.com/vzw/browse/tradein/ajax/deviceSearch.jsp?act=models&car=Verizon&man=Apple&siz=large

I can't remember everything I've tried, but here are the issues I'm having. Also, I'm not sure how to insert multiple code blocks.

import json, urllib.request, requests

# url is the address given above

# Attempt 1: urllib
res = urllib.request.urlopen(url)
data = json.loads(res)
# TypeError: the JSON object must be str, not 'bytes'

# Attempt 2: decode the response through a UTF-8 reader
import codecs
reader = codecs.getreader('utf-8')
obj = json.load(reader(res))
# ValueError: Expecting value: line 1 column 1 (char 0)

# The same ValueError happens with other similar attempts, such as:
res = requests.get(url)
res.json()  # same error occurs

At this point I've spent many hours researching this and can't find a solution. I assume either the response isn't formatted normally or I'm missing something obvious. I can see the JSON requests and their structure in my browser's web developer tools.

Does anybody have any ideas or solutions for this? Please let me know if you have questions.

user3658033
It's because Verizon is sending back an HTML page with JSON-looking stuff buried inside HTML tags, not a JSON-formatted string (which is what `.json()` takes as input). [This answer](http://stackoverflow.com/questions/13323976/how-to-extract-a-json-object-that-was-defined-in-a-html-page-javascript-block-us) should help you. – Akshat Mahajan Mar 31 '16 at 23:10

1 Answer


You need to send a User-Agent HTTP header field. Try this program:

import requests

url = 'https://www.verizonwireless.com/vzw/browse/tradein/ajax/deviceSearch.jsp?act=models&car=Verizon&man=Apple&siz=large'
# Put your own contact info in the next line
headers = {'User-Agent': 'MyBot/0.1 (+user@example.com)'}
r = requests.get(url, headers=headers)
print(r.json()['models'][0]['name'])

Result:

iPhone 6S
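
If you would rather stay with urllib, as in the question, the same idea can be sketched roughly like this. It is only a sketch under a few assumptions: the helper names (build_request, parse_json_bytes, fetch_models) are made up for illustration, the body is assumed to be UTF-8, and the server is assumed to behave the same way it does for requests. Decoding the bytes before calling json.loads avoids the TypeError from the question, and the wrapped ValueError shows the start of the body so that an HTML page is easy to spot.

```python
import json
import urllib.request

URL = ('https://www.verizonwireless.com/vzw/browse/tradein/ajax/'
       'deviceSearch.jsp?act=models&car=Verizon&man=Apple&siz=large')


def build_request(url, user_agent='MyBot/0.1 (+user@example.com)'):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def parse_json_bytes(raw, encoding='utf-8'):
    """Decode the raw body to str (UTF-8 is an assumption), then parse it.

    If the body is not JSON (for example an HTML page), raise a ValueError
    that shows the start of the body instead of the bare
    'Expecting value: line 1 column 1 (char 0)'.
    """
    text = raw.decode(encoding)
    try:
        return json.loads(text)
    except ValueError as exc:
        raise ValueError(
            'Body is not JSON; it starts with: %r' % text[:80]) from exc


def fetch_models(url=URL):
    """Hypothetical helper: fetch the URL and return the parsed JSON."""
    with urllib.request.urlopen(build_request(url)) as res:
        return parse_json_bytes(res.read())
```

Calling fetch_models()['models'][0]['name'] should then correspond to the print line in the program above, provided the endpoint still responds the same way.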
Robᵩ