I want to crawl a website (Google Play Store reviews) using Python 3.5, and I was advised to call the AJAX endpoint directly:

import json
import requests

url = "https://play.google.com/store/getreviews?authuser=0"
param = {'reviewType': '0', 
         'pageNum': '1', 
         'id':'com.venticake.retrica',
         'reviewSortOrder':'4',
         'xhr':'1',
         'token':'ZLqR3TmB64y6koyq8uj1tqqiQ4k:14191636750027',
         'hl':'ko'}

r = requests.post(url, data=param)

d = json.loads(r.text) 

When I do this, the response is:

')]}\'\n\n[["ecr",1," \\u003cdiv class\\u003d\\"single-review\\" tabindex\\u003d\\"0\\"\\u003e   \\u003cspan\\u003e \\u003cspan

How do I convert it to JSON or other structured data?

Thank you for your time.

Byungjun Lee

1 Answer

You can use the json() method to parse a requests response, something like this:

import requests

url = "https://play.google.com/store/getreviews?authuser=0"

param = {'reviewType': '0', 
         'pageNum': '1', 
         'id':'com.venticake.retrica',
         'reviewSortOrder':'4',
         'xhr':'1',
         'token':'ZLqR3TmB64y6koyq8uj1tqqiQ4k:14191636750027',
         'hl':'ko'}

response = requests.post(url, data=param)
x = response.json()

This parses the response body as JSON.

Update:

I tested the script and found the following problems with the response:

  1. The response sent by the server is not valid JSON. For example, it has u")]}'\n\n" at the beginning. To verify this, print response.text[:6].
  2. The response also contains some Unicode characters that could not be encoded as ASCII, even if you specify 'charset': 'utf-8' in data.

I think these are the reasons you are unable to load the response as JSON data.
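One way to work around the first problem is to strip the leading ")]}'" marker before parsing. The following is only a minimal sketch: the helper name parse_guarded_json and the sample string are hypothetical, the sample is shaped like the truncated response shown in the question rather than taken from a real reply, and the layout of the parsed list (HTML fragment at index 2) is an assumption based on that excerpt.

```python
import json

# Marker observed at the start of the response in the question.
GUARD_PREFIX = ")]}'"


def parse_guarded_json(text):
    """Strip the leading guard marker, if present, and parse the rest as JSON."""
    text = text.lstrip()
    if text.startswith(GUARD_PREFIX):
        text = text[len(GUARD_PREFIX):]
    # json.loads ignores the newlines that follow the marker.
    return json.loads(text)


# Hypothetical sample mimicking the shape of the response (not real data).
sample = (
    ')]}\'\n\n[["ecr",1,'
    '"\\u003cdiv class\\u003d\\"single-review\\"\\u003eNice\\u003c/div\\u003e"]]'
)

data = parse_guarded_json(sample)
# Assumption: the third element of the first entry holds the HTML fragment.
html_fragment = data[0][2]
print(html_fragment)
```

With a real response you would pass response.text to the helper. The \u003c style escapes are decoded by json.loads itself, so the result is ordinary HTML that still has to be handed to an HTML parser to extract the individual reviews.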

Mani
  • I ran r.json() and got this message: Traceback (most recent call last) return complexjson.loads(self.text, **kwargs) File "C:\Anaconda3\envs\py35\lib\json\__init__.py", line 319, in loads return _default_decoder.decode(s) File "C:\Anaconda3\envs\py35\lib\json\decoder.py", line 339, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "C:\Anaconda3\envs\py35\lib\json\decoder.py", line 357, in raw_decode raise JSONDecodeError("Expecting value", s, err.value) from None JSONDecodeError: Expecting value – Byungjun Lee Apr 08 '17 at 08:09
  • Then try response.content and load it using json.loads(response.content). response.content will be str. You can use loads to load the str response. – Mani Apr 08 '17 at 08:20
  • I can see some Unicode characters in the response you posted. I suspect they may also be causing the failure when loading the response with json.loads(). – Mani Apr 08 '17 at 17:45
  • See http://stackoverflow.com/q/43340424/6663095 for loading JSON from a string containing Unicode characters. – Mani Apr 11 '17 at 17:09
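On the Unicode point raised in the comments, a small sketch may help. On Python 3, response.content is bytes, and json.loads only accepts bytes from Python 3.6 onward, so on Python 3.5 the body has to be decoded to str first. The helper name loads_utf8 and the payload below are hypothetical, and the sketch assumes the server sends UTF-8.

```python
import json


def loads_utf8(raw_bytes):
    """Decode raw bytes as UTF-8 (an assumption about the server), then parse as JSON."""
    return json.loads(raw_bytes.decode("utf-8"))


# Hypothetical payload with Korean text, standing in for response.content.
raw = '[["ecr",1,"좋아요 \\u003cb\\u003e"]]'.encode("utf-8")

parsed = loads_utf8(raw)
print(parsed[0][2])
```

Non-ASCII characters and \uXXXX escapes both parse without trouble once the input is a str, which suggests the decoding error in the traceback comes from the leading marker rather than from the Unicode content.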