I'm calling the Google API directly instead of using their Python library, because I'm behind an inconvenient corporate proxy that breaks their library, so I have to do it all myself.

This works fine:

requests.get('https://www.googleapis.com/webmasters/v3/sites', params = 
{'access_token':'my_access_token_here'})

This, on the other hand, doesn't:

site = 'https://www.my_website_from_the_above_function.com'
site = urllib.parse.quote_plus(site)


def get_website_info():
    url = 'https://www.googleapis.com/webmasters/v3/sites/{}/searchAnalytics/query'.format(site)
    params = { 
    "endDate": "2017-12-10",
    "startDate": "2017-12-01",
    "access_token": my_access_token
    }

    r = requests.post(url, params = params)

    return r


x = get_website_info().json()

All I get is this error code:

{'error': {'code': 500,
  'errors': [{'domain': 'global',
    'message': 'Backend Error',
    'reason': 'backendError'}],
  'message': 'Backend Error'}}

Even with the recommended 'Exponential backoff'.
Using Google's API Explorer seems to work fine:

Google screenshot

Additionally, this also seems to give similar errors:

r = requests.post(url, params = auth_params, data = json.dumps(params))

and finally:

r = requests.post(url, params = auth_params, data = params)

just gives

{'error': {'code': 400,
  'errors': [{'domain': 'global',
    'message': 'This API does not support parsing form-encoded input.',
    'reason': 'parseError'}],
  'message': 'This API does not support parsing form-encoded input.'}}
2yan
  • For your `500` error... no idea, really... For the second one, it looks like (maybe) you have to specify that the request's `Content-Type` header is `application/json`? Check this other question: https://stackoverflow.com/q/9733638/289011 – Savir Dec 28 '17 at 19:00
  • 1
    Got it, you set me on the right track! I had to pass in `headers = {'Content-type': 'application/json', 'Authorization' : 'Bearer %s' % access_token}` and `json.dumps(data)` – 2yan Dec 28 '17 at 19:12

2 Answers

So, you can think of the contents of a request as just text, right? And not just any text, but text restricted to a relatively limited set of characters.

With that in mind, it all boils down to how to serialize "complex" data structures into text. I recently answered another question about files that covers a kind of similar idea.

If you have a bunch of key=value parameters, you could use a simple "trick":

  1. Control names and values are escaped. Space characters are replaced by +, and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by %HH, a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., %0D%0A).
  2. The control names/values are listed in the order they appear in the document. The name is separated from the value by = and name/value pairs are separated from each other by &.

So this data:

{a="foo", b="bar baz"}

Could be serialized into text following the specification above like: a=foo&b=bar+baz
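
As a quick check of that example, here is a minimal sketch using only Python's standard library. urlencode applies the same escaping rules (spaces become +, pairs are joined with &), and parse_qs reverses it; the dict is just the example data from above.

```python
from urllib.parse import urlencode, parse_qs

# The example data from above, as a Python dict
data = {"a": "foo", "b": "bar baz"}

# Serialize to the form-urlencoded convention
encoded = urlencode(data)
print(encoded)  # a=foo&b=bar+baz

# Decode it again; every value comes back as a list of strings
decoded = parse_qs(encoded)
print(decoded)  # {'a': ['foo'], 'b': ['bar baz']}
```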

That serialization format is identified as application/x-www-form-urlencoded in the request's Content-Type header. That header tells the receiving server something like: "Hey! The data in my body is serialized following the convention that separates keys from values with the = symbol, splits key/value pairs with &, replaces spaces with +... and so on."

(!) Very important: That is the format the requests module uses on a POST when you pass a dict through data=, unless told otherwise.

Another format, which allows more flexibility (such as preserving basic types or nesting structures), is JSON. That is the format the Google server "wants". To tell a server that the "text" in the request's body follows the JSON standard (or convention), the Content-Type header must be set to 'application/json'.

It appears that, upon receiving a request, your Google server checked the Content-Type header and, if it wasn't JSON, returned a 400 error to say "Oh, I don't understand this format... I want JSON!"

That's why you have to specify the JSON header.

There's an example comparing both formats here.

You can also see it more clearly because recent versions of the requests module can do the JSON serialization for you. Since the JSON format has become so common, you can pass a Python structure (a dict, for instance) through the json= argument, and the module will do the json.dumps and set the header for you. This also lets you "introspect" a little what the body will look like (to see the differences maybe more clearly).

Check this out:

from requests import Request

data = {
    'a': 'foo-1 baz',
    'b': 5,
    'c': [1, 2, 3],
    'd': '6'
}

req = Request('POST', 'http://foo.bar', data=data)
prepped = req.prepare()
print("Normal headers: %s" % prepped.headers)
print("Normal body: %s" % prepped.body)

req = Request('POST', 'http://foo.bar', json=data)
prepped = req.prepare()
print("Json headers: %s" % prepped.headers)
print("Json body: %s" % prepped.body)

Outputs:

Normal headers: {'Content-Length': '31', 'Content-Type': 'application/x-www-form-urlencoded'}
Normal body: d=6&a=foo-1+baz&c=1&c=2&c=3&b=5
Json headers: {'Content-Length': '52', 'Content-Type': 'application/json'}
Json body: b'{"d": "6", "a": "foo-1 baz", "c": [1, 2, 3], "b": 5}'

See the difference? JSON can distinguish the strings "foo-1 baz" and "6" (quoted with ") from the integer 5, while x-www-form-urlencoded can't (see how the form encoding doesn't differentiate between the integer 5 and the string 6). Same with the list: thanks to the [ character, the server can tell that c is a list (and of integers).
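
To see the same thing from the receiving side, here is a rough sketch that decodes the two bodies printed above with the standard library. This only illustrates the two formats; it is not how Google's server is actually implemented.

```python
import json
from urllib.parse import parse_qs

# The two bodies from the output above
form_body = "d=6&a=foo-1+baz&c=1&c=2&c=3&b=5"
json_body = '{"d": "6", "a": "foo-1 baz", "c": [1, 2, 3], "b": 5}'

# Form decoding: every value comes back as a list of strings
form_data = parse_qs(form_body)
print(form_data)

# JSON decoding: strings, integers and lists keep their types
json_data = json.loads(json_body)
print(json_data)

print(type(form_data["b"][0]))  # <class 'str'>
print(type(json_data["b"]))     # <class 'int'>
```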

Savir

I got it! The solution was to pass in header information with:

headers = {'Content-type': 'application/json',
           'Authorization' : 'Bearer %s' % access_token}

and make sure the JSON data was dumped to a string:

r = requests.post(url,data = json.dumps(params),  headers = headers)
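
For what it's worth, recent versions of requests also accept a json= argument, which should do the json.dumps and set the Content-Type header in one step. Below is a minimal sketch that only prepares the request (nothing is sent), so the headers and body can be inspected. The site URL and token are placeholders, and it assumes the access token is sent only in the Authorization header rather than in the body.

```python
import json
from urllib.parse import quote_plus

from requests import Request

# Placeholders, not real values
access_token = "my_access_token_here"
site = quote_plus("https://www.example.com")

url = "https://www.googleapis.com/webmasters/v3/sites/{}/searchAnalytics/query".format(site)
body = {"startDate": "2017-12-01", "endDate": "2017-12-10"}
headers = {"Authorization": "Bearer %s" % access_token}

# json= serializes the dict and sets Content-Type: application/json
req = Request("POST", url, json=body, headers=headers)
prepped = req.prepare()

print(prepped.headers["Content-Type"])
print(prepped.body)
```

To actually send it, the same arguments can be passed to requests.post(url, json=body, headers=headers).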

If someone could explain why this works, that would be great.

2yan
  • 1
    You can think of the contents of your request as "text", right? There's not much magic that can be done: it is just text. Then the `Content-type` header tells the server *how* the text is formatted. By default, the `requests` module tells the server that it is formatted as a `form`, and it (the server) doesn't like that: it wants JSON, but the header says it's formatted as a `form`. The server doesn't look any further and rejects it. There's a bit more info here: https://www.smtpeter.com/en/documentation/json-vs-post – Savir Dec 28 '17 at 19:38
  • That makes a lot of sense, if you were to write up an answer, I'd be more than happy to mark that as the correct answer. – 2yan Jan 02 '18 at 16:26
  • Well, that is very kind of you. I did it **:-)** I hope the explanation is clear. Dunno, maybe not (maybe it's too long, and beating around the bush) If you have more questions about this, add a comment so it pops out in my S.O. *inbox* and I'll try to answer (to the extent of my possibilities, which are not... are not that many) **:-/** – Savir Jan 02 '18 at 19:28