
Banging my head on a problem. I will caveat in advance that this is not reproducible since I cannot share my endpoint. Also, I work as a data scientist, so my knowledge of web technologies is limited.

from urllib.request import Request, urlopen

url = "https://www.some_endpoint.com/"
req = Request(
    url, headers={"API-TOKEN": "some_token"})
json_string = "{"object": "XYZ".....}"

response = urlopen(req, json_string.encode("utf-8"))

I am getting unusual behavior on the urlopen. When my JSON is below 65536 bytes, as shown by evaluating len(json_string.encode('utf-8')), this urlopen call works fine. When it is over that limit, I get an HTTP 500 error.

Is this purely a server-side limit on request size? What is unusual is that when the same large data is passed to the endpoint through a GUI, it works fine. Or is there something I can do to chunk my data into sub-64 KiB pieces for the urlopen? Are there industry standards for handling this?

AZhao

1 Answer


An HTTP 500 error indicates an "internal server error". In theory, this means that the problem is with the server, not with your code.

In practice, an HTTP 500 error can mean almost anything, because many servers will use HTTP 500 as the default error code when a more specific error code is not provided by the programmer. Unfortunately, this means you are reduced to making guesses at how somebody else's code works.

Here are some possible approaches:

  • It's possible that the server has a maximum request size of 64 KiB. You can reduce your request size by using more compact JSON (remove spaces between delimiters) or by using Content-Encoding: gzip.

    import gzip
    import json
    from urllib.request import Request, urlopen

    # Remove whitespace from the JSON
    json_string = json.dumps(
        json.loads(json_string),
        separators=(',', ':'))
    # Compress the request body with gzip
    json_data = gzip.compress(
        json_string.encode('utf-8'))

    req = Request(
        url, headers={"API-TOKEN": "some_token",
                      "Content-Encoding": "gzip"})
    response = urlopen(req, json_data)
    
  • It's possible that there is some way of splitting or chunking the request into multiple, smaller requests (see the batching sketch after this list). This would require knowledge of the exact API you are using.

  • It's possible that there's a bug in the server, or in a proxy somewhere in the chain, that prevents you from sending the request as written. You could try Transfer-Encoding: chunked if Content-Length does not work for bodies over 64 KiB (see the sketch below). It's also possible the server expects Expect: 100-continue, which urllib does not support.
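
On the Transfer-Encoding: chunked point: with Python 3.6+ you don't need extra libraries, because urllib sends a chunked body whenever the request data is an iterable of bytes and no Content-Length header is set. A minimal sketch, reusing the question's placeholder url, token, and json_string:

    from urllib.request import Request, urlopen

    url = "https://www.some_endpoint.com/"  # placeholder from the question

    def iter_chunks(data, size=16 * 1024):
        # Yield the payload in slices; each slice is sent
        # as one chunk of the chunked-encoded body.
        for i in range(0, len(data), size):
            yield data[i:i + size]

    req = Request(url, headers={"API-TOKEN": "some_token"})
    # An iterable body with no Content-Length triggers
    # Transfer-Encoding: chunked (Python 3.6+).
    response = urlopen(req, data=iter_chunks(json_string.encode('utf-8')))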
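
And for the splitting approach: if the endpoint will accept the same data split across several POSTs (an assumption; the real API contract isn't shown here), a minimal client-side batching sketch might look like the following. The url and token are placeholders from the question, and the "records" key is hypothetical, standing in for whatever list-valued field dominates the payload.

    import json
    from urllib.request import Request, urlopen

    url = "https://www.some_endpoint.com/"  # placeholder from the question
    MAX_BYTES = 64 * 1024 - 64  # stay under the apparent 64 KiB ceiling,
                                # leaving room for the JSON envelope

    def post_batch(records):
        # Compact separators keep each request as small as possible.
        body = json.dumps({"records": records},
                          separators=(',', ':')).encode('utf-8')
        req = Request(url, headers={"API-TOKEN": "some_token"}, data=body)
        return urlopen(req)

    # Assumes the bulk of the payload is a list under a
    # hypothetical "records" key; adjust to the actual schema.
    records = json.loads(json_string)["records"]

    batch, batch_size = [], 0
    for record in records:
        # +1 approximates the comma between records in the JSON array.
        record_size = len(json.dumps(
            record, separators=(',', ':')).encode('utf-8')) + 1
        if batch and batch_size + record_size > MAX_BYTES:
            post_batch(batch)
            batch, batch_size = [], 0
        batch.append(record)
        batch_size += record_size
    if batch:
        post_batch(batch)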

If you MITM your GUI client with a tool like Charles, you can see the exact format of the request and make your own request use the same format.

Dietrich Epp
  • thanks for the answer. tried gzip, transfer-encoding; no luck. i will just split on my end locally. – AZhao Oct 24 '18 at 19:59