Parse HTTP GET and POST parameters from BaseHTTPRequestHandler?

Question

BaseHTTPRequestHandler from the BaseHTTPServer module doesn't seem to provide any convenient way to access HTTP request parameters. What is the best way to parse the GET parameters from the path, and the POST parameters from the request body?

Right now, I'm using this for GET:

def do_GET(self):
    parsed_path = urlparse.urlparse(self.path)
    try:
        params = dict([p.split('=') for p in parsed_path[4].split('&')])
    except:
        params = {}

This works for most cases, but I'd like something more robust that handles encodings and cases like empty parameters properly. Ideally, I'd like something small and standalone, rather than a full web framework.

score 88 · Answer 1 · edited Oct 31 '18 at 19:02

You may want to use urllib.parse:

>>> from urllib.parse import urlparse, parse_qs
>>> url = 'http://example.com/?foo=bar&one=1'
>>> parse_qs(urlparse(url).query)
{'foo': ['bar'], 'one': ['1']}

For Python 2, the module is named urlparse instead of urllib.parse.
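
Applied to the handler from the question, this could look like the sketch below (Python 3). The helper name `query_params` and the class name `Handler` are hypothetical. `keep_blank_values=True` is the standard `parse_qs` option that keeps parameters with empty values, such as `?flag=`, which are dropped by default.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs


def query_params(path):
    """Return the query parameters of a request path as a dict of lists.

    Hypothetical helper, not part of the standard library.
    """
    # keep_blank_values=True keeps parameters with empty values ("a=")
    return parse_qs(urlparse(path).query, keep_blank_values=True)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the request target, e.g. "/search?foo=bar"
        params = query_params(self.path)
        body = repr(params).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Keeping the parsing in a plain function makes it easy to check without starting a server, for example `query_params('/search?foo=bar&empty=')`.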

edited Oct 31 '18 at 19:02

alexdlaird


answered Aug 23 '11 at 22:04

zag


It should be noted that urlparse in Python 2 doesn't handle encodings; the Python 3 version does. Additionally, to maintain the correct order, `parse_qsl` (which returns a list of pairs) should be used instead of `parse_qs`. – Wolph Jul 01 '16 at 08:35

The module name is `urllib`, not `url`, in Python 3.6 – Evan Jul 17 '18 at 18:06

Mike · Answer 2 · 2022-11-23T14:43:38.680

17

Better solution to an old question (updated):

Python 3:

from urllib import parse

def do_POST(self):
    # A missing Content-Length header is treated as an empty body
    length = int(self.headers.get('content-length', 0))
    field_data = self.rfile.read(length)
    fields = parse.parse_qs(field_data.decode('utf-8'))

Working example: public gist

Python 2.x:

def do_POST(self):
    length = int(self.headers.getheader('content-length'))
    field_data = self.rfile.read(length)
    fields = urlparse.parse_qs(field_data)

This will pull urlencoded POST data from the request body and parse it into a dict with proper URL decoding.
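
A slightly more defensive variant of the Python 3 version, as a sketch: the hypothetical helper `read_form` takes the header mapping and the input stream as arguments, so it can be exercised without a running server. It treats a missing Content-Length header as an empty body and keeps blank values.

```python
import io
from urllib.parse import parse_qs


def read_form(headers, rfile, encoding="utf-8"):
    """Read a urlencoded request body and return a dict of lists.

    Hypothetical helper; `headers` and `rfile` correspond to
    self.headers and self.rfile inside a request handler.
    """
    length = int(headers.get("Content-Length", 0))
    raw = rfile.read(length)
    return parse_qs(raw.decode(encoding), keep_blank_values=True)


# Stand-ins for self.headers and self.rfile (a plain dict is
# case-sensitive, unlike the real header object)
body = b"name=J%C3%BCrgen&tag=a&tag=b&note="
fields = read_form({"Content-Length": str(len(body))}, io.BytesIO(body))
print(fields)
```

Inside `do_POST` the call would be `read_form(self.headers, self.rfile)`.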

edited Nov 23 '22 at 14:43

answered Jul 12 '15 at 03:03

Mike


I was trying to make the most basic server that could handle get and post requests in Python, and only yours worked for me in handling POST requests. This was written 3 years ago, but thanks! :) – harkirat1892 Aug 28 '15 at 12:41

This will not work if the POST request uses `Transfer-Encoding: chunked`. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding – Balz Guenat Mar 15 '18 at 16:56
@BalzGuenat BaseHTTPServer implements an HTTP 1.0 server, so, as I understand it, chunked encodings are not allowed – Mike Apr 22 '18 at 14:10
In Python 3, this will cause an `AttributeError: 'HTTPMessage' object has no attribute 'getheader'`. To fix, replace the `getheader()` call with `get()`. – Henrik Heimbuerger Nov 23 '22 at 11:30

Wolph · Accepted Answer · 2016-07-01T08:50:42.437

You could try the Werkzeug modules. The base Werkzeug library isn't too large, and if needed you can simply extract this bit of code and you're done.

The url_decode method returns a MultiDict and has encoding support :)

As opposed to the urlparse.parse_qs function, the Werkzeug version takes care of:

encoding
multiple values
sort order

If you have no need for these (or, in the case of encoding, you use Python 3), then feel free to use the built-in solutions.
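
For comparison, here is a sketch of how far the built-in Python 3 functions go on those three points: `parse_qs` collects repeated keys into lists, `parse_qsl` keeps the pairs in their original order, and both accept an `encoding` argument. The query string is made up for illustration.

```python
from urllib.parse import parse_qs, parse_qsl

# Illustrative input: "ö" is percent-encoded as Latin-1 (%F6)
query = "b=2&a=1&b=3&city=K%F6ln"

# Repeated keys are grouped into lists
grouped = parse_qs(query, encoding="latin-1")

# parse_qsl returns (key, value) pairs in the order they were sent
ordered = parse_qsl(query, encoding="latin-1")

print(grouped)
print(ordered)
```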

score 2 · Answer 4 · answered Mar 22 '10 at 05:46

Have you investigated using libraries like CherryPy? They provide a much quicker path to handling these things than BaseHTTPServer.

answered Mar 22 '10 at 05:46

Benno


score 1 · Answer 5 · edited Jun 20 '20 at 09:12

Basic support for HTTP request parameters is provided by the cgi module (deprecated since Python 3.11 and removed in Python 3.13). The recommended mechanism for handling form data is the cgi.FieldStorage class.

To get at submitted form data, it’s best to use the FieldStorage class. The other classes defined in this module are provided mostly for backward compatibility. Instantiate it exactly once, without arguments. This reads the form contents from standard input or the environment (depending on the value of various environment variables set according to the CGI standard). Since it may consume standard input, it should be instantiated only once.

The FieldStorage instance can be indexed like a Python dictionary. It allows membership testing with the in operator, and also supports the standard dictionary method keys() and the built-in function len(). Form fields containing empty strings are ignored and do not appear in the dictionary; to keep such values, provide a true value for the optional keep_blank_values keyword parameter when creating the FieldStorage instance.

For instance, the following code (which assumes that the Content-Type header and blank line have already been printed) checks that the fields name and addr are both set to a non-empty string:

form = cgi.FieldStorage()
if "name" not in form or "addr" not in form:
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
    return
print "<p>name:", form["name"].value
print "<p>addr:", form["addr"].value
#...further form processing here...

The CGI library doesn't handle encodings (like utf-8) for you so it's less suited than some of the other libraries available. — Wolph, Mar 22 '10 at 06:50
Encoding can be delegated to the file-like 1st argument of FieldStorage. — gimel, Mar 22 '10 at 07:11
True, but why bother when there are scripts that handle this for you including the catching of errors? No need to reinvent the wheel. — Wolph, Mar 22 '10 at 14:05
