I'm having trouble performing a POST instead of a GET with Python's urllib. I'm running Python 3.5 and trying to POST to a form field.

I read that urllib.request.Request will default to POST if the data parameter is present. I read this at https://docs.python.org/3/howto/urllib2.html
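For reference, the default described in that HOWTO can be checked without sending anything over the network. This is a minimal sketch; the URL is a placeholder chosen for illustration, and no request is actually made because the Request object is only constructed, never opened.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL (an assumption for illustration); nothing is sent.
url = "http://www.example.com/form"
data = urlencode({"field1": "value"}).encode("utf-8")

# Without data the request method is GET; with data it becomes POST.
print(Request(url).get_method())
print(Request(url, data).get_method())

# An explicit method= argument overrides the default either way.
print(Request(url, data, method="PUT").get_method())
```

So a Request built with data really is a POST at the moment it is handed to urlopen; anything that turns it into a GET has to happen later, on the wire.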

I duplicated these settings, but when I fire up Wireshark all I see are GETs and never a POST, even though the code appears to execute.

Here is my code:

import urllib.parse
import urllib.request

# z is defined earlier in the script (not shown here)
values = {"field1" : z[2:-1], "Submit":"Save"}
print(values)
data = urllib.parse.urlencode(values)
data = data.encode('utf-8')
print(data)
req = urllib.request.Request("http://www.randomsite.com/myprocessingscript.php", data)
with urllib.request.urlopen(req) as response:
    the_page = response.read()
print(the_page)

When I fire up Wireshark, this is what results from the req line:

GET /myprocessingscript.php HTTP/1.1
Accept-Encoding: identity
Host: ec2-52-91-45-113.compute-1.amazonaws.com
Connection: close
User-Agent: Python-urllib/3.5

HTTP/1.1 200 OK
Date: Wed, 28 Oct 2015 02:47:22 GMT
Server: Apache/2.4.17 (Unix) OpenSSL/1.0.1p PHP/5.5.30 mod_perl/2.0.8-dev Perl/v5.16.3
X-Powered-By: PHP/5.5.30
Content-Length: 23
Connection: close
Content-Type: text/html

no post data to process

Additionally, when I run the script, this is what I get from the print statements:

{'Submit': 'Save', 'field1': 'hostlab\chris'}
b'Submit=Save&field1=hostlab%5Cchris%5Cr%5Cn'
b'no post data to process'
Traceback (most recent call last):
  File "C:\Users\chris\Desktop\test.py", line 20, in
    time.sleep(random.randint(5,10))

The script accesses two web files, Index.html and myprocessingscript.php:

Index.html:

<h1>randomsite.com.</h1>

####<p>whoami</p>

<form action="myprocessingscript.php" method="POST">
    <input name="field1" type="text" />
    <input type="submit" name="submit" value="Save">
</form>

</body>
</html>

myprocessingscript.php:

<?php if(isset($_POST['field1'])) {
    $data = $_POST['field1'] . "\n";
    $ret = file_put_contents('/tmp/mydata.txt', $data);
    if($ret === false) {
        die('There was an error writing this file');
    }
    else {
        echo "$ret bytes written to file";
    }
}
else {
   die('no post data to process');
}
ChrisMan

2 Answers

HTTP POST works as expected:

#!/usr/bin/env python
from contextlib import closing
try:
    from urllib.parse import urlencode
    from urllib.request import urlopen
except ImportError: # Python 2
    from urllib import urlencode
    from urllib2 import urlopen

url = 'http://httpbin.org/post'
data = urlencode({"field1" : "value", "Submit": "Save"}).encode()
with closing(urlopen(url, data)) as response:
    print(response.read().decode())

You may see GET only after an HTTP redirect (as the RFC recommends: no data should be posted on redirect without prompting the user).

For example, here's an HTTP server that redirects POST / requests:

#!/usr/bin/env python
from flask import Flask, redirect, request, url_for # $ pip install flask

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        return redirect(url_for('post'))
    return '<form method="POST"><input type="submit">'


@app.route('/post', methods=['GET', 'POST'])
def post():
    return 'Hello redirected %s!' % request.method

if __name__ == '__main__':
    import sys
    port = int(sys.argv[1]) if len(sys.argv) > 1 else None
    app.run(host='localhost', port=port)

Making an HTTP POST request using the same code (urlopen(url, data)) leads to the redirection and the second request is GET:

"POST / HTTP/1.1" 302 -
"GET /post HTTP/1.1" 200 -

Again, the first request is POST, not GET. The behavior is exactly the same if you visit / and click the submit button (the browser makes a POST request and then a GET request).
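The POST-then-GET sequence can also be reproduced with the standard library alone, which is handy if Flask is not installed. The sketch below is an illustration under stated assumptions, not part of the original setup: RedirectingHandler, run_demo and the seen list are hypothetical names, the server binds to an arbitrary free port on 127.0.0.1, and proxies are disabled so the request stays local.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlencode
from urllib.request import ProxyHandler, build_opener

seen = []  # (method, path) pairs in the order the server received them


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: POST to any path redirects to /post."""

    def do_POST(self):
        # Consume the request body, then answer with a 302 to /post.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        seen.append(("POST", self.path))
        self.send_response(302)
        self.send_header("Location", "/post")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        seen.append(("GET", self.path))
        body = b"Hello redirected GET!"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def run_demo():
    """POST once to a local server; return (final URL, body, methods seen)."""
    seen.clear()
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        data = urlencode({"field1": "value", "Submit": "Save"}).encode()
        # Same redirect handling as urlopen(url, data), minus any proxy settings.
        opener = build_opener(ProxyHandler({}))
        with opener.open(url, data) as response:
            return response.geturl(), response.read(), list(seen)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Comparing response.geturl() with the URL you asked for is a quick way to notice that a redirect happened at all, without reaching for a packet capture.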

Python issue: "Document how to forward POST data on redirects" contains a link to HTTPRedirectHandler's subclass that posts data on redirect:

#!/usr/bin/env python
from contextlib import closing
try:
    from urllib.parse import urlencode
    from urllib.request import (HTTPError, HTTPRedirectHandler, Request,
                                build_opener, urlopen)
except ImportError: # Python 2
    from urllib import urlencode
    from urllib2 import (HTTPError, HTTPRedirectHandler, Request,
                         build_opener, urlopen)

class PostHTTPRedirectHandler(HTTPRedirectHandler):
    """Post data on redirect unlike urllib2.HTTPRedirectHandler."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        m = req.get_method()
        if (code in (301, 302, 303, 307) and m in ("GET", "HEAD")
            or code in (301, 302, 303) and m == "POST"):
            newurl = newurl.replace(' ', '%20')
            CONTENT_HEADERS = ("content-length", "content-type")
            newheaders = dict((k, v) for k, v in req.headers.items()
                              if k.lower() not in CONTENT_HEADERS)
            return Request(newurl,
                           data=req.data,
                           headers=newheaders,
                           origin_req_host=req.origin_req_host,
                           unverifiable=True)
        else:
            raise HTTPError(req.get_full_url(), code, msg, headers, fp)


urlopen = build_opener(PostHTTPRedirectHandler).open

url = 'http://localhost:5000'
data = urlencode({"field1" : "value", "Submit": "Save"}).encode()
with closing(urlopen(url, data)) as response:
    print(response.read().decode())

The access log shows two POST requests in this case (the second request is POST):

"POST / HTTP/1.1" 302 -
"POST /post HTTP/1.1" 200 -

Note: you could customize the HTTPRedirectHandler to follow the RFC 2616 behavior.
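One possible reading of that note, as a rough sketch: RFC 2616 says a user agent must not automatically follow a 301, 302 or 307 for a request other than GET or HEAD unless the user confirms it. A script has nobody to ask, so the hypothetical StrictRedirectHandler below (the name is mine, not from the linked issue) raises HTTPError in that case and defers to the stock handler for everything else.

```python
from urllib.error import HTTPError
from urllib.request import HTTPRedirectHandler, build_opener


class StrictRedirectHandler(HTTPRedirectHandler):
    """Never follow 301/302/307 automatically unless the method is GET or HEAD."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code in (301, 302, 307) and req.get_method() not in ("GET", "HEAD"):
            # No user to confirm the redirect, so surface it as an error
            # instead of silently switching to GET or re-posting the data.
            raise HTTPError(req.get_full_url(), code, msg, headers, fp)
        # GET/HEAD requests and 303 responses: keep the default behavior.
        return HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


# Passing a subclass of a default handler replaces that default handler.
opener = build_opener(StrictRedirectHandler)
```

With this opener, opener.open(url, data) on a redirecting URL raises HTTPError, and the caller can inspect the Location header of the error to decide whether to re-send the POST to the new address.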

jfs

OK, so I figured out what was wrong. requests.post will not carry the POST over to the target if the URL is one that redirects; the redirected request goes out as a GET. So I had to use the actual URL rather than one that redirects to my desired URL.

The same applies to those using urllib.
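If the final URL is not known in advance, one way to apply this with urllib is to let a plain GET follow the redirects first and then POST to wherever it ended up. This is only a sketch: post_to_final_url and its opener parameter are hypothetical names introduced here, and it assumes the GET probe is harmless and redirects the same way the POST would, which is not guaranteed for every site.

```python
from contextlib import closing
from urllib.parse import urlencode
from urllib.request import urlopen


def post_to_final_url(url, fields, opener=urlopen):
    """Resolve redirects with a GET, then POST the form fields to the final URL.

    Returns a (final_url, response_body) tuple.
    """
    # Probe request: no data, so this is a GET and redirects are followed.
    with closing(opener(url)) as probe:
        final_url = probe.geturl()

    # Real request: data is present, so this is a POST to the resolved URL.
    data = urlencode(fields).encode("utf-8")
    with closing(opener(final_url, data)) as response:
        return final_url, response.read()
```

The opener parameter defaults to urlopen; it is there so a custom opener's open method can be passed in instead, for example one built with build_opener.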

ChrisMan