74

I am writing some code to interface with Redmine and I need to upload some files as part of the process, but I am not sure how to do a POST request from Python containing a binary file.

I am trying to mimic the command here:

curl --data-binary "@image.png" -H "Content-Type: application/octet-stream" -X POST -u login:password http://redmine/uploads.xml

My attempt in Python is below, but it does not seem to work. I am not sure whether the problem is related to how the file is encoded or whether something is wrong with the headers.

import urllib2, os

FilePath = r"C:\somefolder\somefile.7z"
FileData = open(FilePath, "rb")
length = os.path.getsize(FilePath)

password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, 'http://redmine/', 'admin', 'admin')
auth_handler = urllib2.HTTPBasicAuthHandler(password_manager)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
request = urllib2.Request( r'http://redmine/uploads.xml', FileData)
request.add_header('Content-Length', '%d' % length)
request.add_header('Content-Type', 'application/octet-stream')
try:
    response = urllib2.urlopen( request)
    print response.read()
except urllib2.HTTPError as e:
    error_message = e.read()
    print error_message

I have access to the server and it looks like an encoding error:

...
invalid byte sequence in UTF-8
Line: 1
Position: 624
Last 80 unconsumed characters:
7z¼¯'ÅÐз2^Ôøë4g¸R<süðí6kĤª¶!»=}jcdjSPúá-º#»ÄAtD»H7Ê!æ½]j):

(further down)

Started POST "/uploads.xml" for 192.168.0.117 at 2013-01-16 09:57:49 -0800
Processing by AttachmentsController#upload as XML
WARNING: Can't verify CSRF token authenticity
  Current user: anonymous
Filter chain halted as :authorize_global rendered or redirected
Completed 401 Unauthorized in 13ms (ActiveRecord: 3.1ms)
Mac

4 Answers

98

Basically, what you are doing is correct. Looking at the Redmine docs you linked to, it seems that the suffix after the dot in the URL denotes the type of the posted data (.json for JSON, .xml for XML), which agrees with the response you get: Processing by AttachmentsController#upload as XML. My guess is that there's a bug in the docs and that, to post binary data, you should try the http://redmine/uploads URL instead of http://redmine/uploads.xml.

By the way, I highly recommend the very good and very popular Requests library for HTTP in Python. It's much better than what's in the standard library (urllib2). It supports authentication as well, but I skipped it here for brevity.

import requests
with open('./x.png', 'rb') as f:
    data = f.read()
res = requests.post(url='http://httpbin.org/post',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})

# let's check if what we sent is what we intended to send...
import base64

assert base64.b64decode(res.json()['data'][len('data:application/octet-stream;base64,'):]) == data
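The check above depends on the echo service returning a binary body as a base64 data URI, which is why the assert strips a fixed prefix before decoding. A minimal offline sketch of that decoding step follows; the helper name `decode_echoed_body` and the sample payload are made up for illustration, and the prefix is copied from the assert above.

```python
import base64

# Prefix placed in front of the base64 text (copied from the assert above).
DATA_URI_PREFIX = 'data:application/octet-stream;base64,'


def decode_echoed_body(echoed):
    """Return the raw bytes behind a base64 data URI string (hypothetical helper)."""
    if not echoed.startswith(DATA_URI_PREFIX):
        raise ValueError('not an octet-stream data URI')
    return base64.b64decode(echoed[len(DATA_URI_PREFIX):])


# Simulate what the echo service would hand back for a small binary payload.
payload = b'7z\xbc\xaf\x27\x1c'
echoed = DATA_URI_PREFIX + base64.b64encode(payload).decode('ascii')
roundtrip = decode_echoed_body(echoed)
```

If `roundtrip` equals `payload`, the bytes survived the trip unchanged, which is the same property the assert checks against the live service.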

UPDATE

To find out why this works with Requests but not with urllib2, we have to examine the difference in what's being sent. To see this, I'm sending traffic to an HTTP proxy (Fiddler) running on port 8888:

Using Requests

import requests

data = 'test data'
res = requests.post(url='http://localhost:8888',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})

we see

POST http://localhost:8888/ HTTP/1.1
Host: localhost:8888
Content-Length: 9
Content-Type: application/octet-stream
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/1.0.4 CPython/2.7.3 Windows/Vista

test data

and using urllib2

import urllib2

data = 'test data'    
req = urllib2.Request('http://localhost:8888', data)
req.add_header('Content-Length', '%d' % len(data))
req.add_header('Content-Type', 'application/octet-stream')
res = urllib2.urlopen(req)

we get

POST http://localhost:8888/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 9
Host: localhost:8888
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7

test data

I don't see any differences that would warrant the different behavior you observe. Having said that, it's not uncommon for HTTP servers to inspect the User-Agent header and vary their behavior based on its value. Try changing the headers sent by Requests one by one, making them the same as those sent by urllib2, and see when it stops working.
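One header that the traces above do not exercise is Authorization. The server log in the question reports "Current user: anonymous", and `curl -u` sends Basic credentials with the first request, whereas urllib2's HTTPBasicAuthHandler only adds them after the server answers with a 401 challenge. Whether that is the actual cause here is an assumption. The sketch below uses Python 3's `urllib.request` (the successor of urllib2) to build a request that carries the credentials up front; `build_upload_request` is a hypothetical helper, and the URL and the admin/admin credentials are the placeholders from the question.

```python
import base64
import urllib.request


def build_upload_request(url, body, user, password):
    """Build (but do not send) a POST with a raw binary body and Basic auth set up front."""
    credentials = ('%s:%s' % (user, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    request = urllib.request.Request(url, data=body, method='POST')
    request.add_header('Content-Type', 'application/octet-stream')
    # Assumption: the server accepts HTTP Basic credentials on the first request.
    request.add_header('Authorization', 'Basic ' + token)
    return request


request = build_upload_request(
    'http://redmine/uploads.xml', b'\x00\x01binary', 'admin', 'admin')
```

Nothing is sent until the request is passed to `urllib.request.urlopen`, so the headers can be inspected first and compared with what curl sends.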

nightshiba
Piotr Dobrogost
  • No idea why, but using the requests module the exact same code works fine... Thanks a lot. Although, now I am very curious to know why urllib does not work... – Mac Jan 22 '13 at 11:55
  • With requests have a look here: https://stackoverflow.com/questions/12385179/how-to-send-a-multipart-form-data-with-requests-in-python – lorenzo May 18 '19 at 12:36
3

This has nothing to do with a malformed upload. The HTTP error clearly specifies 401 unauthorized, and tells you the CSRF token is invalid. Try sending a valid CSRF token with the upload.

More about CSRF tokens here:

What is a CSRF token? What is its importance and how does it work?

Community
Josh Liptzin
2

You need to add a Content-Disposition header, something like this (I used mod_python here, but the principle should be the same):

request.headers_out['Content-Disposition'] = 'attachment; filename=%s' % myfname
LetMeSOThat4U
  • curl does not need that, so why does Python? – Mac Jan 16 '13 at 22:00
  • I think curl is doing it silently, although I wouldn't bet the farm on this - your practical option is to use Wireshark and simply see what's running on the wire between curl and server (it's not easy to use wireshark on localhost though, you'd have to have separate machine for that). – LetMeSOThat4U Jan 17 '13 at 10:55
  • oops, apparently curl uses urlencoded format, at least it did on my small file, http://pastie.org/5704526 . That's another option I didn't think of. – LetMeSOThat4U Jan 17 '13 at 13:04
  • I only found above using wireshark (what did you use?) I urge you to do the same bc all other tracing tools interpret (read: distort) what's really being sent and received. In my case I owe you correction - apparently the tool I used created multipart MIME message for the contents of POST: http://pastie.org/5703946 . That's where Content-Disposition belongs apparently. – LetMeSOThat4U Jan 17 '13 at 13:05
  • P.S. I used command like `curl --data-binary "@users.csv" -b cookie.txt -X POST http://myhost/site.py`, and wireshark says it's HTTP/POST so I think that curl did use POST yet it used urlencoded file with packet contents like on first pastie I linked in comment above. – LetMeSOThat4U Jan 17 '13 at 13:10
  • curl has a trace switch that dumps the data in the pastie "--trace-ascii " – Mac Jan 17 '13 at 13:12
  • PS: As I wrote in the other reply I tried using FileData = open(FilePath, "rb").read().encode("base64") but it does not work either – Mac Jan 17 '13 at 13:13
  • *you need to add Content-Disposition header* He doesn't need to. The question is about sending binary data in the body of the request and not uploading files. – Piotr Dobrogost Jan 21 '13 at 14:22
  • Reread the phrase *I am trying to mimic the commands here:* and see what `curl --data-binary "@image.png"` does. – Piotr Dobrogost Jan 21 '13 at 23:06
  • @PiotrDobrogost: The first sentence reads: "I am writing some code to interface with redmine and I need to upload some files as part of the process..". What curl does or doesn't do is irrelevant. – LetMeSOThat4U Jan 22 '13 at 17:41
  • Well, you are wrong. The phrase *to upload a file* is a shortcut but it's a misleading shortcut as it can refer to different things in context of HTTP. Hopefully, there is curl command shown and it's what defines the meaning of *file upload* in this question. – Piotr Dobrogost Jan 22 '13 at 20:10
  • @PiotrDobrogost: uploading the file is the *goal* of the OP and not any kind of "shortcut" which is your excuse for not understanding the question and incorrectly focusing on peculiar behavior of curl in this context. Were it curl/urlencoded question, it would even be titled "emulating curl" or smth like that instead of "Python POST binary data". I don't think it matters to OP or anybody reasonable whether they upload files to Redmine curl-mimicking way or any other way as long as the goal is achieved. The OP's goal is not "misleading" in any way. – LetMeSOThat4U Jan 23 '13 at 10:15
  • I'll try to explain this one last time. The term *upload a file* in the context of HTTP in most cases means to do the same what browsers do when you have an option of uploading a file. In such a case browsers send a file using *multipart/form-data* encoding and they usually send a file's name too - see [Send file using POST from a Python script](http://stackoverflow.com/q/68477/95735). It's clear from the curl example given in the question that *multipart/form-data* encoding is not used so we do not talk about *file upload* in the most common meaning. – Piotr Dobrogost Jan 23 '13 at 20:35
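The distinction drawn in the comments above, a raw request body versus a browser-style multipart/form-data upload, can be sketched offline. This is an illustration only: `encode_multipart`, the field name `file`, and the sample bytes are made up, and a real client library would normally build the multipart body for you.

```python
import uuid


def encode_multipart(field, filename, payload):
    """Wrap payload bytes in a one-part multipart/form-data body (illustrative helper)."""
    boundary = uuid.uuid4().hex
    head = (
        '--%s\r\n'
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        'Content-Type: application/octet-stream\r\n'
        '\r\n' % (boundary, field, filename)
    ).encode('utf-8')
    tail = ('\r\n--%s--\r\n' % boundary).encode('utf-8')
    content_type = 'multipart/form-data; boundary=%s' % boundary
    return content_type, head + payload + tail


payload = b'\x89PNG\r\n\x1a\n'

# Raw upload, as with `curl --data-binary`: the body is exactly the file's bytes.
raw_body = payload

# Browser-style upload: the same bytes wrapped in a MIME part that carries the filename.
content_type, multipart_body = encode_multipart('file', 'image.png', payload)
```

In the raw case there is no place for Content-Disposition inside the body, while in the multipart case it is part of each MIME part, which matches the correction made in the comments.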
-2

You can use unirest. It provides an easy way to send POST requests.

import unirest


def callback(response):
    # Invoked by unirest once the asynchronous request completes.
    print("code:" + str(response.code))
    print("******************")
    print("headers:" + str(response.headers))
    print("******************")
    print("body:" + str(response.body))
    print("******************")
    print("raw_body:" + str(response.raw_body))


# Send an asynchronous POST request.
def consumePOSTRequestASync():
    params = {'test1': 'param1', 'test2': 'param2'}

    # unirest does not provide a switch for choosing between
    # application/x-www-form-urlencoded and multipart/form-data,
    # so we pass a dummy parameter that is an open file object
    # to get a multipart/form-data request.
    params['dummy'] = open('dummy.txt', 'rb')
    url = 'http://httpbin.org/post'
    headers = {"Accept": "application/json"}
    # Call the POST endpoint with the headers and params.
    unirest.post(url, headers=headers, params=params, callback=callback)


# Post an async multipart/form-data request.
consumePOSTRequestASync()
kgkmeekg
gvir