Posting to CloudApp API (AWS) with Python Requests

Question

I've spent a few days trying to figure out how to post an image to CloudApp in Python, using Requests to access CloudApp's API. I'm able to accomplish this using pycloudapp, which uses Poster, but I'd like to learn how with Requests.

I've been trying to use InspectB.in to compare what is being posted by my script and by pycloudapp to try to find the differences. There don't seem to be many, but obviously the few that exist are important. With my current code, I'm getting a server-side error (500), which is frustrating. Because the Poster-based code works, I'm hoping to find a way to get Requests to work as well, though I suppose this might not be feasible.

CloudApp stores uploads on Amazon S3, and I know the "file" parameter has to go last with AWS. So far I've tried several permutations of data = collections.OrderedDict(sorted(upload_values)); data['file'] = open(last_pic, 'rb') without a files parameter, as well as separate data and files dictionaries (as suggested here). I've tried the files dictionary with and without a filename.
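To check the field order without sending anything, the request can be prepared offline and its body inspected. The following is a minimal sketch, not CloudApp's or Amazon's own tooling: the URL, field values, and the helper name build_multipart_body are placeholders invented for illustration. It relies on Requests writing the data fields before the files parts when it encodes a multipart body.

```python
import requests


def build_multipart_body(fields, file_name, file_bytes):
    """Prepare (but do not send) a multipart POST and return its raw body."""
    request = requests.Request(
        "POST",
        "https://example.invalid/upload",  # placeholder URL, never contacted
        data=fields,
        files={"file": (file_name, file_bytes)},
    )
    return request.prepare().body


# Hypothetical form fields standing in for the "params" returned by the API.
body = build_multipart_body(
    {"key": "items/image.jpg", "acl": "public-read"},
    "image.jpg",
    b"fake image bytes",
)

# Byte offset of each part inside the encoded body.
positions = {
    name: body.index(('name="%s"' % name).encode("ascii"))
    for name in ("key", "acl", "file")
}
print(positions)
```

If the offset for "file" is the largest, the file part was written last, which is the order the upload form expects.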

Here is my code:

#!/usr/bin/env python

import requests
import os

last_pic = '/.../image.jpg'

USER = 'email@email.com'
PASS = 'mypass'

AUTH_URL = 'http://my.cl.ly'
API_URL = 'http://my.cl.ly/items/new'

s = requests.Session()
s.auth = requests.auth.HTTPDigestAuth(USER, PASS)
s.headers.update({'Accept': 'application/json'})

upload_request = s.get(API_URL)

upload_values = upload_request.json()['params']

filename = os.path.basename(last_pic)
upload_values['key'] = upload_values['key'].replace(r'${filename}', filename)

files = {'file': open(last_pic, 'rb')}

stuff = requests.post(upload_request.json()['url'], data=upload_values, files=files)
print(stuff.text)

According to InspectB.in, the only differences between the working (pycloudapp) post and my post are:

Every parameter in the pycloudapp post body has Content-Type: text/plain; charset=utf-8, but does not in my code. For example:

--d5e0c013a6de4105b07ac844eea4da6e
Content-Disposition: form-data; name="acl"
Content-Type: text/plain; charset=utf-8

public-read

vs. mine:

--b1892e959d124887a61143dd2b468579
Content-Disposition: form-data; name="acl"

public-read

The file data is different.

pycloudapp:

--d5e0c013a6de4105b07ac844eea4da6e
Content-Disposition: form-data; name="file"
Content-Type: text/plain; charset=utf-8

����JFIFHH���ICC_PROFILE�applmntrRGB XYZ �...

vs mine:

--b1892e959d124887a61143dd2b468579
Content-Disposition: form-data; name="file"; filename="20130608-ScreenShot-180.jpg"
Content-Type: image/jpeg

����JFIFHH���ICC_PROFILE�applmntrRGB XYZ �...

The headers are essentially identical except for:

pycloudapp:

Accept: application/json
Accept-Encoding: identity

Mine:

Accept: */*
Accept-Encoding: gzip, deflate, compress

Specifically, both are successfully registering as Content-Type: multipart/form-data.

Thinking that the accept headers might be the important difference, I've tried adding in headers = {'accept': 'application/json', 'content-type': 'multipart/form-data'} (and both of those individually), with no luck. Unfortunately, if I modify the headers, it overwrites all of the headers and loses the multipart encoding.

I also am wondering if the file's Content-Type: image/jpeg in my post vs Content-Type: text/plain; charset=utf-8 in the working post might be the issue.
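Both header questions can be explored offline by preparing requests and looking at what Requests generates. This is a sketch under assumptions: the URL and field values are placeholders, and prepare_upload is a hypothetical helper. It shows that adding an unrelated header keeps the generated multipart Content-Type, that supplying Content-Type by hand replaces it (so the boundary is lost), and that a third element in the files tuple sets the Content-Type of the file part itself.

```python
import requests

UPLOAD_URL = "https://example.invalid/upload"  # placeholder, never contacted


def prepare_upload(headers=None, file_content_type=None):
    """Prepare a multipart POST without sending it."""
    file_part = ("image.jpg", b"fake image bytes")
    if file_content_type:
        # (filename, data, content_type) sets the part's own Content-Type.
        file_part += (file_content_type,)
    request = requests.Request(
        "POST",
        UPLOAD_URL,
        data={"acl": "public-read"},
        files={"file": file_part},
        headers=headers,
    )
    return request.prepare()


# An unrelated header leaves the generated Content-Type (with boundary) alone.
with_accept = prepare_upload(headers={"Accept": "application/json"})
print(with_accept.headers["Content-Type"])

# A hand-written Content-Type wins, so no boundary is advertised.
by_hand = prepare_upload(headers={"Content-Type": "multipart/form-data"})
print(by_hand.headers["Content-Type"])

# The file part can be labelled like the one in the working post.
typed = prepare_upload(file_content_type="text/plain; charset=utf-8")
print(b"Content-Type: text/plain; charset=utf-8" in typed.body)
```

In practice this suggests leaving Content-Type out of any custom headers dictionary and letting Requests fill it in.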

Apologies for such a long post, this has been driving me crazy, and thanks for any help you can provide.

Have you tried `data['file'] = open(last_pic, 'rb').read()`? — Thomas Fenzl, Jun 10 '13 at 21:22
Yes, with no luck. Also note that the post data from the file is identical between the script that works and the script that doesn't (I included the first line to suggest that it was posting identically but didn't explicitly state that). — n8henrie, Jun 10 '13 at 21:37

score 3 · Accepted Answer · edited May 23 '17 at 11:51

After several days, I finally figured out the (simple) problem. The CloudApp API requires an authenticated "GET" request to the URL in the "Location" header of Amazon's response.

Pycloudapp was working correctly because it properly authenticated the GET request with return json.load(self.upload_auth_opener.open(request)).

I'm not sure why I was able to post correctly using Postman without any authentication -- somehow it was properly following the GET without credentials, even though the CloudApp API specifies that following the redirect requires authentication.

I was unable to follow the redirect properly with Requests because I was posting unauthenticated values (if I continued the Session() with s.post, the auth headers throw an error because Amazon doesn't expect them), and therefore the subsequent GET was also unauthenticated. One very confusing part of the ordeal was that the POSTed images were not appearing in my CloudApp account. However, I later discovered that I could manually paste Amazon's response's "location" into a browser window, and suddenly the posted images appeared in my account. This made me realize that the POST was insufficient; an authenticated GET is required to complete the process.

I then discovered that I wasn't getting anything helpful from the headers of the response returned by requests.post. It took a few minutes to figure out that Requests was giving me the headers from the redirect (the 500 error from the GET it was following), and not from the POST. Once I added allow_redirects=False, I could properly access the "Location" header of Amazon's response. I just fed that header back into my authenticated Session() and it finally worked.
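The behaviour described above can be reproduced locally. The sketch below is an illustration only: StubHandler is a hypothetical stand-in that plays both Amazon (the POST) and CloudApp (the GET), the paths are made up, and Basic auth replaces the Digest auth used with the real API to keep the stub short.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the storage endpoint and the API."""

    def do_POST(self):
        # Consume the upload body, then answer with a redirect.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        host, port = self.server.server_address
        self.send_response(303)
        self.send_header("Location", "http://%s:%d/items/done" % (host, port))
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Only a GET carrying credentials completes the upload.
        authorized = "Authorization" in self.headers
        body = b'{"registered": true}' if authorized else b"unauthenticated"
        self.send_response(200 if authorized else 401)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    upload_url = "http://127.0.0.1:%d/upload" % server.server_address[1]
    files = {"file": ("image.jpg", b"fake image bytes")}
    try:
        # Default: the redirect is followed without credentials, so the
        # returned response and its headers belong to the failed GET.
        followed = requests.post(upload_url, files=files)

        # allow_redirects=False returns the POST response itself.
        unfollowed = requests.post(upload_url, files=files, allow_redirects=False)
        location = unfollowed.headers["Location"]

        # Finish with an authenticated GET from a session.
        session = requests.Session()
        session.auth = ("user", "secret")  # Basic auth, for the stub only
        completed = session.get(location)
    finally:
        server.shutdown()
        server.server_close()
    return followed, unfollowed, completed


followed, unfollowed, completed = run_demo()
print(followed.status_code, unfollowed.status_code, completed.status_code)
```

When the redirect is followed automatically, the original POST response is still reachable through followed.history, which is another way to read its "Location" header.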

One other thing that was instrumental to the process was this SO thread which taught me about logging with Requests.
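A commonly used recipe for that kind of logging looks roughly like the sketch below. It is an assumption that this matches the linked thread; the helper name enable_http_debug_logging is made up, and the logger name "urllib3" applies to current Requests releases (older releases bundled it as "requests.packages.urllib3").

```python
import http.client
import logging


def enable_http_debug_logging():
    """Print raw request/response headers and urllib3 connection events."""
    # http.client prints the request line and headers when debuglevel > 0.
    http.client.HTTPConnection.debuglevel = 1

    # Route urllib3's own messages (connections, redirects) to the console.
    logging.basicConfig(level=logging.DEBUG)
    urllib3_log = logging.getLogger("urllib3")
    urllib3_log.setLevel(logging.DEBUG)
    urllib3_log.propagate = True
    return urllib3_log


urllib3_log = enable_http_debug_logging()
```

Calling the helper once before making any requests is enough; every later request in the process is then traced, including the GET that Requests issues when it follows a redirect.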

Hope this explanation makes sense and helps someone else. I certainly learned a lot over the last few days. My code likely still needs refinement, and I want to do more testing with urllib.quote_plus and whether I need to be UTF-8 encoding things, but I'm out of uploads for today, so that will have to wait.

My current code:

#!/usr/bin/env python

import requests
from collections import OrderedDict
import keyring
import os

last_pic = '/path/to/image.jpg'

USER = 'myemail@email.com'
KEYCHAIN_SERVICE_NAME = 'cloudapp'

# replace with PASS = 'your_password' if you don't use keyring
PASS = keyring.get_password(KEYCHAIN_SERVICE_NAME, USER)

AUTH_URL = 'http://my.cl.ly'
API_URL = 'http://my.cl.ly/items/new'

s = requests.Session()
s.auth = requests.auth.HTTPDigestAuth(USER, PASS)
s.headers.update({'Accept': 'application/json'})

# Ask CloudApp (authenticated) for the upload URL and form fields
upload_request = s.get(API_URL)
upload_info = upload_request.json()

# Sort the form fields by name; Requests writes the files part after them,
# so "file" ends up last
data = OrderedDict(sorted(upload_info['params'].items()))

filename = os.path.basename(last_pic)
data['key'] = data['key'].replace(r'${filename}', filename)

with open(last_pic, 'rb') as f:
    files = {'file': (filename, f.read())}

# Post to Amazon without credentials and without following the redirect
stuff = requests.post(upload_info['url'], data=data, files=files, allow_redirects=False)

# Complete the upload with an authenticated GET to Amazon's "Location" URL
s.get(stuff.headers['Location'])