I'm using the Python 3 requests library to upload files to a REST API with the following snippet:

import json
import requests
headers = self.getHeaders()
# Serialize the options with json.dumps so quotes inside the values are escaped correctly
options = {"category": category, "keywords": keywords, "collectionId": kolid, "fileName": filename.decode('utf-8'), "dynamicMetadata": json.loads(meta.decode('utf-8'))}
data = {'options': json.dumps(options, ensure_ascii=False)}
with open(filepath, 'rb') as fh:  # close the file handle once the upload is done
    query = {'file': ("'./" + filepath + "'", fh)}
    response = requests.post(self.getHost() + "files", headers=headers, data=data, files=query)

This uploaded almost all files, but some were skipped because the API returned an error saying that it could not process the filename. According to the logs, all of these files have non-ASCII characters (German umlauts such as "Ä Ö Ü") or spaces in their filenames. For some reason, my upload path (the 'query' dictionary) contained Unicode escape sequences instead of the actual UTF-8 characters.

For example, on my Linux system the file is stored under pictures/attachments/Briefing E-Mailing Verlängerung.pdf, while in the log the path appears as "pictures/attachments/Briefing E-Mailing Verl\u00e4ngerung.pdf"

When I print the file path in Python (print(filepath)), I get clean output, so is the path or filename being corrupted by the requests call?

(Versions):

Python 3.5.3, requests 2.12.4, Debian 9.13 (4.9.0-14-amd64)

mbdotnet

1 Answer


My guess is that this is a decoding issue on the server side.
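One thing worth ruling out first: the "\u00e4" in the log may not be corruption at all. It is how "ä" looks when a string is written as ASCII-only JSON, so the log may simply be showing a JSON-escaped copy of the same path. This is only a sketch of that possibility, since I don't know how the server writes its logs; the path is the one from the question.

```python
import json

path = "pictures/attachments/Briefing E-Mailing Verlängerung.pdf"

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True)
escaped = json.dumps(path)
print(escaped)

# Decoding the escaped form gives back exactly the original string
restored = json.loads(escaped)
print(restored == path)
```

If the log line is produced this way, the escaped form and the clean `print` output are the same string in two representations.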

As far as I can tell, requests encodes the filename parameter of the Content-Disposition header for you.
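You can check what requests would actually send by preparing the request without sending it and looking at the multipart body. A minimal sketch: the URL and the dummy file content are placeholders of mine, and the exact form of the filename parameter depends on the requests/urllib3 version installed, so inspect the output on your own system rather than trusting mine.

```python
import io
import requests

filename = "Briefing E-Mailing Verlängerung.pdf"

# Build the request but do not send it; the URL is a placeholder
req = requests.Request(
    "POST",
    "http://example.invalid/files",
    files={"file": (filename, io.BytesIO(b"dummy content"))},
)
prepared = req.prepare()
body = prepared.body  # multipart body as bytes

# Pick out the Content-Disposition line(s) that carry the filename
disposition = [
    line for line in body.split(b"\r\n")
    if line.lower().startswith(b"content-disposition")
]
for line in disposition:
    print(line)
```

Depending on the version, the filename shows up either percent-encoded in a `filename*=` parameter or as raw UTF-8 bytes in `filename="..."`. The server has to decode whichever form it receives.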


I'd also suggest reading through "How to encode the filename parameter of Content-Disposition header in HTTP?"; its answers have a nice collection of the related RFCs.

Slam