I want to upload a file as JSON from the client to a Python webserver (Tornado) and save it on the server. This is my simplified setup:

Client HTML:

<input type="file" id="myFile" onchange="fileChange(event)" />

Client JS:

function fileChange(event) {
    const file = event.target.files[0];
    const fileReader = new FileReader();
    fileReader.onload = (e) => uploadFile(e.target.result, file.name);
    fileReader.readAsText(file);
}

function uploadFile(fileContent, fileName) {
    const data = {fileContent, fileName};
    axios.post('http://localhost:8080/api/uploadFile', JSON.stringify(data));
}

Python Webserver:

class UploadFileHandler(tornado.web.RequestHandler):

    def post(self):
        requestBody = tornado.escape.json_decode(self.request.body)
        file = open(requestBody["fileName"], "w+")
        file.write(requestBody["fileContent"].encode("UTF-8"))
        file.close()
  1. All uploaded files are empty (blank pages in a PDF, file type of JPG is 'not supported', Word file cannot be opened) and are nearly twice as big as the original file. How can I fix this?
  2. Is there a way to improve this setup?
cgdannie
  • Obviously you shouldn't re-encode anything you upload in utf-8 especially if the file is a binary. It makes no sense. – mpm Aug 31 '18 at 14:06
  • This sounds reasonable but without the encoding I get this error `UnicodeEncodeError('ascii', u'%PDF-1.5\r\n%\ufffd\...\n318578\r\n%%EOF', 11, 15, 'ordinal not in range(128)')"` – cgdannie Aug 31 '18 at 14:09

1 Answer

You are trying to upload binary files (Word, JPG), serialised as JSON, and store them on the server.

To handle binary data in JSON, encode the binary data as base64 first, then call JSON.stringify.

Like this (untested):

function uploadFile(fileContent, fileName) {
    // Encode the binary data as base64.
    const data = {
        fileContent: btoa(fileContent),
        fileName: fileName
    };
    axios.post('http://localhost:8080/api/uploadFile', JSON.stringify(data));
}

On the server side, you need to deserialise from JSON, decode the base64 and then open a file in binary mode to ensure that what you are writing to disk is the uploaded binary data. Opening the file in text mode requires that the data be encoded before writing to disk, and this encoding step will corrupt binary data.
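To illustrate why a text round trip damages binary data, here is a minimal sketch using only the Python standard library. It assumes the data was decoded as UTF-8 somewhere along the way (the `\ufffd` characters in the error message quoted in the question's comments suggest this happened); the sample bytes are made up for the example.

```python
# A few made-up bytes: an ASCII header followed by bytes that are
# not valid UTF-8, as is typical for PDF, JPG or Word files.
original = b"%PDF" + bytes([0xFF, 0xFE, 0xFD])

# Decoding as text replaces every invalid byte with U+FFFD ...
as_text = original.decode("utf-8", errors="replace")

# ... and encoding the text again writes each U+FFFD as three bytes,
# so the original values are lost and the data grows.
round_tripped = as_text.encode("utf-8")

print(round_tripped == original)
print(len(original), len(round_tripped))
```

The replaced bytes cannot be recovered afterwards, which is consistent with files that are both unreadable and larger than the originals.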

Something like this ought to work:

import base64

import tornado.escape
import tornado.web


class UploadFileHandler(tornado.web.RequestHandler):

    def post(self):
        requestBody = tornado.escape.json_decode(self.request.body)
        # Decode binary content from base64
        binary_data = base64.b64decode(requestBody["fileContent"])
        # Open file in binary mode
        with open(requestBody["fileName"], "wb") as f:
            f.write(binary_data)
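
The decode-and-write step can also be tried without a running Tornado server. The following is a rough sketch using only the standard library; `save_upload` and `upload_dir` are hypothetical names introduced here, and the `os.path.basename` check is an extra precaution that is not part of the handler above.

```python
import base64
import json
import os


def save_upload(body, upload_dir):
    """Decode a JSON upload body and write the file into upload_dir.

    `body` is the raw request body (bytes or str) with the keys
    "fileName" and "fileContent" (base64 text), as in the handler above.
    Returns the path of the written file.
    """
    payload = json.loads(body)
    # validate=True raises an error on non-base64 characters
    # instead of silently skipping them.
    binary_data = base64.b64decode(payload["fileContent"], validate=True)
    # Assumption, not in the original answer: keep only the last path
    # component so a client-supplied name cannot point outside upload_dir.
    safe_name = os.path.basename(payload["fileName"])
    if not safe_name:
        raise ValueError("empty file name")
    target = os.path.join(upload_dir, safe_name)
    # Binary mode writes the bytes exactly as decoded.
    with open(target, "wb") as f:
        f.write(binary_data)
    return target
```

Inside the handler this would be called roughly as `save_upload(self.request.body, "uploads")`, where `"uploads"` stands for whatever directory the server is allowed to write to.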
snakecharmerb
  • Thank you very much. This seems to be the solution. In my tests I needed to transform **utf-8** files to ASCII first before using _btoa_: `btoa(unescape(encodeURIComponent(fileContent)))`. But I was then not able to **decode the file properly in Python**. **Do you have any suggestions**? I tried different combinations of Python's _encode_ and _decode_ methods but always got some errors. – cgdannie Sep 02 '18 at 22:24
  • @cgdannie It should be sufficient to decode from base64, and then decode the resulting bytes from utf-8. If this doesn't work, open a new question with details and I can take a look. – snakecharmerb Sep 06 '18 at 04:37
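
The suggestion in the last comment could look roughly like this in Python. This is a sketch under the assumption that `btoa(unescape(encodeURIComponent(text)))` produces the base64 form of the text's UTF-8 bytes; `decode_text_upload` is a hypothetical helper name, and the client side is simulated with `base64.b64encode`.

```python
import base64


def decode_text_upload(file_content_b64):
    """Return the uploaded text: base64 -> bytes -> UTF-8 string."""
    raw_bytes = base64.b64decode(file_content_b64)
    return raw_bytes.decode("utf-8")


# Simulate what the client is assumed to send for a UTF-8 text file.
sample_text = "Grüße, 10 €"
sent = base64.b64encode(sample_text.encode("utf-8")).decode("ascii")

print(decode_text_upload(sent) == sample_text)
```

If the goal is only to store the file unchanged, the second step can be skipped: writing the bytes returned by `base64.b64decode` to a file opened in `"wb"` mode keeps the UTF-8 content as it was uploaded.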