In our Flask API, we have an endpoint for uploading a document. This endpoint receives the uploaded document, performs some checks, and uploads the document to MinIO. We want to stream the received document directly to the remote storage without buffering or processing its contents in the API.

One idea is to do the following:

@app.route('/upload', methods=['POST'])
def upload():
    upload_url = get_presigned_upload_url()
    response = requests.put(url=upload_url, files={'file': request.stream})

I can see two problems with the code above:

1- How can I make sure that the stream consists of only one file?

2- How can I extract the MIME type from the actual file content, rather than just trusting the file extension?

  • Correct me if I am misunderstanding: currently the /upload URL receives a file, does some processing of the file within do_something(), and then generates a URL for uploading to MinIO. You want to just upload the file and have it streamed directly to MinIO without being processed by do_something()? – Akib Rhast May 29 '20 at 12:19
  • No, it does not touch the file itself. Let's say it does some security checks. The second part of your comment is exactly what I want to achieve. – Karim Akhnoukh May 29 '20 at 12:27
  • I removed the do_something() part as I thought it might be confusing. – Karim Akhnoukh May 29 '20 at 12:31
  • You seem to be on the right path. Regarding your points (1) and (2): for (1), you can restrict your Flask API route to not accept "multipart/form-data" in the request header. For (2), take a look at the following answer: https://stackoverflow.com/questions/481743/how-can-i-determine-a-files-true-extension-type-programmatically – Akib Rhast May 29 '20 at 13:09
  • Hmm, but I want this upload endpoint to receive a file. How can I then receive this file after disabling multipart/form-data? – Karim Akhnoukh May 29 '20 at 13:18
  • Maybe with multipart/octet-stream? I tried a bunch of things on my end, but nothing seems to work out. @ me if you find the solution, or post the answer yourself. I am curious. – Akib Rhast May 29 '20 at 13:51
  • I will, thanks for your help anyways :) – Karim Akhnoukh May 29 '20 at 15:53

1 Answer

Following the suggestion here for uploading large files, I was able to get all the information I need, including the MIME type and the number of files uploaded.

The code that I used:

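import tempfile

import requests
from flask import request
from werkzeug.formparser import parse_form_data

@app.route('/upload', methods=['POST'])
def upload():
    upload_url = get_presigned_upload_url()

    def custom_stream_factory(total_content_length, filename, content_type, content_length=None):
        # Spool each uploaded part to a temporary file on disk instead of memory.
        return tempfile.NamedTemporaryFile('wb+', prefix='flaskapp', suffix='.nc')

    # Parse the multipart body ourselves so that the custom stream factory is used.
    stream, form, files = parse_form_data(request.environ, stream_factory=custom_stream_factory)

    files_list = []
    for fil in files.values():
        files_list.append(('file', (fil.filename, fil, fil.mimetype)))
        break  # forward only the first uploaded file
    response = requests.put(url=upload_url, files=files_list)
    # A Flask view must return a response; relay the storage status code.
    return '', response.status_code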
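
The two checks from the question (exactly one file, and a MIME type taken from the content rather than the extension) can be sketched on top of the parsed file objects. This is only a minimal illustration, not part of the code above: `MAGIC_SIGNATURES`, `sniff_mimetype` and `require_single_file` are hypothetical names, the signature table is deliberately tiny, and a real service would probably use a dedicated detection library instead.

```python
import io

# Hypothetical, deliberately small signature table (an assumption for this
# sketch); real-world detection needs far more entries or a dedicated library.
MAGIC_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_mimetype(fileobj, default="application/octet-stream"):
    """Guess a MIME type from the first bytes of a seekable binary file object."""
    position = fileobj.tell()
    header = fileobj.read(16)
    fileobj.seek(position)  # leave the stream where we found it
    for signature, mimetype in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return mimetype
    return default


def require_single_file(file_objects):
    """Return the only uploaded file, or raise ValueError if there is not exactly one."""
    file_objects = list(file_objects)
    if len(file_objects) != 1:
        raise ValueError(f"expected exactly one file, got {len(file_objects)}")
    return file_objects[0]


# Stand-in for a parsed upload: one in-memory "file" that starts like a PDF.
upload = require_single_file([io.BytesIO(b"%PDF-1.7 example")])
print(sniff_mimetype(upload))  # prints "application/pdf"
```

In the view, the list passed to `require_single_file` could be built with something like `[f for _, f in files.items(multi=True)]`, since `files.values()` on a Werkzeug `MultiDict` yields only the first value per field name. The sniffed type can then be compared with `fil.mimetype`, which merely reflects what the client declared.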