I'm using Bottle to create an upload API. The script below uploads a file to a directory, but I have two issues to address: how can I avoid loading the whole file into memory, and how can I set a maximum size for the uploaded file?

Is it possible to read the file continuously and write what has been read to disk until the upload is complete? The upload.save(file_path, overwrite=False, chunk_size=1024) function seems to load the whole file into memory. The tutorial points out that using .read() is dangerous.

import logging
import os
import sys
import time

from bottle import Bottle, request, run, response, route, default_app, static_file

app = Bottle()
logger = logging.getLogger(__name__)

@route('/upload', method='POST')
def upload_file():
    function_name = sys._getframe().f_code.co_name
    try:
        upload = request.files.get("upload_file")
        if not upload:
            return "Nothing to upload"

        # Get file_name and the extension
        file_name, ext = os.path.splitext(upload.filename)
        if ext in ('.exe', '.msi', '.py'):
            return "File extension not allowed."

        # Determine folder to save the upload
        save_folder = "/tmp/{folder}".format(folder='external_files')
        if not os.path.exists(save_folder):
            os.makedirs(save_folder)

        # Determine file_path. time_now was undefined in the original post;
        # a timestamp string is assumed here.
        time_now = time.strftime("%Y%m%d%H%M%S")
        file_path = "{path}/{time_now}_{file}".format(
            path=save_folder, time_now=time_now, file=upload.filename)

        # Save the upload to file in chunks
        upload.save(file_path, overwrite=False, chunk_size=1024)
        return "File successfully saved {0}{1} to '{2}'.".format(file_name, ext, save_folder)

    except KeyboardInterrupt:
        logger.info('%s: Someone pressed CTRL + C', function_name)
    except Exception:
        logger.error('%s: upload failed', function_name, exc_info=True)
        print("Exception occurred. Location: %s" % function_name)

if __name__ == '__main__':
    run(host="localhost", port=8080, reloader=True, debug=True)
else:
    application = default_app()

I also tried writing the file myself, with the same result: the file is read into memory and the machine hangs.

# The with block closes the destination file even if a read or write fails
with open(output_file_path, "wb") as file_to_write:
    while True:
        datachunk = upload.file.read(1024)
        if not datachunk:
            break
        file_to_write.write(datachunk)

Related to this, I've seen the MEMFILE_MAX property, which several SO posts claim sets the maximum file upload size. I've tried setting it, but it seems to have no effect: all files go through regardless of size.

Note that I want to be able to receive office documents, either plain with their usual extensions or zipped with a password.

I'm using Python 3.4 and Bottle 0.12.7.

lukik
  • lukik, were you able to solve this? – ron rothman Nov 16 '14 at 05:29
  • Hi. No, it was still loading the whole file into memory despite the suggestions below. I ended up setting up an SFTP box for clients to connect to, but I still need to get this working because I need control over the upload process. Any new suggestions? – lukik Nov 16 '14 at 06:09

1 Answer

Basically, you want to call upload.file.read(1024) in a loop. Something like this (untested):

with open(file_path, 'wb') as dest:
    chunk = upload.file.read(1024)
    while chunk:
        dest.write(chunk)
        chunk = upload.file.read(1024)

(Do not call open on upload.file; it's already open for you.)
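The same loop can also enforce a size cap while it copies, which touches on the second part of the question. Below is a minimal sketch using only the standard library; `copy_with_limit`, `UploadTooLarge`, and the 64 KiB default chunk size are hypothetical names and values chosen for illustration, not part of Bottle. It assumes `src` is any binary file-like object, such as the `upload.file` attribute discussed above.

```python
import os


class UploadTooLarge(Exception):
    """Raised when the source stream is larger than the allowed size."""


def copy_with_limit(src, dest_path, max_bytes, chunk_size=64 * 1024):
    """Copy src to dest_path chunk by chunk.

    At most one chunk is held in memory at a time. If more than
    max_bytes would be written, the partial file is removed and
    UploadTooLarge is raised. Returns the number of bytes written.
    """
    written = 0
    try:
        with open(dest_path, "wb") as dest:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                written += len(chunk)
                if written > max_bytes:
                    raise UploadTooLarge(
                        "upload exceeds %d bytes" % max_bytes)
                dest.write(chunk)
    except UploadTooLarge:
        # The with block has already closed the file, so it can be deleted.
        os.remove(dest_path)
        raise
    return written
```

In a handler this might be called as `copy_with_limit(upload.file, file_path, max_bytes=10 * 1024 * 1024)`, with the exception turned into an error response. Note that this only bounds what your own code writes; by the time the handler runs, the framework may already have received and buffered the request body.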

This SO answer includes more examples of how to read a large file without "slurping" it.
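On the maximum size: as far as I can tell from the Bottle documentation, MEMFILE_MAX is the size of the in-memory buffer for the request body rather than an upload limit, which would explain why setting it had no visible effect. One way to reject oversized requests early is to look at the declared Content-Length before touching the body. The sketch below works on a plain WSGI environ dict so that it runs without Bottle; `declared_length`, `exceeds_limit`, and `MAX_UPLOAD_BYTES` are hypothetical names, and the 10 MiB value is only an example.

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical limit, adjust as needed


def declared_length(environ):
    """Return the request body size declared in a WSGI environ, or -1.

    -1 means the header is missing, empty, or not a number.
    """
    try:
        return int(environ.get("CONTENT_LENGTH") or -1)
    except ValueError:
        return -1


def exceeds_limit(environ, max_bytes=MAX_UPLOAD_BYTES):
    """True if the declared body size is larger than max_bytes."""
    return declared_length(environ) > max_bytes
```

Inside a Bottle handler, I believe the equivalent check would compare `request.content_length` against the limit and call `abort(413, ...)` before accessing `request.files`. Keep in mind that Content-Length covers the whole multipart body, so it is slightly larger than the file itself, and that it can be absent (for example with chunked transfer encoding), so a cap in the copy loop is still worth having.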

ron rothman
  • I'm trying to run `f = open(upload.file)` but I get a `TypeError`: `invalid file: <_io.BufferedRandom name=8>`. The challenge I have is how to convert the upload `upload = request.files.get("upload_file")` into a file that can be read using `open('really_big_file')`. – lukik Nov 05 '14 at 12:41
  • You shouldn't be calling `open` on `upload.file`; it's already a file-like object that's ready for you to use. Just read from it. – ron rothman Nov 05 '14 at 16:07
  • @b4hand, I call `open` to create the *destination* file, not to read from `upload`. – ron rothman Nov 05 '14 at 16:50