I know there are some similar questions, but none of them has helped me resolve my issue. The ultimate goal is to have a Flask app that takes an Excel file and stores it in Azure Blob Storage, where it is then used by my Python function app for further transformations. What I struggle with is how to encode/decode this file, as I have to use the block_blob_service.create_blob_from_bytes() function.

The only thing that came to mind was to use the pandas library to read the Excel file and then call tobytes(). This way I am able to upload my Excel data to the blob as a CSV file. However, I cannot convert it back to its original form.

This is what it looks like after opening:

9]�0��j�9p/�j���`��/wj1=p/�j��p�^�.wj2=p/�[...]

Trying to decode it with UTF-8 gives me endless errors saying that the 'utf-8' codec can't decode byte[...]. I have tried many different encodings, but it always fails with this message at some byte. The Excel file contains numbers, strings, and dates.
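
As a rough illustration of why the decoding errors appear, the sketch below uses only the standard library. The struct.pack call is an assumed stand-in for what tobytes() does (dumping raw in-memory values rather than text); it is not the pandas code from this question. The CSV part shows the contrast: data serialized as text first and encoded second decodes cleanly.

```python
import csv
import io
import struct

# Stand-in for tobytes(): three floats dumped as raw 8-byte binary values
raw = struct.pack('<3d', 1.5, 2.5, 3.5)

try:
    raw.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    # Raw memory is not text, so no text codec is guaranteed to decode it
    decoded_ok = False

# By contrast, CSV is text first, then encoded to bytes
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['id', 'name'])
writer.writerow([1, 'abc'])
csv_bytes = buffer.getvalue().encode('utf-8')

# Decoding with the same codec recovers the original text
round_tripped = csv_bytes.decode('utf-8')
```

If the goal is a real CSV blob, the bytes to upload would need to come from a text serialization of the data, not from a raw memory dump.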

So, to the code:

#getting the file
file = request.files['file']

#reading into pandas df
data = pd.read_excel(file)

df_to_records = data.to_records(index=False)

records_to_bytes = df_to_records.tobytes()

block_blob_service = BlockBlobService(account_name='xxx', account_key="xxx")

block_blob_service.create_blob_from_bytes("test","mydata.csv",records_to_bytes)

Thank you for any advice!

Grevioos
  • https://stackoverflow.com/questions/1035340/reading-binary-file-and-looping-over-each-byte – Joe May 22 '19 at 19:13

1 Answer

Update: here is the full code that works on my side.

from flask import Flask, request
from azure.storage.blob import BlockBlobService

app = Flask(__name__, instance_relative_config=True)

account = "your_account name"   # Azure account name
key = "account key"      # Azure Storage account access key  
container = "f22" # Container name

blob_service = BlockBlobService(account_name=account, account_key=key)

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        file = request.files['file']
        file.seek(0)
        filename = "test1.csv" # just use a hardcoded filename for test
        blob_service.create_blob_from_stream(container, filename, file)
        ref =  'http://'+ account + '.blob.core.windows.net/' + container + '/' + filename
        return '''
        <!doctype html>
        <title>File Link</title>
        <h1>Uploaded File Link</h1>
        <p>''' + ref + '''</p>
        <img src="'''+ ref +'''">
        '''

    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form action="" method=post enctype=multipart/form-data>
      <p><input type=file name=file>
         <input type=submit value=Upload>
    </form>
    '''

if __name__ == '__main__':
    app.run(debug=True)

After running the app, select a .csv file and click the Upload button, then check the uploaded .csv file in the Azure portal.


I think you can try the create_blob_from_stream method instead of create_blob_from_bytes.

Here is the sample code:

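def upload_file():
    if request.method == 'POST':
        file = request.files['file']
        file.seek(0)  # rewind the stream before uploading
        try:
            blob_service = BlockBlobService(account_name='xxx', account_key="xxx")
            # container and filename are defined as in the full code above
            blob_service.create_blob_from_stream(container, filename, file)
        except Exception as e:
            # Print the actual error message rather than the exception class
            print('Exception=' + str(e))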
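
The point of uploading the stream directly is that the file's bytes reach the blob unchanged, so whatever reads the blob later gets exactly the file that was uploaded. Below is a minimal sketch of that idea. FakeBlobStore, put_stream and get_bytes are hypothetical in-memory stand-ins invented for illustration so the example runs without a storage account; they are not part of the Azure SDK.

```python
import io


class FakeBlobStore:
    """Hypothetical in-memory stand-in for a blob container (not an Azure API)."""

    def __init__(self):
        self.blobs = {}

    def put_stream(self, name, stream):
        # Store whatever the stream yields, byte for byte, with no parsing
        self.blobs[name] = stream.read()

    def get_bytes(self, name):
        return self.blobs[name]


# Simulated upload: arbitrary binary content, similar in spirit to an .xlsx file
original = bytes([0x50, 0x4B, 0x03, 0x04, 0xF8, 0x00, 0xFF])
uploaded = io.BytesIO(original)
uploaded.seek(0)  # rewind, as in the sample code above

store = FakeBlobStore()
store.put_stream("mydata.xlsx", uploaded)

# The consumer gets back exactly what was uploaded
downloaded = store.get_bytes("mydata.xlsx")
unchanged = downloaded == original
```

On the consuming side, the downloaded bytes could presumably be wrapped in io.BytesIO and passed to pd.read_excel, since the file is still a genuine Excel file. Keeping an Excel extension on the blob name, rather than .csv, would also make the content type clearer.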
Ivan Glasenberg