Convert xlsx sent via html form to csv/tsv in python

Question

I have an HTML form using which user can upload a .xlsx file and calculate pay rates for shifts in .xlsx file(see below)

<form enctype="multipart/form-data" action="/calculate_payrate" method="post">
    <label class="label">PLEASE UPLOAD FILE HERE:</label>
    <input type="file" id="roaster_file"  name="roaster_data"></br>
    <input type="submit" id="calculate_input" name="submit" value="CALCULATE PAY" >
</form>

Here is the python code

@view_config(route_name='calculate_payrate', 
renderer='../templates/pay/pay.mako')
def calculate_payrate(request):
    roaster_data = request.POST['roaster_data']
    roaster_data = roaster_data.value
    roaster_data = roaster_data.decode('utf-8')
    ## code to do calculate pay

The problem is - when a user uploads CSV or TSV file, I can get the data in files easily and perform my calculations. But when xlsx file is uploaded and I get below data

 FieldStorage('roaster_data', 'abc.xlsx')

b'PK\x03\x04\x14\x00\x06\x00\x08\x00\x00\x00!\x00b\xee\x9dh^\x01\x00\x00\x90\x04\x00\x00\x13\x00\x08\x02[Content_Types].xml and so on...

I am not sure how can I get the data in xlsx file [assuming there is only one sheet in xlsx].

All the questions that I have seen so far about reading data from xlsx require file path and open that file using pandas or some other package. But however I cannot place the file on the Heroku server that I am using and I need to send my file data via the form.

Any pointers are appreciated.

Thanks.

score 0 · Answer 1 · answered Feb 12 '19 at 01:10

0

xlsx files are usually saved with the windows-1252 encoding. latin-1 works well too, based on my experience.

Further discussions on xlsx encoding:

You need to update your FieldStorage class to use a different encoding to parse the uploaded file if it's of the xlsx type.

answered Feb 12 '19 at 01:10

kerwei

1,822
1
13
22

If i use 'windows-1252' I get error `UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 634: character maps to ` and when I use latin-1 the data looks like this `|£&ºðsváµ¹üø6º&ºðsðpA°QK]ø9»ðZèÂÏ` – ghanisht nagpal Feb 12 '19 at 01:24
Please check out @pi's answer in the first link above. You can use the chardet library to detect the file encoding first to use the right one – kerwei Feb 12 '19 at 01:32

Convert xlsx sent via html form to csv/tsv in python

1 Answers1