I am trying to upload a CSV file, process it to produce results, and write back (download) a new CSV file containing the result. I am very new to Flask and I cannot get a "proper" csv.reader object to iterate over and work with. Here is the code so far:

__author__ = 'shivendra'
from flask import Flask, make_response, request
import csv

app = Flask(__name__)

def transform(text_file_contents):
    return text_file_contents.replace("=", ",")


@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transform a file demo</h1>

                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/transform', methods=["POST"])
def transform_view():
    file = request.files['data_file']
    if not file:
        return "No file"

    file_contents = file.stream.read().decode("utf-8")
    csv_input = csv.reader(file_contents)
    print(file_contents)
    print(type(file_contents))
    print(csv_input)
    for row in csv_input:
        print(row)

    result = transform(file_contents)

    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    return response

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5001, debug=True)

The terminal output is:

127.0.0.1 - - [12/Oct/2015 02:51:53] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [12/Oct/2015 02:51:59] "POST /transform HTTP/1.1" 200 -
4,5,6
<class 'str'>
<_csv.reader object at 0x105149438>
['1']
['', '']
['2']
['', '']
['3']
[]
['4']
['', '']
['5']
['', '']
['6']

Whereas the file I read is

[image: screenshot of the CSV file]

What am I doing wrong to not get 2 lists representing 2 rows when I iterate the csv.reader object?

Brad Koch
Shivendra
  • Could you show what your csv file looks like in a text format and not in a spreadsheet program. Also what spreadsheet program are you using to generate the csv (if any). And one more thing, why is it necessary to replace `=` with `,`? Most csv dialects don't use equal signs. – iLoveTux Oct 11 '15 at 22:46
  • @iLoveTux Replacing `=` with `,` is completely unnecessary. Transform is a dummy function for now. But I couldn't reach till there as I am stuck at the csv reading part. – Shivendra Oct 11 '15 at 22:51
  • Added the text file image of the csv. – Shivendra Oct 11 '15 at 22:51
  • Formatting in the comments on this site is strange. Is that a newline between 3 and 4? From the results of iterating through your csv, it would appear that there is a carriage return `\r` or a newline `\n` after each number, but if that's not the case, I would need an actual copy of the csv file to play around with in order to figure it out – iLoveTux Oct 11 '15 at 22:58
  • Yes, it is strange. I uploaded the file at https://www.dropbox.com/s/xmgn8n3o9leor12/blog1.csv?dl=0 – Shivendra Oct 11 '15 at 23:02
  • Don't use screenshots of files, paste the text instead. – nathancahill Oct 12 '15 at 15:19
  • @nathancahill I did that, but the comment section on SO seems to remove most of the formatting. – Shivendra Oct 12 '15 at 15:25

2 Answers

OK, so there is one major problem with your script: csv.reader, as its documentation notes, expects a file object or at least an object which supports the iterator protocol. You are passing a str, which does implement the iterator protocol, but instead of iterating through the lines, it iterates through the characters. This is why you get the output you do.

First, it gives a single character 1, which the csv.reader sees as a line with one field. After that, the str gives another single character, a comma, which the csv.reader sees as a line with two empty fields (since the comma is the field separator). It goes on like that throughout the str until it's exhausted.
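
To make the difference concrete, here is a minimal sketch comparing what csv.reader produces from a plain str with what it produces from a sequence of lines. The sample text is an assumption modelled on the question's output, and splitlines() is used only for illustration (it would mis-split quoted fields that contain newlines):

```python
import csv

# Assumed sample data: two rows, like the file in the question.
text = "1,2,3\n4,5,6"

# A str iterates character by character, so every character is parsed as a "line".
rows_from_str = list(csv.reader(text))

# A list of lines iterates line by line, which is what csv.reader expects.
rows_from_lines = list(csv.reader(text.splitlines()))

print(rows_from_str[:6])  # [['1'], ['', ''], ['2'], ['', ''], ['3'], []]
print(rows_from_lines)    # [['1', '2', '3'], ['4', '5', '6']]
```

The first result has the same shape as the terminal output in the question: one "row" per character, with the line break showing up as an empty list.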

The solution (or at least one solution) is to turn the str into a file-like object. I tried using the stream provided by flask.request.files["name"], but that doesn't iterate through the lines. Next, I tried a cStringIO.StringIO, which seemed to have a similar issue. I ended up at a related question, which suggested an io.StringIO object in universal newlines mode (newline=None), and that worked. I ended up with the following working code (perhaps it could be better):

__author__ = 'shivendra'
from flask import Flask, make_response, request
import io
import csv

app = Flask(__name__)

def transform(text_file_contents):
    return text_file_contents.replace("=", ",")


@app.route('/')
def form():
    return """
        <html>
            <body>
                <h1>Transform a file demo</h1>

                <form action="/transform" method="post" enctype="multipart/form-data">
                    <input type="file" name="data_file" />
                    <input type="submit" />
                </form>
            </body>
        </html>
    """

@app.route('/transform', methods=["POST"])
def transform_view():
    f = request.files['data_file']
    if not f:
        return "No file"

    stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)
    csv_input = csv.reader(stream)
    #print("file contents: ", file_contents)
    #print(type(file_contents))
    print(csv_input)
    for row in csv_input:
        print(row)

    stream.seek(0)
    result = transform(stream.read())

    response = make_response(result)
    response.headers["Content-Disposition"] = "attachment; filename=result.csv"
    return response

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5001, debug=True)
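
If the goal is to work on the parsed rows rather than on the raw text, the same idea extends to writing the result back out. The following is only a sketch under stated assumptions: parse_csv, transform_rows and rows_to_csv are hypothetical helper names, the doubling step is a placeholder for real processing, and the sample input assumes the uploaded file uses bare `\r` line endings.

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows (hypothetical helper)."""
    # newline=None enables universal newlines, so \r, \n and \r\n all end a line.
    stream = io.StringIO(text, newline=None)
    return list(csv.reader(stream))


def transform_rows(rows):
    """Placeholder processing step: double every cell, assuming integer cells."""
    return [[str(int(cell) * 2) for cell in row] for row in rows]


def rows_to_csv(rows):
    """Serialize rows back to CSV text (hypothetical helper)."""
    out = io.StringIO(newline="")
    csv.writer(out).writerows(rows)
    return out.getvalue()


# Assumed sample input with bare carriage returns between rows.
result = rows_to_csv(transform_rows(parse_csv("1,2,3\r4,5,6")))
print(repr(result))  # '2,4,6\r\n8,10,12\r\n'
```

In the view above, the string returned by rows_to_csv could be passed to make_response in place of result.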

iLoveTux
  • Thanks, I can't tell you how much I appreciate your effort. It worked. Not only did you solve my problem, you helped me understand each line of the code properly. You are awesome. – Shivendra Oct 12 '15 at 06:24
  • Yup, thank you for your detailed response. Not easy to find a proper response on the web. You did it ;) – vinyll Jan 30 '17 at 21:32

Important note: This answer is relevant only for platforms where SpooledTemporaryFile is available.

Further to iLoveTux's answer, you can avoid the redundant read() call by replacing the following string-based stream creation:

stream = io.StringIO(f.stream.read().decode("UTF8"), newline=None)

with:

stream = io.TextIOWrapper(f.stream._file, "UTF8", newline=None)

Example:

stream = io.TextIOWrapper(f.stream._file, "UTF8", newline=None)
csv_input = csv.reader(stream)
print(csv_input)
for row in csv_input:
    print(row)

Further information:

Werkzeug's default stream for the form data parser is a SpooledTemporaryFile (as of 1.0.1), from which you can obtain the underlying buffer through its _file member.
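
As a rough illustration of the wrapping step that does not rely on Werkzeug internals, the sketch below wraps an in-memory binary stream in io.TextIOWrapper and hands it to csv.reader. The io.BytesIO object is an assumed stand-in for the upload's underlying binary buffer, and the sample bytes are made up:

```python
import csv
import io

# Assumed stand-in for the binary buffer behind the uploaded file.
binary_stream = io.BytesIO(b"1,2,3\r4,5,6")

# Decode lazily as text; newline=None turns \r, \n and \r\n into line breaks.
text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8", newline=None)

rows = list(csv.reader(text_stream))
print(rows)  # [['1', '2', '3'], ['4', '5', '6']]
```

Because the wrapper decodes while csv.reader iterates, the whole upload does not have to be read into a single str first.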

Eido95