I am working on a web app where a user uploads a .csv file, which is rendered as HTML on the next page. That same .csv file (or the pandas DataFrame it was imported into) then needs to be available on the following page, so I need to pass this object between @app.route views. My understanding is that session is the right way to do this in Flask. However, session requires that the object be serializable.

That would be fine, except that when converting the JSON back to pandas, underscores are removed from values such as 1_2. Apparently this is because the value is being read as a number, with the _ playing the role of a comma (a digit separator), and the developers have indicated that they do not plan to provide a fix for this.

I made a simple app that demonstrates the issue:

controller.py

#!/usr/bin/env python3
import pandas as pd
from flask import Flask, render_template, session
import os


app = Flask(__name__)


app.secret_key = os.urandom(28)


@app.route('/first_page', methods=['GET', 'POST'])
def first_page():
    d = {'products': ['pencils', 'pens', 'erasers'], 'id_code': ['1_2', '10_7', '12_11']}
    df = pd.DataFrame(d)
    print(df)
    session["data"] = df.to_json()
    return render_template('/private/test_page1.html')

@app.route('/second_page', methods=['GET', 'POST'])
def second_page():
    dat = session.get('data')
    dat = pd.read_json(dat)
    print(dat)
    return render_template('/private/test_page2.html')




if __name__ == '__main__':
    app.run(port=5001,debug=True)

Console output, which shows that the underscores have been removed by the time the data reaches the second page:

  products id_code
0  pencils     1_2
1     pens    10_7
2  erasers   12_11
127.0.0.1 - - [01/Apr/2019 22:49:27] "GET /first_page HTTP/1.1" 200 -
  products  id_code
0  pencils       12
1     pens      107
2  erasers     1211
127.0.0.1 - - [01/Apr/2019 22:49:28] "POST /second_page HTTP/1.1" 200 -

So, is there a better solution than simply "remove all underscores from my data before importing"? I could do that, but it would be a pain since other code I have already written expects the data with underscores.

Edit: What if I also have null values in there? Can I just avoid using JSON entirely?

Stonecraft
  • You can also read the JSON into a dict and pass it to `pd.DataFrame`, the same way you built the frame the first time – njzk2 Apr 02 '19 at 03:46
  • Ok yeah that works too. Ideally though, I would like to altogether avoid json and keep it as a pandas dataframe. I am now having some other problem with the json string, which unfortunately was not solved by either of our solutions. I think this time it has to do with null values, so I'm really just not liking json right now. – Stonecraft Apr 02 '19 at 04:15
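
The suggestion in the first comment can be sketched outside Flask as follows. This is only a minimal sketch, assuming the default `orient` of `to_json()`; the variable names `payload` and `restored` are illustrative. Because the standard `json` module does no numeric inference on strings, the values survive unchanged, although the row labels come back as the strings '0', '1', '2' rather than integers.

```python
import json

import pandas as pd

df = pd.DataFrame({
    'products': ['pencils', 'pens', 'erasers'],
    'id_code': ['1_2', '10_7', '12_11'],
})

# This string is what would be stored in session["data"].
payload = df.to_json()

# Parse with the standard json module, then build the frame from the dict.
# No dtype inference is applied to string values on this path.
restored = pd.DataFrame(json.loads(payload))
print(restored)
```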

1 Answer

With respect to the underscores, the trick is to specify dtype=False when converting back to a DataFrame. This turns off dtype inference, which prevents pandas from incorrectly treating the column in question as numeric. The following worked as intended:

#!/usr/bin/env python3

import pandas as pd
from flask import Flask, render_template, session
import os

app = Flask(__name__)


app.secret_key = os.urandom(28)


@app.route('/first_page', methods=['GET', 'POST'])
def first_page():
    d = {'products': ['pencils', 'pens', 'erasers'], 'id_code': ['1_2', '10_7', '12_11']}
    df = pd.DataFrame(d)
    print(df)
    session["data"] = df.to_json()
    return render_template('/private/test_page1.html')

@app.route('/second_page', methods=['GET', 'POST'])
def second_page():
    dat = session.get('data')
    # dtype=False turns off dtype inference, so id_code stays a string column
    dat = pd.read_json(dat, dtype=False)
    print(dat)
    return render_template('/private/test_page2.html')




if __name__ == '__main__':
    app.run(port=5001,debug=True)

console output:

127.0.0.1 - - [01/Apr/2019 23:38:56] "GET / HTTP/1.1" 404 -
  products id_code
0  pencils     1_2
1     pens    10_7
2  erasers   12_11
127.0.0.1 - - [01/Apr/2019 23:39:03] "GET /first_page HTTP/1.1" 200 -
  products id_code
0  pencils     1_2
1     pens    10_7
2  erasers   12_11
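
For reference, the same round trip can be checked without Flask. The sketch below is hedged: the `qty` column is hypothetical and added only to show a column that should stay numeric, and the `StringIO` wrapper is there because recent pandas releases deprecate passing a literal JSON string to `read_json`. It also shows the documented dict form of `dtype`, which pins only the named column.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({
    'products': ['pencils', 'pens', 'erasers'],
    'id_code': ['1_2', '10_7', '12_11'],
    'qty': [3, 5, 7],  # hypothetical numeric column, not in the original data
})
payload = df.to_json()

# Option 1: disable dtype inference for every column.
no_inference = pd.read_json(StringIO(payload), dtype=False)

# Option 2: pin only the problem column to str; other columns are inferred.
pinned = pd.read_json(StringIO(payload), dtype={'id_code': str})

print(no_inference)
print(pinned.dtypes)
```

One caveat with the dict form: casting a column to `str` also turns missing values into text, so it is less suitable when the column may contain nulls.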

However, I still can't deal with null values.
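
One possible way to avoid JSON entirely, offered only as a sketch and not as something tested in this app, is to keep the DataFrame on the server and put just a short token in session. The helper names `save_frame` and `load_frame`, the `STORE_DIR` location, and the `data_token` session key are all hypothetical. Pickle preserves strings and missing values as they are, but it should only ever be used to read files the app itself wrote.

```python
import os
import tempfile
import uuid

import pandas as pd

# Hypothetical storage directory; a real app would pick and clean up its own.
STORE_DIR = tempfile.mkdtemp(prefix="frames_")


def save_frame(df):
    """Pickle df to disk and return a token that is small enough for session."""
    token = uuid.uuid4().hex
    df.to_pickle(os.path.join(STORE_DIR, token + ".pkl"))
    return token


def load_frame(token):
    """Load the DataFrame that save_frame stored under this token."""
    return pd.read_pickle(os.path.join(STORE_DIR, token + ".pkl"))


df = pd.DataFrame({
    'products': ['pencils', 'pens', None],
    'id_code': ['1_2', None, '12_11'],
})

token = save_frame(df)        # in a route: session["data_token"] = token
restored = load_frame(token)  # next route: load_frame(session["data_token"])
print(restored)
```

The trade-off is that the files live outside the session, so the app has to delete them itself when they are no longer needed.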

Stonecraft