My Flask app pulls user data from an external API, does a small amount of processing, and generates a dataframe. Although the processing isn't heavy, there is a noticeable delay when loading the data onto a page. The same data is used by several pages in the app to display different aspects of it. To speed things up, I thought about saving the data temporarily (10 minutes or so) and reusing it, rather than grabbing the data on each page load.
A simplified function to gather the data is:
def get_dataset_from_wahoo(per_page=300, page=1):
    connection = Connection.query.filter_by(user_id=current_user.id, provider='wahoo')
    if connection.count() == 0: #does not exist - return to index and point to auth page
        return render_template('home/index_noauth.html', title='Authentication Error')
    #return a valid token object from the database
    token = connection.first().token
    expires = token['created_at'] + token['expires_in']
    now = int(datetime.now().strftime("%s"))
    if now > expires: #token has expired, get a new one
        token = refresh_token(token)
        connection.one().token = token
        db.session.commit()
    #request the data using the valid token
    header = {'Authorization': 'Bearer ' + token['access_token']}
    param = {'per_page': per_page, 'page': page}
    dataset = requests.get(api_url + '/workouts', headers=header, params=param).json()
    #create a dataframe from the returned dataset
    df = pd.json_normalize(dataset['workouts'])
    for col in df:
        df[col] = pd.to_numeric(df[col], errors='ignore')
    for col in df.select_dtypes(include=['object']):
        df[col] = pd.to_datetime(df[col], errors='ignore')
    return df
I can cache this result using Flask-Caching's memoize, and it does seem to speed things up. What I'm unclear on is how the filesystem cache gets cleared - do I just have to run a clean-up task on the server? Or would a better approach be to store the dataframe in the Flask session (the dataframe is at most around 200 kB in my initial testing)?
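For reference, this is roughly how I'm wiring up the cache at the moment (the cache directory path and the 10-minute timeout are just values I've been trying, not anything settled):

from flask_caching import Cache

#filesystem backend - cached results end up as files under CACHE_DIR
cache = Cache(config={
    'CACHE_TYPE': 'FileSystemCache',
    'CACHE_DIR': '/tmp/flask_cache',   #placeholder path
    'CACHE_DEFAULT_TIMEOUT': 600       #10 minutes
})
cache.init_app(app)

@cache.memoize(timeout=600)
def get_dataset_from_wahoo(per_page=300, page=1):
    ...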
Any advice is appreciated as I am quite new to Flask.
Martyn