So I'm trying to make a confidence graph using this as my base: https://www.pythoncharts.com/python/line-chart-with-confidence-interval/

In my code, the user chooses a folder of correctly formatted files, which are combined into one big DataFrame and passed to the function graph1. The problem is that every time I call graph1 again, memory usage spikes: with 100k+ data points across 100+ files, it rises by ~30 MB per call.
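
A minimal sketch of one way to put a number on the per-call growth, using the standard-library tracemalloc module. Here leaky_call is a hypothetical stand-in for graph1 (it is not the code from this question), and tracemalloc only sees allocations made through Python's own allocators, so memory held by C extensions may not show up in these figures.

```python
import tracemalloc

_cache = []  # hypothetical leak: something that stays alive across calls


def leaky_call():
    """Hypothetical stand-in for graph1; keeps roughly 1 MB alive per call."""
    _cache.append(bytearray(1_000_000))


def growth_per_call(func, calls=5):
    """Return the change in traced memory (in bytes) caused by each call to func."""
    tracemalloc.start()
    growth = []
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(calls):
        func()
        after, _peak = tracemalloc.get_traced_memory()
        growth.append(after - before)
        before = after
    tracemalloc.stop()
    return growth


deltas = growth_per_call(leaky_call)
print([round(d / 1e6, 2) for d in deltas])  # growth in MB for each call
```

If the numbers stay clearly above zero on every call, something created inside the function is still being kept alive afterwards.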

I have tried clearing the subplots and everything else related to the graphing, but to no avail. I also checked whether the problem was in the function that combines the data from all the files into one DataFrame, but everything there was freed properly, and calling it multiple times didn't cause any memory bloat.

I'm not very confident in my Python skills so hopefully someone can point out what the problem might be.

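import tkinter as tk

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg

def menu():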
    confGUI = tk.Tk()
    confGUI.geometry("800x800")
    global figConf, graphConf, canvasConf
    figConf, graphConf = plt.subplots()
    canvasConf = FigureCanvasTkAgg(figConf, confGUI)
    canvasConf.get_tk_widget().pack()
    graph1(dataThatWasMadeElsewhere)
    
    confGUI.mainloop()

def graph1(df_combined):
    # Normalize time from 0 to 1
    df_combined['time'] = (df_combined['time']-np.min(df_combined['time']))/(np.max(df_combined['time'])-np.min(df_combined['time']))
    
    df_grouped = df_combined.groupby('time')['y'].agg(['mean', 'std'])
    df_grouped.reset_index(inplace=True)
    median_std = df_grouped['std'].median()
    # Replace NaN std values with the median std (assign back; in-place fillna on a selected column is unreliable)
    df_grouped['std'] = df_grouped['std'].fillna(median_std)

    # Smoothing out the mean into a rolling mean. Otherwise it looks bad on graph
    window_size = 100
    df_grouped['rolling_mean'] = df_grouped['mean'].rolling(window_size, center=True).mean()
    df_grouped['lower'] = df_grouped['mean']-df_grouped['std']
    df_grouped['upper'] = df_grouped['mean']+df_grouped['std']

    graphConf.set_xlabel('Time')
    graphConf.set_ylabel('Y')

    graphConf.fill_between(df_grouped['time'], df_grouped['lower'], df_grouped['upper'], color='red', alpha=0.2)
    # separate plots for each file
    for filename in set(df_combined['filename']):
        file_data = df_combined[df_combined['filename'] == filename]
        graphConf.plot(file_data['time'], file_data['y'], color="gray", linewidth=0.5)  # column is 'y', as in the groupby above
    graphConf.plot(df_grouped['time'], df_grouped['rolling_mean'], color='r', alpha=1, linewidth=5)

    canvasConf.draw()

if __name__ == "__main__":
    menu()
Galorch

1 Answer

I finally found a Stack Overflow question that helped: "How can I release memory after creating matplotlib figures".

Importing gc and calling gc.collect() at the end of my graphing function fixed it. Nothing else suggested there or elsewhere seemed to work.
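
A minimal sketch of what that looks like, under a few assumptions: it uses the headless Agg canvas instead of the Tk one so it runs without a window, the names fig, ax, canvas and redraw are illustrative stand-ins for figConf, graphConf, canvasConf and graph1, and the data is random. The ax.clear() call is an extra assumption of this sketch (it drops the artists drawn by the previous call); the fix reported above is only the gc.collect() at the end.

```python
import gc

import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# Headless stand-ins for figConf / canvasConf / graphConf from the question.
fig = Figure()
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(111)


def redraw(ax, canvas, x, ys):
    """Redraw the chart from scratch, then force a garbage collection."""
    ax.clear()  # assumption: remove lines and bands left over from the last call
    ys = np.asarray(ys)
    mean, std = ys.mean(axis=0), ys.std(axis=0)
    ax.fill_between(x, mean - std, mean + std, color="red", alpha=0.2)
    for y in ys:  # one thin gray line per series
        ax.plot(x, y, color="gray", linewidth=0.5)
    ax.plot(x, mean, color="r", linewidth=5)
    ax.set_xlabel("Time")
    ax.set_ylabel("Y")
    canvas.draw()
    gc.collect()  # run a full collection now so unreachable reference cycles are freed


x = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(0)
for _ in range(3):  # simulate calling the graphing function repeatedly
    redraw(ax, canvas, x, rng.normal(size=(4, 50)))

n_lines = len(ax.lines)        # artists currently attached to the axes
n_bands = len(ax.collections)  # filled bands currently attached to the axes
```

After three calls the axes still hold only the artists from the latest call, which is a quick way to check that old plots are not piling up between redraws.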

Galorch