
I have multiple CSV files, each with two columns of values, like this:

[image: sample of the two-column CSV data]

I use the following Python code to calculate the R² value and plot the data from each file.

import numpy as np
import pandas as pd
import glob
import matplotlib.pyplot as plt

for filepath in glob.iglob(r'*.csv'):
    print(filepath)
    df = pd.read_csv(filepath)
    x_values = df["LMP"]
    y_values = df["LMP_old"]

    correlation_matrix = np.corrcoef(x_values, y_values)
    correlation_xy = correlation_matrix[0,1]
    r_squared = correlation_xy**2
    plt.scatter(x_values,y_values)
    plt.xlabel('Predicted LMP')
    plt.ylabel("Actual LMP")
    plt.title(r_squared)
    plt.xlim(20000, 26000)
    plt.ylim(20000, 26000)
    x = np.linspace(20000, 26000)
    plt.plot(x, x, linestyle='solid')
    plt.grid(True)
    plt.savefig(filepath+".png")
    print(r_squared)
    with open(filepath+".txt", "w") as text_file:
        print(f"{r_squared}", file=text_file)

However, I found that x_values and y_values do not seem to be reset after each iteration of the loop: the values from the previous iterations appear to be remembered and keep accumulating. What do I need to add so that x_values and y_values are independent (reset) on each iteration?

Thank you very much.

Konrad Rudolph
Lenoir
  • What makes you think that `x_values` and `y_values` are not being reset? Can you create a [mcve] that demonstrates the problem? – 0x5453 Oct 01 '20 at 16:55
  • @0x5453 I found that the data from all previous CSV files is plotted together with the data from the current one. – Lenoir Oct 01 '20 at 18:10
  • Does this answer your question? [How do I tell matplotlib that I am done with a plot?](https://stackoverflow.com/q/741877/3282436) – 0x5453 Oct 01 '20 at 18:17
  • @0x5453 Thanks, I found the solution there: just add `plt.close()` after `plt.savefig(filepath+".png")`. – Lenoir Oct 01 '20 at 18:44

1 Answer

import numpy as np
import pandas as pd
import glob
import matplotlib.pyplot as plt

for filepath in glob.iglob(r'*.csv'):
    print(filepath)
    df = pd.read_csv(filepath)
    x_values = df["LMP"]
    y_values = df["LMP_old"]

    correlation_matrix = np.corrcoef(x_values, y_values)
    correlation_xy = correlation_matrix[0,1]
    r_squared = correlation_xy**2
    plt.scatter(x_values,y_values)
    plt.xlabel('Predicted LMP')
    plt.ylabel("Actual LMP")
    plt.title(r_squared)
    plt.xlim(20000, 26000)
    plt.ylim(20000, 26000)
    x = np.linspace(20000, 26000)
    plt.plot(x, x, linestyle='solid')
    plt.grid(True)
    plt.savefig(filepath+".png")
    
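    # pyplot keeps drawing on the same current figure until it is closed, so
    # without this call each saved image also shows the points from earlier files.
    # x_values and y_values themselves are reassigned on every iteration.
    plt.close()  # Adding this line solves the issue.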

    print(r_squared)
    with open(filepath+".txt", "w") as text_file:
        print(f"{r_squared}", file=text_file)
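
As an alternative to relying on pyplot's implicit "current figure", the same loop can be sketched with an explicit figure per file, which makes it harder for points from one file to leak into the next plot. This is only a sketch under a few assumptions: the helper names `plot_file` and `process_all` are made up for illustration, the `Agg` backend is chosen on the assumption that the figures are only saved to disk, and the title is rounded to four decimals, which the original code does not do.

```python
import glob

import matplotlib
matplotlib.use("Agg")  # assumption: figures are only written to disk, never shown
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_file(filepath, lo=20000, hi=26000):
    """Plot one CSV file on its own figure and return the squared correlation.

    Hypothetical helper; the column names "LMP" and "LMP_old" and the axis
    limits are taken from the code in the question.
    """
    df = pd.read_csv(filepath)
    x_values = df["LMP"]
    y_values = df["LMP_old"]
    r_squared = np.corrcoef(x_values, y_values)[0, 1] ** 2

    fig, ax = plt.subplots()  # a fresh figure and axes for this file only
    ax.scatter(x_values, y_values)
    ax.plot([lo, hi], [lo, hi], linestyle="solid")  # y = x reference line
    ax.set_xlabel("Predicted LMP")
    ax.set_ylabel("Actual LMP")
    ax.set_title(f"R^2 = {r_squared:.4f}")
    ax.set_xlim(lo, hi)
    ax.set_ylim(lo, hi)
    ax.grid(True)
    fig.savefig(filepath + ".png")
    plt.close(fig)  # release the figure so nothing carries over to the next file
    return r_squared


def process_all(pattern="*.csv"):
    """Hypothetical driver: process every file matching the glob pattern."""
    for filepath in glob.iglob(pattern):
        r_squared = plot_file(filepath)
        print(filepath, r_squared)
        with open(filepath + ".txt", "w") as text_file:
            print(f"{r_squared}", file=text_file)


if __name__ == "__main__":
    process_all()
```

Because every call creates its own `fig` and closes it, the plots stay independent even if the order of the calls inside the function changes later.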
Lenoir