I have a function that returns a new homogeneous DataFrame every time the code runs; now I want to store this DataFrame

Question

I have a function:

def Vega_dict():
    Sum_CE_Vega = Final_df.loc[Final_df['instrumentType'] == 'CE', 'vega'].sum()
    Sum_PE_Vega = Final_df.loc[Final_df['instrumentType'] == 'PE', 'vega'].sum()
    Vega_dict = []
    Vega_dict.append(Time)
    Vega_dict.append(Sum_CE_Vega)
    Vega_dict.append(Sum_PE_Vega)
    Vega_dict = np.transpose(Vega_dict)
    Vega_df = pd.DataFrame(Vega_dict)
    Vega_df = Vega_df.T
    Vega_df.columns = ["Time", "Sum of OTM CE", "Sum of OTM PE"]
    Vega_df = pd.concat([df_vg, Vega_df], ignore_index = True)
    return Vega_df

It returns a DataFrame with the following output:

       Time       Sum of OTM CE      Sum of OTM PE
0  20:14:32  176.90175243829978  166.3830493392582

This output changes every 20 seconds.

Now I need to store this DataFrame so that I get time-series data like this:

      Time       Sum of OTM CE      Sum of OTM PE
0  20:14:32  176.90175243829978  166.3830493392582
1  20:14:42  176.93993499404044  166.3783648784774
and so on...

I wrote the code below to store it:

df_vg = pd.DataFrame()
df_vg = Vega_dict() 
df_vg = pd.concat([Vega_dict(), df_vg], ignore_index = True)
print(df_vg)

But it gives me the same output three times:

       Time       Sum of OTM CE      Sum of OTM PE
0  20:30:49  176.90175243829978  166.3830493392582
1  20:30:49  176.90175243829978  166.3830493392582
2  20:30:49  176.90175243829978  166.3830493392582

The output is also not being stored between runs. Please help.
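
A possible reading of the three identical rows, judging only from the snippets above: `Vega_dict()` already concatenates the global `df_vg` internally, the driver code calls it twice, and then concatenates the result with `df_vg` once more, so the same row is stacked three times. `Time` also appears to be computed once outside the function, which would explain the identical timestamps. The sketch below shows one way to separate "build one row" from "accumulate rows". The names `vega_row`, `final_df`, `time_label`, and `history` are hypothetical, and the option data is made up for illustration.

```python
import pandas as pd

COLUMNS = ["Time", "Sum of OTM CE", "Sum of OTM PE"]

def vega_row(final_df: pd.DataFrame, time_label: str) -> pd.DataFrame:
    """Build a single-row snapshot without touching any global history."""
    sum_ce = final_df.loc[final_df["instrumentType"] == "CE", "vega"].sum()
    sum_pe = final_df.loc[final_df["instrumentType"] == "PE", "vega"].sum()
    return pd.DataFrame([[time_label, sum_ce, sum_pe]], columns=COLUMNS)

# Made-up data standing in for Final_df (assumption, not the real feed).
final_df = pd.DataFrame({
    "instrumentType": ["CE", "CE", "PE", "PE"],
    "vega": [1.0, 2.0, 3.0, 4.0],
})

# Collect one row per snapshot, then concatenate exactly once.
rows = []
for time_label in ["20:14:32", "20:14:52"]:
    rows.append(vega_row(final_df, time_label))

history = pd.concat(rows, ignore_index=True)
print(history)
```

In a real loop, `time_label` would be taken fresh on each pass (for example from `datetime.now()`), and the concatenation would happen in one place only, outside the function.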

Welcome to Stack Overflow. I think your problem could be solved by `df.drop_duplicates(keep=False, inplace=True)`. Please check this [answer](https://stackoverflow.com/a/50560138/10452700) to a similar question/problem, and avoid posting duplicate questions. — Mario, Jul 29 '23 at 15:17
Hey @Mario, the actual problem is that the values are not stored each time I run the code. Dropping duplicates also helps, thanks for that, but the main problem is storing the data. — denise, Aug 07 '23 at 10:30
Pandas has `df.to_pickle("./dummy.pkl")` to save a DataFrame and `unpickled_df = pd.read_pickle("./dummy.pkl")` to read a saved one back. Please check this post: [The Best Format to Save Pandas Data](https://towardsdatascience.com/the-best-format-to-save-pandas-data-414dca023e0d) to see which format suits the size of your data and your memory constraints. — Mario, Aug 07 '23 at 12:11
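
The pickle round trip mentioned in the last comment could be combined with the accumulation step roughly as follows. This is a minimal sketch, not a tested solution for the original setup: `append_and_save` and the file name `vega_history.pkl` are hypothetical, the second row's values are made up, and a temporary directory is used only so the example is self-contained.

```python
import os
import tempfile

import pandas as pd

def append_and_save(new_row: pd.DataFrame, path: str) -> pd.DataFrame:
    """Load any saved history, append new_row, save it back, and return it."""
    if os.path.exists(path):
        # A previous run left a file behind: extend it.
        history = pd.concat([pd.read_pickle(path), new_row], ignore_index=True)
    else:
        # First run: the new row is the whole history.
        history = new_row.reset_index(drop=True)
    history.to_pickle(path)
    return history

columns = ["Time", "Sum of OTM CE", "Sum of OTM PE"]
path = os.path.join(tempfile.mkdtemp(), "vega_history.pkl")  # hypothetical file

first = pd.DataFrame([["20:14:32", 176.90, 166.38]], columns=columns)
second = pd.DataFrame([["20:14:52", 176.94, 166.37]], columns=columns)  # made-up values

append_and_save(first, path)
history = append_and_save(second, path)
print(history)
```

Because the history lives in a file rather than in a variable, it survives restarts of the script; pointing `path` at a fixed location instead of a temporary directory would keep it between runs.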

