I query several signals from InfluxDB into pandas dataframes every 5 minutes. If a signal meets a specific condition, I add its dataframe to a list. Then I want to take the dataframes pairwise and perform a calculation on each pair.

df_list = data_pool() # data_pool() returns list of dataframes
for i in range(len(df_list)-1):
    for j in range(i+1, len(df_list)):
        calc(df_list[i], df_list[j])

Inside calc(), I manipulate the timestamp data. This logic works for df_list[0] and df_list[1], but on the next iteration df_list[0] has already been manipulated, so calc() can no longer be performed between df_list[0] and the other dataframes.

How should I tackle this issue? Is adding pandas dataframes to a list generally a good idea?

I would appreciate any ideas or help.

Nili

1 Answer

It took me a long time to find where the issue came from. The problem is that the pandas dataframe was modified in place inside calc(), and because the list only holds a reference to the dataframe, the original in df_list changed as well. To fix this, I made a copy of each dataframe inside calc(). I suggest reading "Pandas: Knowing when an operation affects the original dataframe".
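For anyone hitting the same thing, here is a minimal sketch of the copy-inside-calc() approach. The column names ("time", "value"), the make_signal() helper, and the timestamp shift and comparison inside calc() are all made up for illustration, since my real calc() does something different. The only part that matters is the .copy() calls at the top of the function.

```python
from itertools import combinations

import pandas as pd


def make_signal(start, values):
    # Hypothetical stand-in for one dataframe returned by data_pool().
    times = pd.date_range(start, periods=len(values), freq="min")
    return pd.DataFrame({"time": times, "value": values})


def calc(df_a, df_b):
    # Work on private copies so the dataframes stored in the list
    # are never modified by the steps below.
    a = df_a.copy()
    b = df_b.copy()
    # Illustrative timestamp manipulation: make both series start at zero.
    a["time"] = a["time"] - a["time"].iloc[0]
    b["time"] = b["time"] - b["time"].iloc[0]
    # Illustrative calculation: mean absolute difference of aligned values.
    merged = a.merge(b, on="time", suffixes=("_a", "_b"))
    return float((merged["value_a"] - merged["value_b"]).abs().mean())


df_list = [
    make_signal("2021-01-01 00:00", [1.0, 2.0, 3.0]),
    make_signal("2021-01-01 00:05", [1.5, 2.5, 3.5]),
    make_signal("2021-01-01 00:10", [0.0, 2.0, 4.0]),
]
originals = [df.copy() for df in df_list]  # snapshots for comparison

# Same pairwise loop as in the question, written with combinations().
results = {
    (i, j): calc(df_list[i], df_list[j])
    for i, j in combinations(range(len(df_list)), 2)
}

# Every dataframe in the list is still identical to its snapshot.
unchanged = all(df.equals(orig) for df, orig in zip(df_list, originals))
print(results, unchanged)
```

Without the two .copy() lines, the assignments to a["time"] and b["time"] would write into the dataframes held in df_list, which is exactly the behavior described in the question. Keeping dataframes in a list is fine in itself; the list just does not protect them from in-place changes.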

Nili