I have run into a slight head-scratcher involving lists of pandas objects and loops over them. In some code I was working on, several pandas dataframes were placed into a list so that operations could be performed on all of them.
However, I noticed that certain operations, such as creating new columns, work in "naive" Python for loops, whereas other operations, such as reversing the row order of each dataframe,
- require explicit indexing, and
- do not affect the original dataframes (only their copies residing within the list).
I am looking for help getting the second part of my MWE below to work as easily as the first part, and for insight into the underlying logic that causes this discrepancy in the first place.
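My current guess is that the difference comes down to mutating an object versus rebinding a name. Here is a minimal sketch of that idea, using plain Python lists as stand-ins for the dataframes (the names `original`, `items` and `item` are made up for illustration):

```python
# Plain-Python analogue: a list stands in for a dataframe.
original = [1, 2, 3]
items = [original]

# Mutation: `item`, items[0] and `original` all refer to the same object,
# so an in-place change is visible through every one of those names.
for item in items:
    item.append(4)

# Rebinding: the assignment points the loop variable at a new object;
# neither `original` nor items[0] is changed.
for item in items:
    item = item[::-1]

# Index assignment: the list slot now refers to a new object,
# but `original` still refers to the old one.
for i in range(len(items)):
    items[i] = items[i][::-1]

print(original)   # [1, 2, 3, 4]
print(items[0])   # [4, 3, 2, 1]
```

If that reading is right, adding a column would be the "mutation" case, while `dataframe = dataframe.iloc[::-1]` would be the "rebinding" case.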
## Creating data
import pandas as pd
from io import StringIO
data = StringIO(
"""
date;time;random
2019-06-12;19:59:59+00:00;99
2019-06-12;19:59:54+00:00;200
2019-06-12;19:59:52+00:00;65
2019-06-12;19:59:34+00:00;140
"""
)
df = pd.read_csv(data, sep=";")
print(df)
## Creating list; there is only one dataframe in this list to make the
## code easier to work with, but in actuality I am working with >20 dataframes
df_list = [df]
## First operation - successfully adds new column to both original df and df_list[0]
for dataframe in df_list:
    dataframe['date_time'] = pd.to_datetime(dataframe['date'] + ' ' + dataframe['time'], utc=True)
print(df)
print(df_list[0])
## Second operation - works only with explicit indexing over the list; the first (commented) segment does nothing;
## the second segment works, but does not affect the original df, only df_list[0].
# for dataframe in df_list:
#     dataframe = dataframe.iloc[::-1]
#     dataframe.reset_index(drop=True, inplace=True)
for i in range(len(df_list)):
    df_list[i] = df_list[i].iloc[::-1]
    df_list[i].reset_index(drop=True, inplace=True)
print(df)
print(df_list[0])
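One possible workaround I have been sketching is to reverse the rows with in-place methods only, so the loop variable is never reassigned. This assumes every dataframe still has its default ascending integer index, so that sorting the index in descending order is the same as reversing the rows. The small `df` here is a hypothetical stand-in for the data above:

```python
import pandas as pd

# Hypothetical stand-in for the dataframe built in the MWE.
df = pd.DataFrame({"random": [99, 200, 65, 140]})
df_list = [df]

for dataframe in df_list:
    # Assumption: default ascending integer index, so a descending
    # index sort reverses the row order. inplace=True modifies the
    # existing object instead of rebinding the loop variable.
    dataframe.sort_index(ascending=False, inplace=True)
    dataframe.reset_index(drop=True, inplace=True)

print(df)
print(df_list[0] is df)
```

Since nothing is reassigned, `df` and `df_list[0]` should remain the same object. I am not sure this is the idiomatic approach, and it would not work as-is for dataframes with an unsorted or non-default index.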