I found a significant difference in the processing time of fillna depending on how the columns of a pandas DataFrame are selected.
Time taken for fillna when the columns are selected using loc:
import time
df1 = df.copy()
t1 = time.time()
df1.loc[:, col] = df1.loc[:, col].fillna(method="ffill")
t2 = time.time()
print(t2-t1)
3.908552885055542
Time taken for fillna when the columns are selected using square brackets:
df1 = df.copy()
t1 = time.time()
df1[col] = df1[col].fillna(method="ffill")
t2 = time.time()
print(t2-t1)
223.85472440719604
This post suggests that selecting columns with loc and with square brackets is equivalent:
Selecting a list of columns (df[['A', 'B', 'C']] is the same as df.loc[:, ['A', 'B', 'C']] -> selects columns A, B and C)
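To make the comparison easier to reproduce, here is a minimal sketch. It assumes a small synthetic DataFrame (the shape, NaN fraction and column names are made up and are not my real data) and uses .ffill(), which does the same forward fill as fillna(method="ffill"). It checks that both selection styles return the same columns and that both assignments give the same filled frame, and it prints the time each one takes.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical stand-in for df and col; the real data is not shown in this post.
rng = np.random.default_rng(0)
values = rng.random((1000, 20))
values[rng.random(values.shape) < 0.3] = np.nan  # scatter NaNs to fill
df = pd.DataFrame(values, columns=[f"c{i}" for i in range(20)])
col = ["c0", "c1", "c2"]

# Selection only: square brackets and loc should return the same columns.
same_selection = df[col].equals(df.loc[:, col])


def timed_fill(use_loc):
    """Forward-fill the selected columns on a copy and return (frame, seconds)."""
    out = df.copy()
    start = time.perf_counter()
    if use_loc:
        out.loc[:, col] = out.loc[:, col].ffill()
    else:
        out[col] = out[col].ffill()
    return out, time.perf_counter() - start


filled_loc, t_loc = timed_fill(use_loc=True)
filled_brackets, t_brackets = timed_fill(use_loc=False)

# Both assignment styles should produce the same values.
same_result = filled_loc.equals(filled_brackets)

print("same selection:", same_selection)
print("same filled result:", same_result)
print(f"loc: {t_loc:.4f}s, square brackets: {t_brackets:.4f}s")
```

On data this small the timings may not show the gap I see on my real DataFrame, so the sizes would need to be scaled up to compare.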
Can anyone please help explain why there is such a large time difference? Thanks!