
I have a DataFrame in the following format:

col_1, col_2, col_3
1, 2, 3
2, 3, 4
2, 3, 5

I am trying to check whether the DataFrame contains a given set of columns. Any column in the set that does not exist should be created as a new column in the DataFrame.

cols_to_check = ['col_1', 'col_2', 'col_6', 'col_9']

In this case I would like to go ahead and create col_6 and col_9, since they do not exist in the DataFrame.

Final output:

col_1, col_2, col_3, col_6, col_9
1, 2, 3, 0, 0
2, 3, 4, 0, 0
2, 3, 5, 0, 0
Kevin Nash

1 Answer


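Use reindex. It adds any listed column that is missing from df as NaN, and fillna(0) then replaces those NaN values with 0: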

cols_to_check = ['col_1', 'col_2', 'col_3', 'col_6', 'col_9']
# reindex returns a new DataFrame, so assign the result
df = df.reindex(columns=cols_to_check).fillna(0)
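
One thing to watch with fillna(0) is that it also overwrites any NaN already present in the existing columns. Below is a minimal sketch of an alternative, assuming that only the newly created columns should be zero-filled: pass fill_value to reindex. The sample frame, including the NaN placed in col_2, is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with one pre-existing missing value in col_2.
df = pd.DataFrame({
    "col_1": [1, 2, 2],
    "col_2": [2.0, np.nan, 3.0],
    "col_3": [3, 4, 5],
})
cols_to_check = ["col_1", "col_2", "col_3", "col_6", "col_9"]

# fill_value is applied only to labels that reindex introduces,
# so the NaN already in col_2 is left untouched.
filled = df.reindex(columns=cols_to_check, fill_value=0)
print(filled)
```

Whether pre-existing NaN values should be kept depends on the data, so treat this as one option rather than a replacement for fillna(0).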

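If you are not sure whether the new list includes every existing df column, take the set union of df.columns and the list so that no existing column is dropped. Sets are unordered, so the column names are sorted before reindexing: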

cols_to_check = ['col_1','col_2', 'col_3', 'col_6', 'col_9']
new_list = list(set(df.columns).union(cols_to_check))
new_df = df.reindex(columns=sorted(new_list)).fillna(0)
print(new_df)



   col_1  col_2  col_3  col_6  col_9
0      1      2      3    0.0    0.0
1      2      3      4    0.0    0.0
2      2      3      5    0.0    0.0
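
If the original column order should be kept and cols_to_check may omit existing columns (col_3 in the question's list), another option is to add only the names that are missing. A rough sketch, assuming the question's sample data; the name missing is just for illustration:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({
    "col_1": [1, 2, 2],
    "col_2": [2, 3, 3],
    "col_3": [3, 4, 5],
})
cols_to_check = ["col_1", "col_2", "col_6", "col_9"]

# Keep only the requested names that df does not have yet.
missing = [c for c in cols_to_check if c not in df.columns]

# assign returns a new DataFrame with the extra columns appended
# after the existing ones, each filled with the scalar 0.
new_df = df.assign(**{c: 0 for c in missing})
print(new_df)
```

Because the new columns are created from the integer 0 instead of going through NaN, they should come out as 0 and not 0.0, and no sorting of column names is needed.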
wwnde