I am trying to append a row (df_row) to a parent dataframe (df_all) on every iteration. The parent dataframe has all the possible column names, and each iteration produces a row with its own set of columns, which is a subset of all the possible columns. It looks something like this:
df_all initially has all the possible column names:
Index A B C D E F G H
Iteration 1:
df_row1:
Index A C D E F
ID1 1 2 3 5 1
df_all=df_all.append(df_row1)
Now df_all looks as below:
df_all:
Index A B C D E F G H
ID1 1 na 2 3 5 1 na na
Iteration 2:
df_row2:
Index A B D F G H
ID2 0 8 3 5 1 4
df_all=df_all.append(df_row2)
Now df_all looks as below:
df_all:
Index A B C D E F G H
ID1 1 na 2 3 5 1 na na
ID2 0 8 na 3 na 5 1 4
And so on...
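For reference, here is a minimal sketch of the loop using the two example rows above. It assumes each row arrives as a one-row DataFrame indexed by its ID. Because `DataFrame.append` was removed in pandas 2.0, the sketch uses `pd.concat` for the same step; recent pandas versions may emit a FutureWarning about concatenating with an empty frame.

```python
import pandas as pd

# Parent frame: no rows yet, but every possible column name.
all_columns = list("ABCDEFGH")
df_all = pd.DataFrame(columns=all_columns)

# One-row frames from the example; each has only a subset of the columns.
df_row1 = pd.DataFrame({"A": 1, "C": 2, "D": 3, "E": 5, "F": 1}, index=["ID1"])
df_row2 = pd.DataFrame({"A": 0, "B": 8, "D": 3, "F": 5, "G": 1, "H": 4}, index=["ID2"])

for df_row in (df_row1, df_row2):
    # Same effect as df_all = df_all.append(df_row):
    # a new frame is built from the old one plus the new row on every pass.
    df_all = pd.concat([df_all, df_row])

print(df_all)
```

Columns missing from a given row end up as NaN in df_all, which matches the "na" entries shown above.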
However, I have more than 20,000 rows to add, and the time taken to add each row increases with every iteration. Is there a way to do this more efficiently, in a reasonable amount of time?
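One direction that might avoid the growing cost is to collect the rows in a plain Python list and build the dataframe once at the end, so the parent frame is not copied on every iteration. The following is only a sketch under that assumption, and I have not measured it on the real data. `make_row` and `n_rows` are hypothetical stand-ins for whatever produces each row and for the real row count.

```python
import pandas as pd

all_columns = list("ABCDEFGH")
n_rows = 1000  # hypothetical; the real case has more than 20,000

def make_row(i):
    """Hypothetical row producer: a one-row frame with a subset of columns."""
    cols = all_columns[i % 3::2]  # arbitrary subset, varies per row
    return pd.DataFrame({c: i for c in cols}, index=[f"ID{i}"])

# Appending to a list does not copy the rows collected so far.
rows = [make_row(i) for i in range(1, n_rows + 1)]

# Concatenate once, then reindex so every possible column is present
# in the expected order; columns missing from a row become NaN.
df_all = pd.concat(rows).reindex(columns=all_columns)

print(df_all.head())
```

The `reindex(columns=all_columns)` step is there because a single `concat` only produces the columns that actually occur in the rows, in order of first appearance.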