Concatenating pandas dataframes by column with some indices missing

Asked Apr 17 '23 at 13:30

I have two pandas DataFrames with the following structure. Let's say df1 is:

Index	Column_A	Column_B
Index_1	4	8
Index_2	7	1
Index_3	5	9
Index_4	4	8
Index_5	2	3

and df2 is:

Index	Column_C	Column_D
Index_1	11	25
Index_4	23	16
Index_5	12	42

I want to concatenate the columns, aligning the rows on their indices, so that the final df would be:

Index	Column_A	Column_B	Column_C	Column_D
Index_1	4	8	11	25
Index_2	7	1	NaN	NaN
Index_3	5	9	NaN	NaN
Index_4	4	8	23	16
Index_5	2	3	12	42

But doing the following:

df = pd.concat([df1, df2], axis=1)

raises the error:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Is there a way to make pandas keep every index of df1 and assign NaN wherever df2 has no matching index? BTW, in my real problem I have lots of dfs to concatenate, so it would be fantastic to have an option that lets me just pass the list of dfs while preserving the indices of the first one:

df = pd.concat([df1, df2, df3...], axis=1)

asked Apr 17 '23 at 13:30

datadatadata

concat fails because one of the two data frames has duplicate index values. – Quang Hoang Apr 17 '23 at 13:32
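If the comment above is right, a quick way to check it and work around it might look like the sketch below. The data is hypothetical: it mirrors the tables in the question, but df2 is given a repeated "Index_4" row on purpose, since the question does not show where the duplicates actually are. The helper name concat_on_first_index is made up for illustration, and keeping the first occurrence of each duplicated label is an assumption; if the duplicated rows differ, aggregating them (for example with groupby) may be the better choice.

```python
import pandas as pd

# Hypothetical data mirroring the question. df2 deliberately repeats
# "Index_4" to imitate the duplicate-index situation from the comment.
df1 = pd.DataFrame(
    {"Column_A": [4, 7, 5, 4, 2], "Column_B": [8, 1, 9, 8, 3]},
    index=["Index_1", "Index_2", "Index_3", "Index_4", "Index_5"],
)
df2 = pd.DataFrame(
    {"Column_C": [11, 23, 23, 12], "Column_D": [25, 16, 16, 42]},
    index=["Index_1", "Index_4", "Index_4", "Index_5"],
)


def concat_on_first_index(frames):
    """Column-wise concat that keeps only the index labels of frames[0].

    Illustrative helper, not a pandas API.
    """
    # Drop repeated index labels, keeping the first occurrence of each
    # (an assumption; aggregate instead if the duplicates disagree).
    deduped = [f[~f.index.duplicated(keep="first")] for f in frames]
    # With unique indices, axis=1 concat aligns rows on the index labels.
    combined = pd.concat(deduped, axis=1)
    # Restrict the result to the first frame's labels, in its order;
    # labels missing from the other frames end up as NaN.
    return combined.reindex(deduped[0].index)


# Step 1: find out which frame has a non-unique index.
for name, frame in {"df1": df1, "df2": df2}.items():
    print(name, "has unique index:", frame.index.is_unique)

# Step 2: concatenate any number of frames on the first frame's index.
result = concat_on_first_index([df1, df2])
print(result)
```

The same call should work with a longer list, e.g. concat_on_first_index([df1, df2, df3]), as long as the column names do not collide in a way you care about.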

0 Answers