
I would like to union multiple pandas DataFrames that have different structures (different columns) in Python, without a join key.

Example: Input dataframes as below

DF1

col1 col2 col3
abc aaa bbb
bcd bbb ccc

DF2

col4 col5 col6
cde ccc ddd
def ddd eee

Result should be: DF3

col1 col2 col3 col4 col5 col6
abc aaa bbb --- --- ---
bcd bbb ccc --- --- ---
--- --- --- cde ccc ddd
--- --- --- def ddd eee

Is there an easy way to achieve this?

James Z
Mahesh D

2 Answers


I suppose you are trying to do a row bind. You can try this:

import pandas as pd
pd.concat([df1, df2])

or, on pandas versions older than 2.0 (DataFrame.append was deprecated in 1.4 and removed in 2.0), you can use this:

output = df1.append(df2)
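
As a minimal sketch of the row bind applied to the question's example: the frames below are rebuilt by hand from the tables in the question, and `ignore_index=True` is my own choice to get a fresh 0..3 index. Columns that exist in only one frame are filled with NaN rather than the `---` placeholder shown in the question.

```python
import pandas as pd

# Sample frames reconstructed from the question's tables
df1 = pd.DataFrame({"col1": ["abc", "bcd"],
                    "col2": ["aaa", "bbb"],
                    "col3": ["bbb", "ccc"]})
df2 = pd.DataFrame({"col4": ["cde", "def"],
                    "col5": ["ccc", "ddd"],
                    "col6": ["ddd", "eee"]})

# Outer union of columns; cells missing from either frame become NaN
df3 = pd.concat([df1, df2], ignore_index=True, sort=False)
print(df3)
```

If a visible placeholder is wanted, `df3.fillna("---")` would produce it, at the cost of turning every affected column into strings.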

Update:

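First convert the data types of one data frame to match the other. This assumes every column of df1 also exists in df2 and that the values can actually be cast to the target type:

df2 = df2.astype(df1.dtypes.to_dict())

and then do this

pd.concat([df1, df2])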

You can also use this one-liner if both data frames have the same columns:

pd.concat([df1, df2.astype(df1.dtypes.to_dict())])

This is a possible duplicate of Pandas version of rbind

Rushabh Patel
  • Thank you for the quick answer. How can we handle with mixed types while doing union? Example: Column b in DF1 is Int but Column b in DF2 is String While concatenating the above column from both datasets, it is throwing below error "pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'int' object", 'Conversion failed for column b with type object')" Much appreciate your help :) – Mahesh D Aug 10 '21 at 13:19
  • @MaheshDonthireddy You need to convert the data types of one data frame to match the other and then row bind it. – Rushabh Patel Aug 10 '21 at 13:26
  • Thanks again! In our case, we are dealing with close to thousands of columns, so I guess I might need to loop through each merged/concatenated dataframe and assign its data type to the new dataframe. I imagine that might complicate things. Kindly suggest :) – Mahesh D Aug 10 '21 at 13:30
  • @MaheshDonthireddy Check my updated answer, I have added one liner solution if you have same columns in both the data frames. – Rushabh Patel Aug 10 '21 at 16:46
  • @RushabPatel The above syntax threw an error. I had to implement a looping mechanism that checks the dtype of each common column and casts it to string if the dtypes don't match. I know it is a costly operation. Please let me know if you come across an easier solution for this problem in the future. – Mahesh D Aug 11 '21 at 12:50

I had to implement a looping mechanism that checks the dtype of each column common to both data frames and casts it to string if the dtypes don't match. It works fine for the time being. Please post here if anyone comes across an easier solution for this problem in the future.
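
The loop described above might look roughly like the following. This is a sketch, not the code I actually ran: the helper name `union_with_string_fallback` and the sample frames are made up for illustration, and it assumes that falling back to string is acceptable for every mismatched column.

```python
import pandas as pd

def union_with_string_fallback(df1, df2):
    """Hypothetical helper: row-bind two frames, casting shared columns
    whose dtypes differ to string so the combined column is uniform."""
    left, right = df1.copy(), df2.copy()
    # Only columns present in both frames can have a dtype conflict
    for col in left.columns.intersection(right.columns):
        if left[col].dtype != right[col].dtype:
            left[col] = left[col].astype(str)
            right[col] = right[col].astype(str)
    return pd.concat([left, right], ignore_index=True, sort=False)

# Illustrative data: column "b" is int in one frame and string in the other
df1 = pd.DataFrame({"a": ["x", "y"], "b": [1, 2]})
df2 = pd.DataFrame({"b": ["three", "four"], "c": [0.5, 1.5]})
result = union_with_string_fallback(df1, df2)
print(result)
```

One caveat: `astype(str)` also turns missing values into the literal string "nan", so columns that already contain NaN may need separate handling.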

Mahesh D