How can I merge or concatenate multiple (>2) DataFrame objects while requiring that the array dimensions match exactly? That is, there should be no data loss and no data filling, and an inner join and an outer join would produce identical results, mimicking how numpy.concatenate works. I have read the pandas documentation on merging and concatenation but couldn't find this use case addressed. In my code, all input DataFrames must have an equal number of rows and unique columns, and I'd like an exception if this is not the case. Of course, I could check for this manually and raise my own exception, but this seems like a fairly common requirement (I expect this question to be a duplicate, but I couldn't find a post where it's answered). Can pandas do this for me?

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.empty(shape=(5, 2)), columns=["A", "B"])  # 5 rows
df2 = pd.DataFrame(np.empty(shape=(3, 2)), columns=["C", "D"])  # 3 rows
df = pd.concat([df1, df2], axis=1, join="outer")  # fills with NaN
df = pd.concat([df1, df2], axis=1, join="inner")  # data loss
np.concatenate([df1.values, df2.values], axis=1)  # ValueError, desired behaviour (but no longer a DataFrame)

I want a ValueError or another exception in this situation. Can pandas raise this for me, or is it up to me?

gerrit
  • Maybe use `np` with a DataFrame constructor? Something like: `pd.DataFrame(np.concatenate([df1.values, df2.values], axis=1), columns=df1.columns.union(df2.columns))`? – anky Apr 23 '20 at 08:51
  • @anky That would work, but I was expecting it to be possible in pandas without rebuilding the dataframe "manually". – gerrit Apr 23 '20 at 08:57
  • I see. I can't recall any such built-in function in pandas; I guess you'll have to write a try/except to check the dimensions of the DataFrames. – anky Apr 23 '20 at 09:09
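
The manual check discussed in the comments could be sketched roughly as follows. This is only one possible approach, not a built-in pandas feature: `concat_columns_strict` is a hypothetical helper name, and it assumes that "matching exactly" means every frame has an identical index (a stricter condition than merely having the same number of rows). Duplicate column labels are caught by the `verify_integrity=True` option of `pd.concat`, which checks the concatenated axis for duplicates.

```python
import numpy as np
import pandas as pd


def concat_columns_strict(frames):
    """Concatenate DataFrames side by side, raising ValueError on any mismatch.

    Hypothetical helper (not part of pandas). Assumes all frames must share
    an identical index, so that inner and outer joins would give the same result.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one DataFrame")

    reference = frames[0].index
    for position, frame in enumerate(frames[1:], start=1):
        # Index.equals is False if the length or any label differs.
        if not frame.index.equals(reference):
            raise ValueError(
                f"index of frame {position} does not match index of frame 0"
            )

    # With axis=1 the concatenated axis is the columns, so verify_integrity
    # raises ValueError if any column label appears more than once.
    return pd.concat(frames, axis=1, verify_integrity=True)


df1 = pd.DataFrame(np.zeros((5, 2)), columns=["A", "B"])
df2 = pd.DataFrame(np.zeros((5, 2)), columns=["C", "D"])
df3 = pd.DataFrame(np.zeros((3, 2)), columns=["E", "F"])

combined = concat_columns_strict([df1, df2])  # fine: same index, distinct columns

try:
    concat_columns_strict([df1, df3])  # different number of rows
except ValueError as exc:
    print("rejected:", exc)
```

If only the row count matters and the index labels may differ, the `equals` check could be replaced by a comparison of `len(frame)`, followed by resetting the indexes before concatenating; that variant is an assumption about the use case rather than something stated in the question.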

0 Answers