2

First of all, thanks in advance.

I am trying to join two large pandas dataframes:

1.The two frames have the same number of rows. 2.There are some common columns (that also have the same column dataframe name) 3.The order of the common columns is not always the same (i.e column 'Hours' is the second one in the first dataframe but the first one in the second one)

What I want is to join both dataframes in such a way that:

  1. Common columns are merged
  2. I keep all the columns that are different in each dataframe.

I have seen some answers here related to the use of the pd.merge() function but in all of them the key of the column has to be supplied. I want to avoid that,since the number of columns is quite big. I imagine there must be a way to avoid writing, for instance 50 column keys.

Hope I explained myself properly,

Thanks a lot and have a great day!! :)

OMaeloc
  • 31
  • 3
  • 1
    Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – Umar.H Jun 23 '20 at 10:27
  • 1
    Hi, I have gone through I dont think it does specifically. That said, I think is a really interesting and useful resource so I will definitely save it for future reference, So Thanks a lot for sharing it !! :) – OMaeloc Jun 23 '20 at 11:04
  • the accepted answer is explicitly listed in the above post, I would list it as a duplicate so others who come across this post will be directed to a more thorough answer that has been accepted by the wider pandas expert community :) – Umar.H Jun 23 '20 at 11:05
  • Could you tell me exactly where? Because as far as I understood (from the images provided) the merge options work on rows. And according to pd.merge documentation there is no argument such as index=1. Thanks again , and sure, as soon as I see that it is repeated I will list as duplicate :) – OMaeloc Jun 23 '20 at 11:28
  • 1
    search for `pd.concat([left, right], axis=1, sort=False, join='inner')` its on the same post. :) – Umar.H Jun 23 '20 at 11:29
  • Hi again, when I try this it just concats one df after another, not combining the common columns – OMaeloc Jun 23 '20 at 11:59
  • @OMaeloc thanks, appreciate the complement on my post (Pandas Merging 101). – cs95 Jun 29 '20 at 09:26

1 Answers1

0

Its simple. Use "concat" and specify Join type. In your case, "inner"

Syntax:

pd.concat([df1, df2], axis=1, join='inner')
df1.merge(df2, left_on='ColumnA', right_on='ColumnB')

Column A is the left column and Column B is right

PerlBatch
  • 200
  • 2
  • 2
  • 10
  • Hi, I thought it worked sorry. But when I am applying it is just attaching one df after the other and not combining the common columns. – OMaeloc Jun 23 '20 at 11:25
  • Use df.merge option. Added a new line to the comment – PerlBatch Jun 23 '20 at 11:40
  • But in this way I do have to write the key for every common column no? What if I have 50 common columns, should I write the 50 keys? Thanks ;) – OMaeloc Jun 23 '20 at 11:50