0

This question is related to How to concatenate combinations of rows from two different dataframes? but with a minor twist.

I have two dataframes with a common column. I want to create a new dataframe whose column names are the common column plus the concatenation of the two dataframes columns.

The resulting dataframe will have all possible combinations (cartesian product?) between rows of the two datasets that have the same value in the common column.

The two original datasets are:

    df1 = pd.DataFrame({'common': ['x', 'y', 'y'], 'A': ['1', '2', '3']})
    df2 = pd.DataFrame({'common': ['x', 'x', 'y'], 'B': ['a', 'b', 'c']})

and the resulting dataset would be:

    df3 = pd.DataFrame({'common': ['x', 'x', 'y', 'y'],
                        'A': ['1', '1' '2', '3'],
                        'B': ['a', 'b', 'c', 'c']})
user1883163
  • 133
  • 9
  • 1
    use : `df1.merge(df2,on='common')` – anky May 23 '19 at 15:16
  • 1
    [Here's some additional reading](https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html) from the pandas docs on merging, joining, and concatenation that you might find helpful. If you have common keys like this, you almost always want a merge or join rather than a concatenation. – G. Anderson May 23 '19 at 15:25
  • 1
    You are looking for "merge": https://stackoverflow.com/questions/53645882/pandas-merging-101/ – cs95 May 23 '19 at 15:27

1 Answers1

2

Use pandas' merge:

df1 = pd.DataFrame({'common': ['x', 'y', 'y'], 'A': ['1', '2', '3']})
df2 = pd.DataFrame({'common': ['x', 'x', 'y'], 'B': ['a', 'b', 'c']})
df3=pd.merge(df1,df2,on='common')
HMReliable
  • 871
  • 5
  • 11