5

I'm trying to do a calculation based on information that I have in two different datasets. I need all the information for the first dataframe repetead as many times as the information of the second dataframe.

Example:

DataFrame 1:

name   price 
 A       1
 B       2

DataFrame 2:

 currency    value
  Dollar       1
  Euro         2 

Expected output:

DataFrame merged:

name   price   currency    Value
 A       1       Dollar      1
 A       1       Euro        1
 B       2       Dollar      4
 B       2       Euro        4

I've been trying something similar like this, with using an apply and a list, and then converting it into a DataFrame, but with no results. Somehow the currs values is always repeated for every name:

lst = []
for index, currs in currencies.iterrows():
    lst.append(prices.apply(lambda pmRow: pd.Series({'name':pmRow['prices'], 'curr':currs['currency']}), axis=1))

Any suggestions? Thanks!

Nico Gallegos
  • 321
  • 8
  • 20
  • 1
    Why is Value column 1,1,4,4, not 1,2,1,2? If later use `df1.assign(k=1).merge(df2.assign(k=1)).drop('k', 1)` – Zero Oct 09 '17 at 15:03

1 Answers1

4

It seems you need cross join:

df = pd.merge(df1.assign(A=1), df2.assign(A=1), on='A').drop('A', 1)
print (df)
  name  price currency  value
0    A      1   Dollar      1
1    A      1     Euro      2
2    B      2   Dollar      1
3    B      2     Euro      2

And then if necessary muliple columns price and value:

df['value'] *= df['price']
print (df)
  name  price currency  value
0    A      1   Dollar      1
1    A      1     Euro      2
2    B      2   Dollar      2
3    B      2     Euro      4
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252