I have two pandas DataFrames with the same column names; each one holds data for a different set of product types. For example, both look like this, although they may differ in length.

df1

Name    ScoreX ScoreY   ScoreZ
Type1   0.6     0.2      0.7
Type2   0.6     0.5      0.6
Type3   0.7     0.2      0.3
Type4   1.0     0.2      0.3

df2

Name    ScoreX ScoreY   ScoreZ
TypeA   0.5     0.1      0.9
TypeB   0.3     0.5      0.6
TypeC   0.7     0.8      0.2
TypeD   1.0     0.2      0.3

I am trying to create hybrid values that combine each product in df1 with each product in df2, where every score is the mean of the two products' individual scores. I am looking for a way to iterate over both frames so that the first row of df1 is combined with every row of df2, and the process is then repeated for each remaining row of df1. The output would look like this in a new DataFrame:

Name        MeanScoreX MeanScoreY MeanScoreZ
Type1_TypeA  0.55       0.15      0.8
Type1_TypeB  0.45       0.35      0.65

.......

Type2_TypeA  0.55       0.3       0.75
Type2_TypeB  0.45       0.5       0.6
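
For illustration, the row-by-row pairing described above can be sketched as an explicit loop. This is only a minimal sketch, assuming the two frames are built from the sample data shown; the helper name `pairwise_means` and the constant `SCORE_COLS` are hypothetical and not part of the original question.

```python
from itertools import product

import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name": ["Type1", "Type2", "Type3", "Type4"],
    "ScoreX": [0.6, 0.6, 0.7, 1.0],
    "ScoreY": [0.2, 0.5, 0.2, 0.2],
    "ScoreZ": [0.7, 0.6, 0.3, 0.3],
})
df2 = pd.DataFrame({
    "Name": ["TypeA", "TypeB", "TypeC", "TypeD"],
    "ScoreX": [0.5, 0.3, 0.7, 1.0],
    "ScoreY": [0.1, 0.5, 0.8, 0.2],
    "ScoreZ": [0.9, 0.6, 0.2, 0.3],
})

SCORE_COLS = ["ScoreX", "ScoreY", "ScoreZ"]  # hypothetical helper constant


def pairwise_means(left, right):
    """Pair every row of `left` with every row of `right` and average the scores."""
    rows = []
    # product() yields (row of left, row of right) with the left row varying slowest,
    # so Type1 is paired with TypeA..TypeD before Type2 is reached.
    for a, b in product(left.itertuples(index=False), right.itertuples(index=False)):
        row = {"Name": f"{a.Name}_{b.Name}"}
        for col in SCORE_COLS:
            row[f"Mean{col}"] = (getattr(a, col) + getattr(b, col)) / 2
        rows.append(row)
    return pd.DataFrame(rows)


hybrid_loop = pairwise_means(df1, df2)
```

A loop like this is easy to follow but slow for large frames; a vectorised cross join is usually preferable.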
Em C
  • Does this answer your question? [cartesian product in pandas](https://stackoverflow.com/questions/13269890/cartesian-product-in-pandas) – AMC Feb 27 '20 at 00:09
  • Thanks @AMC - this helped towards answering in the right direction using cartesian product – Em C Feb 27 '20 at 01:59

1 Answer

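Rename the columns of df2 so they do not clash with those of df1:

import pandas as pd

df2.columns = ['Name2', 'ScoreX2', 'ScoreY2', 'ScoreZ2']

Concatenate both DataFrames side by side. Note that axis=1 aligns rows by index, so row i of df1 is paired only with row i of df2:

df = pd.concat([df1, df2], axis=1)

Generate the derived fields, joining the two names with an underscore as in the expected output:

df['Name'] = df['Name'] + '_' + df['Name2']
df['MeanScoreX'] = df[['ScoreX', 'ScoreX2']].mean(axis=1)
df['MeanScoreY'] = df[['ScoreY', 'ScoreY2']].mean(axis=1)
df['MeanScoreZ'] = df[['ScoreZ', 'ScoreZ2']].mean(axis=1)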
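
Because `pd.concat(..., axis=1)` pairs rows by position, it does not by itself produce every df1/df2 combination the question asks for. One possible way to get all combinations is a cross join, as suggested in the linked cartesian-product question. The sketch below is an assumption-laden illustration rather than the answerer's method: it rebuilds the sample frames, uses `how="cross"` (available in pandas 1.2 and later), and the `_1`/`_2` suffixes and the names `pairs` and `hybrid` are chosen here for illustration only.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name": ["Type1", "Type2", "Type3", "Type4"],
    "ScoreX": [0.6, 0.6, 0.7, 1.0],
    "ScoreY": [0.2, 0.5, 0.2, 0.2],
    "ScoreZ": [0.7, 0.6, 0.3, 0.3],
})
df2 = pd.DataFrame({
    "Name": ["TypeA", "TypeB", "TypeC", "TypeD"],
    "ScoreX": [0.5, 0.3, 0.7, 1.0],
    "ScoreY": [0.1, 0.5, 0.8, 0.2],
    "ScoreZ": [0.9, 0.6, 0.2, 0.3],
})

score_cols = ["ScoreX", "ScoreY", "ScoreZ"]

# Cross join: every row of df1 is paired with every row of df2.
# Requires pandas >= 1.2; the suffixes disambiguate the shared column names.
pairs = df1.merge(df2, how="cross", suffixes=("_1", "_2"))

# Build the result frame: combined name plus the mean of each score pair.
hybrid = pd.DataFrame({"Name": pairs["Name_1"] + "_" + pairs["Name_2"]})
for col in score_cols:
    hybrid[f"Mean{col}"] = pairs[[f"{col}_1", f"{col}_2"]].mean(axis=1)
```

On pandas versions older than 1.2, a similar result is commonly obtained by adding a constant key column to both frames and merging on it.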
Mayowa Ayodele
  • Thanks @MayowaAyodele - your answer was a help in the right direction to getting where I wanted – Em C Feb 27 '20 at 01:56