One section of my code needs to take values from one DataFrame and apply them to another. For example, say the first DataFrame holds student scores and the second holds combinations of students. For each row of combination_DF, I want to look up each student's score and sum the scores for that row.

print scores_DF

Name     Value
Dennis   39.66
James    45.38
Leo      40.63
Joe      20.10
etc...


print combination_DF

name1     name2     name3  
Dennis    James     Leo    
Leo       Joe       Dennis  

Right now my program loops through each row of combination_DF, finds the score for each name, and adds the result to a column holding the total score for each combination. This is really slowing down my program because I work with thousands of entries. It looks something like this:

    for index, row in combination_DF.iterrows():
        # Each lookup scans all of scores_DF, which is why this is slow
        value0 = scores_DF[scores_DF['Name'] == row['name1']]
        value1 = scores_DF[scores_DF['Name'] == row['name2']]
        value2 = scores_DF[scores_DF['Name'] == row['name3']]
        total_score = value0['Value'].values + value1['Value'].values + value2['Value'].values

I'm new to pandas, and at the time this was the only way I knew how. As my program has evolved, this part of the code needs to run faster if possible. Thanks.

R.J. Jackson

2 Answers

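I think you need to build a name-to-score lookup first with groupby and an aggregated sum (this also collapses any duplicate names), then replace each name with its score and sum across each row: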

s = scores_DF.groupby('Name')['Value'].sum()

combination_DF['sum'] = combination_DF.replace(s).sum(axis=1)

Alternative with map + stack + unstack:

combination_DF['sum'] = combination_DF.stack().map(s).unstack().sum(axis=1)

print (combination_DF)
    name1  name2   name3     sum
0  Dennis  James     Leo  125.67
1     Leo    Joe  Dennis  100.39

Detail:

print (combination_DF.replace(s))
   name1  name2  name3
0  39.66  45.38  40.63
1  40.63  20.10  39.66
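
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds both frames from the sample data in the question and applies the map + stack + unstack variant. The `name_cols` list is an assumption added here (it is not in the original code); it restricts the lookup to the name columns so the snippet can be re-run after the `sum` column already exists.

```python
import pandas as pd

# Sample data copied from the question
scores_DF = pd.DataFrame({
    "Name": ["Dennis", "James", "Leo", "Joe"],
    "Value": [39.66, 45.38, 40.63, 20.10],
})
combination_DF = pd.DataFrame({
    "name1": ["Dennis", "Leo"],
    "name2": ["James", "Joe"],
    "name3": ["Leo", "Dennis"],
})

# Assumption: only these columns hold student names
name_cols = ["name1", "name2", "name3"]

# Series indexed by name, used as a lookup table
s = scores_DF.groupby("Name")["Value"].sum()

# Flatten the names, map each one to its score, reshape back, sum per row
combination_DF["sum"] = (
    combination_DF[name_cols].stack().map(s).unstack().sum(axis=1)
)

print(combination_DF)
```

The lookup happens once per cell through a hashed index rather than through a boolean scan of scores_DF per name, which is where the speed-up over the iterrows loop should come from.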
jezrael

You could go a bit fancier. First, let's create a lookup function. Note that .ix has been removed from recent pandas versions, so this uses .loc on a frame indexed by Name:

f = lambda x: scores_DF.set_index("Name").loc[x]["Value"]

Test it with f("Dennis")...

No need for iterrows:

combination_DF.apply(f, axis=1).sum(axis=1)

This should work. More hardcore users pass f directly as the argument to apply...
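
As a rough sketch of the same "apply instead of iterrows" idea, one option is to apply column by column and let `Series.map` do the lookup, which avoids calling a Python function once per row. The names `name_cols`, `lookup`, `scores`, `totals`, and `missing` are illustrative assumptions, not part of the original answer, and the sample data is taken from the question.

```python
import pandas as pd

# Sample data copied from the question
scores_DF = pd.DataFrame({
    "Name": ["Dennis", "James", "Leo", "Joe"],
    "Value": [39.66, 45.38, 40.63, 20.10],
})
combination_DF = pd.DataFrame({
    "name1": ["Dennis", "Leo"],
    "name2": ["James", "Joe"],
    "name3": ["Leo", "Dennis"],
})

# Assumption: only these columns hold student names
name_cols = ["name1", "name2", "name3"]

# Build the name -> score lookup once (assumes names are unique)
lookup = scores_DF.set_index("Name")["Value"]

# One vectorized map per column; names not in the lookup become NaN
scores = combination_DF[name_cols].apply(lambda col: col.map(lookup))

# Row totals; NaN values are skipped by sum() by default
totals = scores.sum(axis=1)

# Flag rows where at least one name had no score
missing = scores.isna().any(axis=1)

print(totals)
print(missing)
```

Because `sum` skips NaN by default, a misspelled name would silently lower a total, so checking `missing` before trusting the totals seems worthwhile.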

tschm