2

I have a data frame like this and I would like to multiply the rows by its references value in another data frame using pandas. After that I would like to add all products from those rows and store it in a column called Pro_Sum. I know Python doesn't pass things by reference, but I can name-bind in here so I'm trying to create a dictionary to multiply the values according to its name, but I haven't been successful.

df = pd.read_excel (C:/"dummy")
d = {"C1": 2, "C2": 5,"C3":4, "C4":1}
df.mul(pd.Series(d), axis=1)

prod = d.keys() 
df[prod] = df[prod].mul(pd.Series(d), axis=1)[prod]
to_sum_t = list(df)
#.sum ignores the none numeric values.
df['Pro_sum'] = df[to_sum_t].sum(axis=1)

Input

ID  AU  HP  Name1   Value1  CHC1    Name2 Value2 CHC2   Name3 Value3 CHC3 
1   4   3   C1        10    100      0      0       0     0      0      0       
2   6   2   C2        20    95      C1      6       5     0      0      0    
3   2   7   C3        4     40      C4      6       60    0      0      0   
4   8   9   C1        8     100      0      0       0     0      0      0    
5   2   6   C1        6     10      C2      15      86   C4      1      4    

Reference column

Names Values
  C1    2
  C2    5
  C3    4
  C4    1
  Example row 5 Pro_Sum = 6*2 + 15*5 + 1*1 = 88

Minimal Output

ID  AU  HP  Name1 Value1 CHC1   Name2 Value2 CHC2   Name3 Value3 CHC3 Pro_Sum
1   4   3   C1      10    100      0    0     0       0      0      0    20 
2   6   2   C2      20    95      C1    6     5       0      0      0    112
5   2   6   C1       6    10      C2    15    86      C4     1      4    88
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
Zesima29
  • 184
  • 12
  • Not sure if understand, `df[prod]` cannot select columns names in sample data, because not exist same as keys as dict. – jezrael Jun 26 '18 at 13:25
  • so should I rename all my columns' header to the name that I want refer in the dictionary? – Zesima29 Jun 26 '18 at 13:32

1 Answers1

1

I think need filter columns by names, replace values by dictionary and convert to numpy arrays by values, multiple, sum and assign to new column:

d = {"C1": 2, "C2": 5,"C3":4, "C4":1}

a = df.filter(like='Name').replace(d).astype(int).values
b = df.filter(like='Value').values

df['Pro_Sum'] = (a * b).sum(axis=1)
print (df)
   ID  AU  HP Name1  Value1  CHC1 Name2  Value2  CHC2 Name3  Value3  CHC3  \
0   1   4   3    C1      10   100     0       0     0     0       0     0   
1   2   6   2    C2      20    95    C1       6     5     0       0     0   
2   3   2   7    C3       4    40    C4       6    60     0       0     0   
3   4   8   9    C1       8   100     0       0     0     0       0     0   
4   5   2   6    C1       6    10    C2      15    86    C4       1     4   

   Pro_Sum  
0       20  
1      112  
2       22  
3       16  
4       88
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Okay I'm trying to use the code and I'm getting this error that says `invalid literal for long() with base 10: ' C1'` I don't why it doesn't recognize C1 as a string. – Zesima29 Jun 26 '18 at 13:55
  • 1
    @Zesima29 - There is whitespace before `' C1'`, if use previous [solution](https://stackoverflow.com/q/51025752/2901002) need `df['Zone'] = df['Zone'].str.strip().map(d)` and then in this solution is possible remove `replace(d)` – jezrael Jun 26 '18 at 13:58
  • @Zesima29 - If column name is different, change `df['col_name'] = df['col_name'].str.strip().map(d)` – jezrael Jun 26 '18 at 14:02
  • Okay where should I add this line of code? `df['col_name'] = df['col_name'].str.strip().map(d)` I was working on exceptions handlers that were supposed to NaN the final product if the column name wasn't in the dictionary. But that line of your will clean all the columns of bad characters. – Zesima29 Jun 26 '18 at 19:16
  • You can add this code before previosu answer from [this](https://stackoverflow.com/a/51025986/2901002) - there is zone column which values need be replaced by dictionary in answer above. Only here is no columns called zone but Name – jezrael Jun 26 '18 at 19:19