4

I am trying to calculate multiple colums from multiple columns in a pandas dataframe using a function. The function takes three arguments -a-, -b-, and -c- and and returns three calculated values -sum-, -prod- and -quot-. In my pandas data frame I have three coumns -a-, -b- and and -c- from which I want to calculate the columns -sum-, -prod- and -quot-.

The mapping that I do works only when I have exactly three rows. I do not know what is going wrong, although I expect that it has to do something with selecting the correct axis. Could someone explain what is happening and how I can calculate the values that I would like to have. Below are the situations that I have tested.

INITIAL VALUES

def sum_prod_quot(a,b,c):
    sum  = a + b + c
    prod = a * b * c
    quot = a / b / c
    return (sum, prod, quot)

df = pd.DataFrame({ 'a': [20, 100, 18],
                    'b': [ 5,  10,  3],
                    'c': [ 2,  10,  6],
                    'd': [ 1,   2,  3]
                 })

df
    a   b   c  d
0   20   5   2  1
1  100  10  10  2
2   18   3   6  3

CALCULATION STEPS

Using exactly three rows

When I calculate three columns from this dataframe and using the function function I get:

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0
2   18   3   6  3    2.0      1.0    1.0

This is exactly the result that I want to have: The sum-column has the sum of the elements in the columns a,b,c; the prod-column has the product of the elements in the columns a,b,c and the quot-column has the quotients of the elements in the columns a,b,c.

Using more than three rows

When I expand the dataframe with one row, I get an error!

The data frame is defined as:

df = pd.DataFrame({ 'a': [20, 100, 18, 40],
                    'b': [ 5,  10,  3, 10],
                    'c': [ 2,  10,  6,  4],
                    'd': [ 1,   2,  3,  4]
                 })
df
     a   b   c  d
0   20   5   2  1
1  100  10  10  2
2   18   3   6  3
3   40  10   4  4

The call is

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

The result is

...
    list( map(sum_prod_quot, df['a'], df['b'], df['c']))
ValueError: too many values to unpack (expected 3) 

while I would expect an extra row:

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0
2   18   3   6  3    2.0      1.0    1.0
3   40  10   4  4   54.0   1600.0    1.0

Using less than three rows

When I reduce tthe dataframe with one row I get also an error. The dataframe is defined as:

df = pd.DataFrame({ 'a': [20, 100],
                    'b': [ 5,  10],
                    'c': [ 2,  10],
                    'd': [ 1,   2]
                 })
df
     a   b   c  d
0   20   5   2  1
1  100  10  10  2

The call is

df['sum'], df['prod'], df['quot'] = \
        list( map(sum_prod_quot, df['a'], df['b'], df['c']))

The result is

...
    list( map(sum_prod_quot, df['a'], df['b'], df['c']))
ValueError: need more than 2 values to unpack

while I would expect a row less:

df
     a   b   c  d    sum     prod   quot
0   20   5   2  1   27.0    120.0   27.0
1  100  10  10  2  200.0  10000.0  324.0

QUESTIONS

The questions I have:

1) Why do I get these errors?

2) How do I have to modify the call such that I get the desired data frame?

NOTE

In this link a similar question is asked, but the given answer did not work for me.

PeterDev
  • 53
  • 1
  • 4
  • Are you sure your answer is correct? Can you check product column as well? – niraj Jul 08 '17 at 19:11
  • There is an easier way to do this `df['sum'] = df['a'] + df['b'] + df['c']` and the same for the other operations. Unless this is just an example for the error – DJK Jul 08 '17 at 19:14
  • Thanks for the suggestion. The above is an example to show the error. The original function is a function in another module. The solution provided by open-source does the trick. – PeterDev Jul 08 '17 at 22:50

1 Answers1

7

The answer doesn't seem correct for 3 rows as well. Can you check other values except first row and first column. Looking at the results, product of 20*5*2 is NOT 120, it's 200 and is placed below in sum column. You need to form list in correct way before assigning to new columns. You can try use following to set the new columns:

df['sum'], df['prod'], df['quot'] = zip(*map(sum_prod_quot, df['a'], df['b'], df['c']))

For details follow the link

niraj
  • 17,498
  • 4
  • 33
  • 48
  • 2
    This is exactly the line I was looking for. Thanks. It also works with the original function (which is placed in another module) where I need to calculate polar coordinates from rectangular coordinates. – PeterDev Jul 08 '17 at 22:50
  • @PeterDev Great! if you would like you can accept the answer. `Happy Coding`. – niraj Jul 08 '17 at 22:52
  • What does the unzipping and rezipping in zip(*map(...)) do? Seems like redundant work. – Pedro Contipelli Jun 02 '23 at 18:17