I am trying to use a lambda function to create several new columns at once in pandas. This works when the function returns a single value, but when it returns multiple values the result cannot be unpacked into separate columns. I have created a small reproducible example to illustrate.

I have a small DataFrame of 6 numbers in the column number:

import pandas as pd

lst = [1, 2, 3, 4, 5, 6]
df = pd.DataFrame(lst, columns=['number'])

    df
   number
0       1
1       2
2       3
3       4
4       5
5       6

If I use this custom function that returns only a single value within a lambda, I am able to create a new column with ease:

def add_one(number):
    new_number = number + 1
    return new_number

df['add_one'] = df.number.apply(lambda x: add_one(x))

   number  add_one
0       1        2
1       2        3
2       3        4
3       4        5
4       5        6
5       6        7

However, if I use a slightly more complex function that returns more than one value, the result is not unpacked into the two columns correctly:

def multiply_bylist(number, multiple, string):
    num = number * multiple
    word = string + str(number)
    return num, word

df['multiple'], df['words'] = df.number.apply(lambda x: multiply_bylist(x, 5, 'This is number:'))


ValueError: too many values to unpack (expected 2)

I was hoping for the final DataFrame to look as such:

   number  add_one  multiple             words
0       1        2         5  'This is number:1'
1       2        3        10  'This is number:2'
2       3        4        15  'This is number:3'
3       4        5        20  'This is number:4'
4       5        6        25  'This is number:5'
5       6        7        30  'This is number:6'

Is this possible or not? Thanks

gerardcslabs
  • 1
    Does this answer your question? [how to create multiple columns at once with apply?](https://stackoverflow.com/questions/66267729/how-to-create-multiple-columns-at-once-with-apply) – dm2 Apr 08 '21 at 15:02
  • 2
    You _really_ should avoid apply(axis=1) as much as you can. It undermines most of pandas' performance. But if you do have some super complicated aggregation that requires multiple returns, my preferred approach is to return a Series and then `concat` it to the DataFrame – ALollz Apr 08 '21 at 15:07
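
For reference, here is a rough sketch of the two ideas in the comment above, using the same numbers and prefix string as the question. The helper `multiply_bylist_series` and the variable names `vectorized`, `new_cols` and `combined` are made up for illustration and do not appear in the original post. The first option assumes the logic can be written as plain column operations, which is true for this example but may not hold for a more complicated function.

```python
import pandas as pd

df = pd.DataFrame({'number': [1, 2, 3, 4, 5, 6]})

# Option 1: plain column operations, no apply at all.
vectorized = df.copy()
vectorized['multiple'] = vectorized['number'] * 5
vectorized['words'] = 'This is number:' + vectorized['number'].astype(str)

# Option 2: return a labelled Series per element, then concat.
# multiply_bylist_series is a hypothetical variant of multiply_bylist.
def multiply_bylist_series(number, multiple, string):
    return pd.Series({'multiple': number * multiple,
                      'words': string + str(number)})

# When the function returns a Series, Series.apply builds a DataFrame
# with one column per label.
new_cols = df['number'].apply(
    lambda x: multiply_bylist_series(x, 5, 'This is number:')
)
combined = pd.concat([df, new_cols], axis=1)
```

Option 2 builds one Series per row, so it is likely to be the slower of the two on large frames.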

1 Answer

df[['multiple','words']] = df.number.apply(lambda x: multiply_bylist(x, 5, 'This is number:')).tolist()

should return

   number  multiple             words
0       1         5  This is number:1
1       2        10  This is number:2
2       3        15  This is number:3
3       4        20  This is number:4
4       5        25  This is number:5
5       6        30  This is number:6
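
As a short illustration of why the original assignment fails: `apply` returns a single Series whose six elements are each a 2-tuple, so unpacking it into two names sees six items rather than two. The sketch below reproduces that and shows one alternative to `.tolist()`, transposing the tuples with `zip`. The names `pairs` and `error_message` are illustrative only and are not part of the question or the answer.

```python
import pandas as pd

def multiply_bylist(number, multiple, string):
    num = number * multiple
    word = string + str(number)
    return num, word

df = pd.DataFrame({'number': [1, 2, 3, 4, 5, 6]})

# One Series of six (num, word) tuples, not two Series.
pairs = df['number'].apply(lambda x: multiply_bylist(x, 5, 'This is number:'))

# Unpacking iterates over the six rows, hence the ValueError.
try:
    a, b = pairs
except ValueError as exc:
    error_message = str(exc)

# zip(*pairs) regroups the row tuples into one tuple per column,
# which can then be unpacked into the two target columns.
df['multiple'], df['words'] = zip(*pairs)
```

Both this and the `.tolist()` version still call the Python function once per row, so neither is faster than the other in any meaningful way.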
Ronak Agrawal