Concatenate 2 Pandas list-filled columns into 1 big list?

Question

I have a Pandas DataFrame that looks like this:

     NAME      total           total_temp
ID                                      
1     CVS     [abc1]       [cba, xyzzy01]
2  Costco     [bcd2, 22]   [dcb, xyzzy02]
3   Apple     [cde3]       [edc, xyzzy03]

I want to add create a new column total_temp_2 so that the data looks like this:

     NAME      total       total_temp                   total_temp_2
ID                                                  
1     CVS     [abc1]       [cba, xyzzy01]       [abc1, cba, xyzzy01]
2  Costco     [bcd2, 22]   [dcb, xyzzy02]   [bcd2, 22, dcb, xyzzy02]
3   Apple     [cde3]       [edc, xyzzy03]       [cde3, edc, xyzzy03]

I feel like I could guess my way through really inefficient ways to concatenate the lists, but I suspect I'm missing something I don't know about Pandas.

How can i achieve this operation using pandas?

I can't see why your solution `df['total'] + df['total_temp']` isn't working. If those are lists it should combine them together — yatu, Feb 11 '19 at 16:13
Should not have issue . even it is str , it should not have errors — BENY, Feb 11 '19 at 16:19
All right ... I'm an idiot ... I forgot that for some testing reason or another, I had putzed around with my data, and that I was actually dealing with the string `'[dcb, xyzzy02]'`, not the list `[dcb, xyzzy02]`. Simple answer works now that both columns are full of proper lists. — k.., Feb 11 '19 at 16:26

cs95 · Accepted Answer · 2019-02-11T16:38:13.390

When dealing with mixed types, I usually recommend using something like a list comprehension which has minimal memory and performance overhead.

df['total_temp_2'] = [x + y for x, y in zip(df['total'], df['total_temp'])]
df

      NAME       total      total_temp              total_temp_2
ID                                                              
1      CVS      [abc1]  [cba, xyzzy01]      [abc1, cba, xyzzy01]
2   Costco  [bcd2, 22]  [dcb, xyzzy02]  [bcd2, 22, dcb, xyzzy02]
3    Apple      [cde3]  [edc, xyzzy03]      [cde3, edc, xyzzy03]

If these are columns of strings, you can use ast.literal_eval to parse them:

import ast

c = df.select_dtypes(include=[object]).columns
df[c] = df[c].applymap(ast.literal_eval)

If the solution above throws ValueError: malformed node or string:, try using the yaml package instead.

import yaml
df = df.applymap(yaml.load)

Funnily enough, simple addition works for me on 0.24.

df['total'] + df['total_temp']

ID
1        [abc1, cba, xyzzy01]
2    [bcd2, 22, dcb, xyzzy02]
3        [cde3, edc, xyzzy03]
dtype: object

These also work,

df['total'].add(df['total_temp'])

ID
1        [abc1, cba, xyzzy01]
2    [bcd2, 22, dcb, xyzzy02]
3        [cde3, edc, xyzzy03]
dtype: object

df['total_temp'].radd(df['total'])

ID
1        [abc1, cba, xyzzy01]
2    [bcd2, 22, dcb, xyzzy02]
3        [cde3, edc, xyzzy03]
dtype: object

These are great in terms of simplicity, but inherently loopy since mixed type operations are harder to vectorize.

All right ... I'm an idiot ... I forgot that for some testing reason or another, I had putzed around with my data, and that I was actually dealing with the string `'[dcb, xyzzy02]'`, not the list `[dcb, xyzzy02]`. Simple answer works now that both columns are full of proper lists. Thanks for the alternatives in case I ever get into a pinch, though! — k.., Feb 11 '19 at 16:27
@k.. I've edited your question so your mistake is less obvious now :-) — cs95, Feb 11 '19 at 16:28

score 1 · Answer 2 · answered Feb 11 '19 at 16:26

In a situation like this (wanting to apply a function to a dataframe I usually go to .apply(). So I would run this:

df['total_temp_2'] = df.apply(lambda x: x['total'] + x['total_temp'], axis=1)

Using the built-in pandas functionality is optimal for this type of transformation.

Concatenate 2 Pandas list-filled columns into 1 big list?

2 Answers2