Creating a new column in Pandas based on the values of two other columns

Question

I want to create a new column in a Pandas DataFrame, based on the values of two other columns.

+-----------+----------+
| Column_1  | Column_2 |
+-----------+----------+
| a         | c        |
+-----------+----------+
| b         | d        |
+-----------+----------+

Now, new_column should look like:

+-----------+----------+------------+
| Column_1  | Column_2 | new_column |
+-----------+----------+------------+
| a         | c        | a,c        |
+-----------+----------+------------+
| b         | d        | b,d        |
+-----------+----------+------------+

Any help please?

score 1 · Answer 1 · edited Dec 08 '20 at 18:23

I used this one, and it just worked fine:

df['new_column'] = df['Column_1'] + ',' + df['Column_2']

edited Dec 08 '20 at 18:23

sophocles

answered Dec 07 '20 at 23:59

hmd

What if you have more than two columns, such as `'Column_3': ['d','e']`? You would need to explicitly append `df['Column_3']` with its separator too, i.e. this approach means hardcoding each column. – Barbaros Özhan Dec 08 '20 at 19:03
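
One way to avoid hardcoding each column, as the comment above asks, might be to join row-wise over a list of columns. This is only a minimal sketch: the third column and the `cols` list are assumptions made for illustration, and `astype(str)` is added on the assumption that some columns may not already hold strings.

```python
import pandas as pd

# Example data; 'Column_3' is hypothetical, taken from the comment above.
df = pd.DataFrame({'Column_1': ['a', 'b'],
                   'Column_2': ['c', 'd'],
                   'Column_3': ['d', 'e']})

# Columns to combine (an illustrative choice; list whichever you need).
cols = ['Column_1', 'Column_2', 'Column_3']

# Cast to str so the join also works for non-string columns,
# then join the values of each row with a comma.
df['new_column'] = df[cols].astype(str).agg(','.join, axis=1)

print(df)
```

Adding or removing a column then only means changing `cols`, not the expression itself.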

Barbaros Özhan · Answer 2 · 2020-12-08T19:01:03.740

0

You can build an auxiliary DataFrame (df_new) by using concat to stack all columns into a single long column, then calling reset_index() so that the original row labels become an index column. After that, group by this index column and use apply(lambda x: ','.join(x)) to join the values belonging to each original row, such as

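import pandas as pd

fields = {'Column_1': ['a', 'b'],
          'Column_2': ['c', 'd']}

df = pd.DataFrame(fields)

# Stack all columns into one long Series; reset_index() exposes the original
# row labels as an 'index' column, with the stacked values in column 0.
df_new = pd.concat([df[i] for i in df.columns]).reset_index()

# Join the values that share the same original row label.
df['new_column'] = df_new.groupby(['index'])[0].apply(lambda x: ','.join(x)).reset_index()[0]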

edited Dec 08 '20 at 19:01

answered Dec 07 '20 at 23:21

Barbaros Özhan

I think groupby is not needed – ansev Dec 07 '20 at 23:30
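
The comment above does not say which alternative was meant, so the following is just one possible groupby-free sketch, using the `Series.str.cat` string accessor on the question's two columns. It assumes both columns already contain strings.

```python
import pandas as pd

df = pd.DataFrame({'Column_1': ['a', 'b'],
                   'Column_2': ['c', 'd']})

# Concatenate the two string columns element-wise, separated by a comma.
# No auxiliary DataFrame or groupby is involved.
df['new_column'] = df['Column_1'].str.cat(df['Column_2'], sep=',')

print(df)
```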
