0

I want to create a new column in a Pandas DataFrame, based on the values of two other columns.

+-----------+----------+
| Column_1  | Column_2 |
+-----------+----------+
| a         | c        |
+-----------+----------+
| b         | d        |
+-----------+----------+

Now, new_column should look like:

+-----------+----------+------------+
| Column_1  | Column_2 | new_column |
+-----------+----------+------------+
| a         | c        | a,c        |
+-----------+----------+------------+
| b         | d        | b,d        |
+-----------+----------+------------+

Any help please?

Barbaros Özhan
hmd

2 Answers

1

I used this one, and it just worked fine:

df['new_column'] = df['Column_1'] + ',' + df['Column_2']
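As a hedged aside, the same two-column result can also be sketched with the string accessor `Series.str.cat`, which takes the separator as a parameter. The sample frame below is rebuilt from the question's table and assumes both columns hold strings.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed to be strings)
df = pd.DataFrame({'Column_1': ['a', 'b'],
                   'Column_2': ['c', 'd']})

# Concatenate the two columns element-wise, with ',' as the separator
df['new_column'] = df['Column_1'].str.cat(df['Column_2'], sep=',')

print(df)
```

If either column may contain missing values, `str.cat` also accepts an `na_rep` argument to control how they are rendered.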
sophocles
hmd
  • What if you have more than two columns, such as `'Column_3': ['d','e']`? You would need to explicitly append `df['Column_3']` too, i.e. hardcode each additional column. – Barbaros Özhan Dec 08 '20 at 19:03
0

You can create an auxiliary DataFrame (df_new) by using concat to stack all columns into a single long column, and then call reset_index() so that the original row labels are kept in a new index column. After grouping by that index column, use apply(lambda x: ','.join(x)) to join the values belonging to each row, such as

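import pandas as pd

fields = {'Column_1': ['a', 'b'],
          'Column_2': ['c', 'd']
          }

df = pd.DataFrame(fields)
# Stack every column into one long Series; reset_index() keeps the row labels in an 'index' column
df_new = pd.concat([df[i] for i in df.columns]).reset_index()
# Join the values that share a row label and assign them back (assumes the default 0..n-1 index)
df['new_column'] = df_new.groupby(['index'])[0].apply(lambda x: ','.join(x)).reset_index()[0]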
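For comparison, here is a minimal sketch of a shorter row-wise variant that also avoids hardcoding each column. It is not the method of the answer above; it assumes the extra `Column_3` from the comment and casts values to strings so that `','.join` also works on non-string data. The name `cols` is only illustrative.

```python
import pandas as pd

# Question's sample data plus the hypothetical third column from the comment
df = pd.DataFrame({'Column_1': ['a', 'b'],
                   'Column_2': ['c', 'd'],
                   'Column_3': ['d', 'e']})

# Columns to combine; listing them once avoids one '+' term per column
cols = ['Column_1', 'Column_2', 'Column_3']

# Cast to str, then join the values of each row with ','
df['new_column'] = df[cols].astype(str).apply(','.join, axis=1)

print(df)
```

Row-wise `apply` is convenient but can be slow on large frames, so the explicit `+` concatenation may still be preferable when there are only a few columns.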
Barbaros Özhan