Concatenating data in different columns into a single column (pandas, python)

Question

I am looking for the logic to concatenate the values in many columns with related data from an .xlsx file into a single column using pandas in python. The logic to combine each different column would be different depending on what information the column contains. For example:

input:  
ID,when_carpool,smoking,vehicle,passengers  
0,weekdays,yes,truck,3  
1,weekends,no,sedan,4  
2,weekdays,no,van,6  
3,weekdays,no,van,5  
4,weekends,yes,sedan,3

I have thousands of these rows to process, note that I want to transform the value of the 'smoking' column so it's not a simple concatenation of all columns.

output:  
ID,carpool_info  
0,weekdays+smoking+truck+3  
1,weekends+nonsmoking+sedan+4  
2,weekdays+nonsmoking+van+6  
3,weekdays+nonsmoking+van+5  
4,weekends+smoking+sedan+3

score 2 · Accepted Answer · edited May 23 '17 at 12:32

2

Join all the columns into a new one:

  df["carpool_info"] =  df.apply(lambda x: "+".join([str(x[i]) for i in range(len(x))]),axis=1)

and then drop the other columns you don't need (see also here: Delete column from pandas DataFrame) , or just use the series carpool_Info = df["carpool_info"]

edited May 23 '17 at 12:32

Community

1
1

answered Jun 10 '14 at 21:41

grasshopper

3,988
3
23
29

Ok I tried this and it mostly works. I just need to replace the yes/no in the smoking column with smoking/nonsmoking. Thanks. – ferrios25 Jun 10 '14 at 22:46

Concatenating data in different columns into a single column (pandas, python)

1 Answers1