
Example data in Python 3.5:

import pandas as pd
df=pd.DataFrame({"A":["x","y","z","t","f"],
                "B":[1,2,1,2,4]})

This gives me a DataFrame with two columns, "A" and "B". I then want to add a third column "C" that contains the values of "A" and "B" concatenated and separated by "_".
Following the suggestion from this answer, I can do it like this:

for i in range(0,len(df["A"])):
    df.loc[i,"C"]=df.loc[i,"A"]+"_"+str(df.loc[i,"B"]) 

I get the result I want, but it seems convoluted for such a simple task.

In R this would be done like this:

df<-data.frame(A=c("x","y","z","t","f"),
               B=c(1,2,1,2,4))
df$C<-paste(df$A,df$B,sep="_")

Another thread suggested the use of the "%" operator but I can't get it to work.

Is there a better alternative?

Haboryme

1 Answer


You can just add the columns together, but for 'B' you need to cast the type using astype(str):

In [115]:
df['C'] = df['A'] + '_' + df['B'].astype(str)
df

Out[115]:
   A  B    C
0  x  1  x_1
1  y  2  y_2
2  z  1  z_1
3  t  2  t_2
4  f  4  f_4

This is a vectorised approach and will scale much better than looping over every row for large DataFrames.
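For comparison, here is a minimal sketch of the same concatenation written three ways, using the example data from the question. The column names `C_cat` and `C_pct` are made up for illustration only. `Series.str.cat` is probably the closest analogue to R's `paste(..., sep="_")`, and the list comprehension shows one way the `%` operator can be applied row by row; it is not necessarily what the other thread had in mind.

```python
import pandas as pd

df = pd.DataFrame({"A": ["x", "y", "z", "t", "f"],
                   "B": [1, 2, 1, 2, 4]})

# Vectorised addition: cast B to strings first, then concatenate element-wise.
df["C"] = df["A"] + "_" + df["B"].astype(str)

# Same result with Series.str.cat and an explicit separator
# ("C_cat" is a hypothetical column name).
df["C_cat"] = df["A"].str.cat(df["B"].astype(str), sep="_")

# Row-by-row "%" formatting in a list comprehension
# ("C_pct" is a hypothetical column name). This is a Python-level loop,
# so it is not vectorised.
df["C_pct"] = ["%s_%s" % (a, b) for a, b in zip(df["A"], df["B"])]

print(df)
```

All three columns should hold the same strings; the first two stay within pandas' Series operations, while the third loops in Python.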

EdChum
  • My attempts included `df["C"]=df["A"]+"_"+str(df["B"])` which doesn't work. Thanks for the ".astype(str)" it solves my problem. – Haboryme Sep 09 '16 at 09:26
  • `str(df['B'])` just returns a single string `repr` of the whole Series, so it doesn't convert the individual values – EdChum Sep 09 '16 at 09:27
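To make the difference discussed in these comments concrete, here is a small sketch using the "B" values from the question. The variable names `as_repr` and `as_strings` are chosen here for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 1, 2, 4], name="B")

# str() on a Series returns ONE Python string: the printed
# representation of the whole Series (index, values, dtype line).
as_repr = str(s)

# astype(str) returns a new Series in which each element is a string.
as_strings = s.astype(str)

print(type(as_repr))
print(as_strings.tolist())
```

Because `as_repr` is a single string, adding it to a column appends that same multi-line text to every row, which is why the attempt in the first comment does not give the expected result.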