How to concatenate two (float) columns into one column quickly and efficiently in a pandas DataFrame?

Question

I want to create a new column by concatenating two columns (float or int), as shown below.

Does anyone have a better idea?

I think my approach is too complex.

import pandas

a=pandas.Series([1,3,5,7,9])
b=pandas.Series([2,4,6,8,10])
c=pandas.Series([3,5,6,5,10])

abc=pandas.DataFrame({'a':a, 'b':b, 'c':c})

abc
   a   b   c
0  1   2   3
1  3   4   5
2  5   6   6
3  7   8   5
4  9  10  10

abc['new']=pandas.Series(map(str,abc.iloc[:,0])).str.cat(pandas.Series(map(str,abc.iloc[:,1])), sep='::')

abc
   a   b   c    new
0  1   2   3   1::2
1  3   4   5   3::4
2  5   6   6   5::6
3  7   8   5   7::8
4  9  10  10  9::10

score 3 · Answer 1 · answered Jul 06 '17 at 07:43

Use astype to convert to str:

# to select columns by position with iloc
abc['new'] = abc.iloc[:,0].astype(str) + '::' + abc.iloc[:,1].astype(str)
print (abc)
   a   b   c    new
0  1   2   3   1::2
1  3   4   5   3::4
2  5   6   6   5::6
3  7   8   5   7::8
4  9  10  10  9::10

# to select columns by name
abc['new'] = abc['a'].astype(str) + '::' + abc['b'].astype(str)
print (abc)
   a   b   c    new
0  1   2   3   1::2
1  3   4   5   3::4
2  5   6   6   5::6
3  7   8   5   7::8
4  9  10  10  9::10

Solution with str.cat:

abc['new'] = abc['a'].astype(str).str.cat(abc['b'].astype(str), sep='::')
print (abc)
   a   b   c    new
0  1   2   3   1::2
1  3   4   5   3::4
2  5   6   6   5::6
3  7   8   5   7::8
4  9  10  10  9::10
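
The question title mentions float columns, while the examples above use integers. As a small sketch (the frame `df` and its values are made up for illustration, not taken from the question), `astype(str)` keeps the decimal point for floats, so casting to int first may be wanted when the values are known to be whole numbers:

```python
import pandas as pd

# Hypothetical float data, not from the original question
df = pd.DataFrame({'a': [1.0, 3.0, 5.0], 'b': [2.0, 4.0, 6.0]})

# Direct conversion keeps the float formatting
as_float = df['a'].astype(str) + '::' + df['b'].astype(str)

# Casting to int first drops the trailing ".0"
# (assumption: values are whole numbers and contain no NaN)
as_int = df['a'].astype(int).astype(str) + '::' + df['b'].astype(int).astype(str)

print(as_float.tolist())  # ['1.0::2.0', '3.0::4.0', '5.0::6.0']
print(as_int.tolist())    # ['1::2', '3::4', '5::6']
```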

Rayhane Mama · Answer 2 · 2017-07-06T08:06:33.543

3

You can also do something like this using map:

abc['d'] = abc['a'].map(str) +'::'+ abc['b'].map(str)
print(abc)

output:

   a   b   c      d
0  1   2   3   1::2
1  3   4   5   3::4
2  5   6   6   5::6
3  7   8   5   7::8
4  9  10  10  9::10

edited Jul 06 '17 at 08:06

answered Jul 06 '17 at 07:46


Dimgold · Answer 3 · 2017-07-06T07:51:09.330

1

How about using apply?

abc['new'] = abc.apply(lambda x: '{}::{}'.format(x['a'],x['b']), axis=1)

It is a simple one-liner this way.
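
As a quick check, the row-wise `apply` version can be compared with the column-wise `astype` version on the question's data. This is only a sketch; the names `by_apply`, `by_astype` and the `mixed` frame are illustrative. It also shows one difference that may matter: with mixed int and float columns, `apply(axis=1)` builds each row as a single Series with a common dtype, so integers can come out as `1.0`.

```python
import pandas as pd

abc = pd.DataFrame({'a': [1, 3, 5, 7, 9],
                    'b': [2, 4, 6, 8, 10],
                    'c': [3, 5, 6, 5, 10]})

# Row-wise: the lambda is called once per row
by_apply = abc.apply(lambda x: '{}::{}'.format(x['a'], x['b']), axis=1)

# Column-wise: each column is converted as a whole
by_astype = abc['a'].astype(str) + '::' + abc['b'].astype(str)

print(by_apply.tolist() == by_astype.tolist())  # True for all-integer columns

# Hypothetical mixed-dtype frame: rows are upcast to float inside apply
mixed = pd.DataFrame({'a': [1, 3], 'b': [2.5, 4.5]})
mixed_apply = mixed.apply(lambda x: '{}::{}'.format(x['a'], x['b']), axis=1)
mixed_astype = mixed['a'].astype(str) + '::' + mixed['b'].astype(str)

print(mixed_apply.tolist())   # ['1.0::2.5', '3.0::4.5']
print(mixed_astype.tolist())  # ['1::2.5', '3::4.5']
```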

edited Jul 06 '17 at 07:51

answered Jul 06 '17 at 07:45


Thanks for your answer, but Rayhane Mama's may be the simplest solution. – cwind Jul 06 '17 at 07:58
Yep, but note that it makes 3 passes over the data (``map``, ``map`` and assignment). – Dimgold Jul 06 '17 at 08:04
It is very slow because it processes the data row by row. `astype`, `map` and `+` are faster because they work on whole columns. – jezrael Jul 06 '17 at 08:08
@jezrael Which one is slow? As far as I know, pandas doesn't multiprocess. – Dimgold Jul 06 '17 at 08:09
apply is slow; maybe [this](https://stackoverflow.com/questions/24870953/does-iterrows-have-performance-issues/24871316#24871316) will help. `Jeff` is now a developer of pandas. – jezrael Jul 06 '17 at 08:11
Thanks, I was always sure that ``apply`` was somehow a vectorized variant of iteration. – Dimgold Jul 06 '17 at 08:13
Yeah, your idea is good. So how can I make this efficient? abc.iloc[:,i].map(str).str.strip('something'), something I did not show here... – cwind Jul 06 '17 at 08:14
@jezrael But isn't ``map`` equivalent to ``apply``? – Dimgold Jul 06 '17 at 08:27
Hard question for me, but I think map is simpler and so faster, while apply is more complicated. Also, `.apply(axis=1)` processes the data row by row, so it is slow too. And `map` cannot be used for your solution, because it works with only one column (`Series`). – jezrael Jul 06 '17 at 08:28
It is also worth seeing maxu's excellent answer comparing the solutions: https://stackoverflow.com/a/36911306/2901002 – jezrael Jul 06 '17 at 08:51
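
To put rough numbers on the speed discussion, the three approaches from the answers can be timed with `timeit`. This is only a sketch: the frame size `n`, the number of runs and the wrapper function names are arbitrary choices for illustration, and the timings will vary with the machine and the pandas version, so no expected figures are given.

```python
import timeit

import pandas as pd

n = 10_000  # arbitrary size chosen for illustration
df = pd.DataFrame({'a': range(n), 'b': range(n, 2 * n)})


def with_astype():
    # Answer 1: convert each column as a whole, then concatenate
    return df['a'].astype(str) + '::' + df['b'].astype(str)


def with_map():
    # Answer 2: call str on every element of each column
    return df['a'].map(str) + '::' + df['b'].map(str)


def with_apply():
    # Answer 3: call the lambda once per row
    return df.apply(lambda x: '{}::{}'.format(x['a'], x['b']), axis=1)


for func in (with_astype, with_map, with_apply):
    seconds = timeit.timeit(func, number=3)
    print('{:<12} {:.4f} s for 3 runs'.format(func.__name__, seconds))
```

All three functions return the same strings for integer columns, so the comparison is about speed only.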
