connecting strings in pandas

Question

I am trying to concatenate columns that contain string data. The problem is that I want to ignore NaN values, but I haven't found a solution.

The DataFrame looks like this:

s=pd.DataFrame({'A':['(Text,','(Text1,'],'B':['(Text2,','(Text3,'],'C':['(Text4,','(Text5,']})


        A        B        C
0   (Text,  (Text2,  (Text4,
1  (Text1,  (Text3,  (Text5,

First I delete the brackets and commas with:

sA = s['A'].str.lstrip('(').str.rstrip(',')
sB = s['B'].str.lstrip('(').str.rstrip(',')
sC = s['C'].str.lstrip('(').str.rstrip(',')

And then I put the columns together.

sNew = sA + ' ' +  sB + ' ' + sC

print(sNew)
0   Text Text2 Text4
1  Text1 Text3 Text5

1. Is there a better way to concatenate the columns? I have the feeling that this approach isn't very efficient. I tried calling str.lstrip on all columns at once, but it doesn't work.

2. If there is a NaN in a cell, the whole row becomes NaN. How can I ignore the NaN in this specific case? E.g.:

        A        B        C
0   (Text,  (Text2,  (Text4,
1  (Text1,  (Text3,      NaN

and my result after removing the brackets and concatenating is:

0   Text Text2 Text4
1   NaN

but I want the following result...

0   Text Text2 Text4
1  Text1 Text3

It would be great if you had some tips to help me solve the problem!

score 0 · Answer 1 · answered Jun 29 '16 at 11:33

You can fill the null values of your dataframe with empty strings before computing the new column. Use fillna like this:

s.fillna('', inplace=True)
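
As a minimal sketch of how this could fit into the question's workflow: the example below fills the NaN values, strips the brackets and commas from every column in one pass, and then joins the columns. It assumes the three columns A, B and C from the question. The extra whitespace cleanup at the end is an addition of this sketch, because an empty string still leaves a stray space after joining.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['(Text,', '(Text1,'],
                  'B': ['(Text2,', '(Text3,'],
                  'C': ['(Text4,', np.nan]})

# Replace NaN with empty strings, then strip '(' and ',' from every column.
cleaned = s.fillna('').apply(lambda col: col.str.strip('(,'))

# Join the columns with spaces. A missing cell leaves a stray space,
# so collapse repeated whitespace and trim both ends afterwards.
sNew = cleaned['A'] + ' ' + cleaned['B'] + ' ' + cleaned['C']
sNew = sNew.str.replace(r'\s+', ' ', regex=True).str.strip()
print(sNew)
```

Note that this version does not modify s in place, so the original NaN values are still available if they are needed later.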

ysearka


score 0 · Accepted Answer · edited May 23 '17 at 10:30

I think you can use Kiwi's solution, with .strip('(,') added to remove the ( and , characters:

import pandas as pd
import numpy as np

s=pd.DataFrame({'A':['(Text,','(Text1,'],
                'B':[np.nan,'(Text3,'],
                'C':['(Text4,',np.nan]})
print(s)

         A        B        C
0   (Text,      NaN  (Text4,
1  (Text1,  (Text3,      NaN

def concat(*args):
    strs = [str(arg).strip('(,') for arg in args if not pd.isnull(arg)]
    return ','.join(strs) if strs else np.nan
np_concat = np.vectorize(concat)

s['new'] = np_concat(s.A, s.B, s.C)
print (s)
         A        B        C          new
0   (Text,      NaN  (Text4,   Text,Text4
1  (Text1,  (Text3,      NaN  Text1,Text3
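
For comparison, here is a sketch of the same idea written with a row-wise DataFrame.apply instead of np.vectorize. The helper name join_row is hypothetical, and the space separator is taken from the output the question asks for rather than from the code above. It is not necessarily faster, but it may be easier to read.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['(Text,', '(Text1,'],
                  'B': [np.nan, '(Text3,'],
                  'C': ['(Text4,', np.nan]})

def join_row(row):
    # Drop missing cells, strip '(' and ',' from the rest, join with spaces.
    parts = [str(value).strip('(,') for value in row.dropna()]
    # Keep NaN for rows in which every cell is missing.
    return ' '.join(parts) if parts else np.nan

s['new'] = s[['A', 'B', 'C']].apply(join_row, axis=1)
print(s)
```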

That's what I need. Thanks! – EnergyNet Jun 30 '16 at 12:17
