Return the sum of all characters in a row to another column in pandas

Question

Suppose I have this dataframe df:

column1      column2                                            column3
amsterdam    school yeah right backtic escapes sport swimming   2016
rotterdam    nope yeah                                          2012
thehague     i now i can fly no you cannot swimming rope        2010
amsterdam    sport cycling in the winter makes me               2019

How do I get the number of characters (excluding whitespace) in each row of column2 and store it in a new column4, like this:

column1      column2                                            column3    column4
amsterdam    school yeah right backtic escapes sport swimming   2016       70
rotterdam    nope yeah                                          2012       8
thehague     i now i can fly no you cannot swimming rope        2010       65
amsterdam    sport cycling in the winter makes me               2019       55

I tried this code, but it puts the total character count across all rows of column2 into every row:

df['column4'] = sum(list(map(lambda x : sum(len(y) for y in x.split()), df['column2'])))

So currently my df looks like this:

column1      column2                                            column3    column4
amsterdam    school yeah right backtic escapes sport swimming   2016       250
rotterdam    nope yeah                                          2012       250
thehague     i now i can fly no you cannot swimming rope        2010       250
amsterdam    sport cycling in the winter makes me               2019       250

Does anybody have an idea?

You might want to change the expected output, as it is misleading; it doesn't seem correct. – anky, Jan 24 '20 at 07:10

jezrael · Accepted Answer · 2020-01-24T07:11:42.820

3

Keep your per-string logic, but apply it row by row with Series.apply and a lambda instead of wrapping it in an outer sum:

df['column4'] = df['column2'].apply(lambda x: sum(len(y) for y in x.split()))
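A minimal sketch of why the original attempt put one number in every row, using the question's column2 values (the column name `broadcast` is made up here for illustration). The outer sum() reduces the per-row counts to a single scalar, and pandas broadcasts a scalar to all rows on assignment; apply keeps one value per row.

```python
import pandas as pd

df = pd.DataFrame({
    "column2": [
        "school yeah right backtic escapes sport swimming",
        "nope yeah",
        "i now i can fly no you cannot swimming rope",
        "sport cycling in the winter makes me",
    ]
})

# Per-row counts: one number for each string, whitespace excluded.
per_row = [sum(len(word) for word in text.split()) for text in df["column2"]]

# The outer sum() in the original attempt collapses them into one scalar.
grand_total = sum(per_row)

# Assigning a scalar broadcasts it to every row (hypothetical column name).
df["broadcast"] = grand_total

# Assigning the result of apply keeps one value per row.
df["column4"] = df["column2"].apply(
    lambda text: sum(len(word) for word in text.split())
)

print(df[["broadcast", "column4"]])
```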

Alternatively, take the length of each string with Series.str.len and subtract the number of spaces counted by Series.str.count:

df['column4'] = df['column2'].str.len().sub(df['column2'].str.count(' '))
# the same logic rewritten as a custom function:
# df['column4'] = df['column2'].map(lambda x: len(x) - x.count(' '))
print(df)
     column1                                           column2  column3  \
0  amsterdam  school yeah right backtic escapes sport swimming     2016   
1  rotterdam                                         nope yeah     2012   
2   thehague       i now i can fly no you cannot swimming rope     2010   
3  amsterdam              sport cycling in the winter makes me     2019   

   column4  
0       42  
1        8  
2       34  
3       30
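The two approaches agree on the question's data, but they can differ when the text contains whitespace other than single spaces. The sketch below uses made-up sample strings to show this: split() with no argument splits on any whitespace, while str.count(' ') subtracts only literal space characters, so a tab is still counted as a character.

```python
import pandas as pd

# Hypothetical sample strings: single space, double space, tab, padded.
s = pd.Series(["a b", "a  b", "a\tb", " a b "])

# split() without arguments treats any run of whitespace as a separator.
by_split = s.apply(lambda text: sum(len(word) for word in text.split()))

# str.count(" ") subtracts literal spaces only; the tab stays in the total.
by_space_count = s.str.len().sub(s.str.count(" "))

print(pd.DataFrame({"by_split": by_split, "by_space_count": by_space_count}))
```

If the column can contain tabs or newlines, the split-based version is likely the safer choice.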

edited Jan 24 '20 at 07:11

answered Jan 24 '20 at 07:07

jezrael


the second one is quite smart :) +1 – anky Jan 24 '20 at 07:09
works like a charm @jezrael. thank you. Any interesting link so that I can read and dive more into lambda in python? thank you again – Jack Zaki Zakiul Fahmi Jailani Jan 24 '20 at 07:11
@JackZakiZakiulFahmiJailani - You can check [this](https://stackoverflow.com/questions/890128/why-are-python-lambdas-useful) – jezrael Jan 24 '20 at 07:14

@jezrael you can make it even simpler. check my answer. – Mykola Zotko Jan 24 '20 at 08:04

score 1 · Answer 2 · answered Jan 24 '20 at 07:13

Hi, this works for me:

import pandas as pd

df = pd.DataFrame({'col1': ['Stack Overflow', 'The Guy']})
# remove the spaces, then take the length of what is left
df['Count Of Chars'] = df['col1'].str.replace(" ", "").str.len()
df

Output

             col1  Count Of Chars
0  Stack Overflow              13
1         The Guy               6
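A possible variation on the same idea, assuming the text may contain tabs or other whitespace besides spaces: pass a regular expression with regex=True so that every whitespace run is removed before measuring the length. The third sample string is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["Stack Overflow", "The Guy", "tab\tseparated words"]})

# \s+ matches any run of whitespace (spaces, tabs, newlines);
# regex=True makes str.replace treat the pattern as a regular expression.
df["Count Of Chars"] = df["col1"].str.replace(r"\s+", "", regex=True).str.len()

print(df)
```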

answered Jan 24 '20 at 07:13

The Guy


score 1 · Answer 3 · answered Jan 24 '20 at 07:46

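You can use the count method with a regular expression pattern that matches word characters (use a raw string so the backslash is not treated as a string escape):

df['column2'].str.count(pat=r'\w')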

Output:

0    42
1     8
2    34
3    30
Name: column2, dtype: int64
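One caveat worth checking against your own data: \w matches only letters, digits, and underscores, so punctuation is not counted. If "all characters except whitespace" should include punctuation, \S (any non-whitespace character) is a closer fit. A small sketch with made-up strings:

```python
import pandas as pd

# Hypothetical sample strings, two of them chosen to include non-letters.
s = pd.Series(["hello, world!", "snake_case 42", "nope yeah"])

# \w counts letters, digits, and underscores only.
word_chars = s.str.count(r"\w")

# \S counts every character that is not whitespace, punctuation included.
non_space = s.str.count(r"\S")

print(pd.DataFrame({"text": s, "word_chars": word_chars, "non_space": non_space}))
```

For plain lowercase words like those in the question, both patterns give the same counts.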

answered Jan 24 '20 at 07:46

Mykola Zotko

