I have a dataframe that looks like:

  col_1
0  A
1  A:C:D
2  A:B:C:D:E
3  B:D

I'm trying to count the ':' characters in each row to get:

  col_1        count
0  A             0
1  A:C:D         2
2  A:B:C:D:E     4
3  B:D           1

I've tried applying a function to no avail:

def count_fx(df):
    return df.str.contains(':').sum()
df['count'] = df['col_1'].apply(count_fx)

I also tried:

df['count'] = df['col_1'].apply(lambda x: (x.str.contains(':')).sum(), axis=1)
DataSwede

2 Answers

You need to use Python's str.count method on each element:

df['count'] = df[0].apply(lambda x: x.count(':'))

           0  count
0      A:C:D      2
1  A:B:C:D:E      4
2        B:D      1
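
As a possible alternative, pandas also exposes a vectorized Series.str.count, which avoids apply altogether. The sketch below assumes the column is named col_1 as in the question and rebuilds the sample frame by hand. Note that Series.str.count treats its argument as a regular expression; ':' has no special meaning there, so it is matched literally.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed column name: col_1)
df = pd.DataFrame({"col_1": ["A", "A:C:D", "A:B:C:D:E", "B:D"]})

# Vectorized count of ':' in every row; the pattern is interpreted as a regex
df["count"] = df["col_1"].str.count(":")

print(df)
```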
Russia Must Remove Putin

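Calling apply() on a single column passes each element of that column to your function, one at a time; neither the column nor the DataFrame is what gets passed in. Each element here is a plain Python string, which has no .str accessor, so you just need to change your function to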

def count_fx(s):
    return s.count(':')

Then you can use what you had:

In [11]: df['count'] = df['col_1'].apply(count_fx)

In [12]: df
Out[12]:
       col_1  count
0          A      0
1      A:C:D      2
2  A:B:C:D:E      4
3        B:D      1
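
To see why the original attempts failed, here is a small sketch (the variable names are only for illustration) that records the type of each value apply hands to the function, and shows what happens when .str is used on one of them:

```python
import pandas as pd

s = pd.Series(["A", "A:C:D", "A:B:C:D:E", "B:D"])

# Each call receives one element, so the reported type is the built-in str
element_types = s.apply(lambda x: type(x).__name__).tolist()
print(element_types)

# A plain str has no .str accessor, so the question's function raises here
try:
    s.apply(lambda x: x.str.contains(":").sum())
    failed = False
except AttributeError as exc:
    failed = True
    print("AttributeError:", exc)
```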
Russia Must Remove Putin
chrisaycock