
Basically, I have this pd.Series:

0         03/25/93 Total time of visit (in minutes):\n
1                       6/18/85 Primary Care Doctor:\n
2    sshe plans to move as of 7/8/71 In-Home Servic...
3                7 on 9/27/75 Audit C Score Current:\n
4    2/6/96 sleep studyPain Treatment Pain Level (N...

When I try to iterate over it with a loop:

    for i, row in enumerate(df):
        d = row[i].len()

Or this:

    for row in df:
        d= row.len()

I get this error:

AttributeError: 'str' object has no attribute 'len'

I also get this error when I try other operations such as findall.

Hope somebody can enlighten me! Thanks.

user2629628
  • d=len(row[i]) ? – David Erickson Jun 12 '20 at 17:46
  • You can get the length of strings in a Series, with the aptly named [`Series.str.len`](https://pandas.pydata.org/docs/reference/api/pandas.Series.str.len.html?highlight=series%20str%20len) – ALollz Jun 12 '20 at 17:50
  • @WiktorStribiżew Not sure, I'm a beginner, but seems as if there is a difference between dataframe and series when it comes to loops – user2629628 Jun 12 '20 at 17:50
  • @ALollz When I try that in the loop I get the same error – user2629628 Jun 12 '20 at 17:51
  • @user2629628 don't use the loop. First you need a Series, let's call it `s`. (that might be what you call `df` above). Then all you do is `s.str.len()` – ALollz Jun 12 '20 at 17:52
  • @user2629628 So you're saying I can't use loops on series? – user2629628 Jun 12 '20 at 17:53
  • Does this answer your question? [How to iterate over rows in a DataFrame in Pandas](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) – Trenton McKinney Jun 12 '20 at 20:30

1 Answer


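You have to use the .str accessor to call string methods on a Series, and you don't need to iterate over every row. The loop fails because iterating over a Series yields plain Python strings, which have no .len() method (the built-in is len(row)).

This will do: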

df['str_len'] = df['str_column'].str.len()
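
To make the difference concrete, here is a minimal sketch that assumes the data is a plain Series named `s` built from three of the untruncated rows in the question (the variable names are illustrative, not from the original post). It compares the vectorized `.str.len()` with a loop that uses the built-in `len()`:

```python
import pandas as pd

# Assumption: three of the fully visible rows from the question, as a Series.
s = pd.Series([
    "03/25/93 Total time of visit (in minutes):\n",
    "6/18/85 Primary Care Doctor:\n",
    "7 on 9/27/75 Audit C Score Current:\n",
])

# Vectorized: the .str accessor applies len to every element and
# returns a new Series of integer lengths.
lengths = s.str.len()

# Loop version: iterating over a Series yields plain Python str objects,
# so the built-in len() works, while row.len() raises AttributeError.
loop_lengths = [len(row) for row in s]

print(lengths.tolist() == loop_lengths)
```

Both approaches should give the same numbers; the vectorized form is usually preferred because it is shorter and handles missing values without raising.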

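By the way, pandas exposes findall as .str.findall, which returns a list of all matches for each element. If you only need to know whether a substring or pattern occurs, use .str.contains, which returns a boolean mask you can index with.

Use it like this: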

substring = 'hello'

df[df['str_column'].str.contains(substring)]
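
As a rough illustration of how `.str.findall` and `.str.contains` differ, the sketch below uses sample rows from the question plus one made-up row without a date. The regex `date_pattern` and the extra row are assumptions for demonstration only, not part of the original post:

```python
import pandas as pd

# Assumption: rows from the question plus one hypothetical row with no date.
s = pd.Series([
    "03/25/93 Total time of visit (in minutes):\n",
    "6/18/85 Primary Care Doctor:\n",
    "7 on 9/27/75 Audit C Score Current:\n",
    "no date here",  # hypothetical row, added for contrast
])

# Hypothetical pattern for dates written as m/d/yy or mm/dd/yy.
date_pattern = r"\d{1,2}/\d{1,2}/\d{2}"

# .str.findall returns, for each element, a list of every match.
matches = s.str.findall(date_pattern)

# .str.contains returns a boolean Series (patterns are treated as regex by default).
has_date = s.str.contains(date_pattern)

# The boolean Series can be used as a mask to keep only the matching rows.
with_date = s[has_date]

print(matches.tolist())
print(has_date.tolist())
```

If the column may contain missing values, passing `na=False` to `.str.contains` keeps the mask strictly boolean.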
Sy Ker