I have a df like this:

 label                                               data     start
37   1  Ses01M_impro04_F018 [145.2100-153.0500]: We're...  145.21000
38   2  Ses01M_impro04_M019 [148.3800-151.8400]: Well,...  148.38000
39   2                                     M: [BREATHING]  BREATHING
40   1  Ses01M_impro04_M020 [159.7700-161.8600]: I'm n...  159.77000

I parsed out the start column to get the starting timestamp for each row using this code:

df['start'] = df.data.str.split().str[1].str[1:-2].str.split('-').str[0]

I want to convert df.start to floats because the values are currently strings. However, I can't simply call .astype(float) because of the actual string BREATHING in row 39.

I'd like to just drop any row whose start value contains alphabetic characters (row 39 here). I don't know how to do this because, at this point, all values in df.start are strings, including the numeric-looking ones, so I can't filter with something like isnumeric(). How do I do this?

connor449
[`to_numeric`](https://pandas.pydata.org/docs/reference/api/pandas.to_numeric.html#pandas-to-numeric): `df['start'] = pd.to_numeric(df['start'], errors='coerce')`. You could also wrap `df.data.str.split().str[1].str[1:-2].str.split('-').str[0]` with `to_numeric` to avoid two assignments. – Henry Ecker Jul 28 '21 at 17:45

Part 1 of the accepted answer goes through all of the options for how to use `to_numeric`. – Henry Ecker Jul 28 '21 at 17:46

1 Answer

Here is some skeleton code that you can modify and use:

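def convert(a):
    # str.isnumeric() returns False for '145.21000' because of the '.',
    # so attempt the conversion and fall back to the original value.
    try:
        newa = float(a)
    except ValueError:
        newa = a
    return newa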
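
For the whole column at once, a vectorized route along the lines of the `to_numeric` suggestion in the comments might look like the sketch below. The frame is a made-up stand-in built from the values shown in the question, not the asker's real data, and it assumes that dropping every row that fails to parse is the desired outcome.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: 'start' holds strings,
# one of which ('BREATHING') is not a number.
df = pd.DataFrame(
    {
        "label": [1, 2, 2, 1],
        "start": ["145.21000", "148.38000", "BREATHING", "159.77000"],
    },
    index=[37, 38, 39, 40],
)

# errors='coerce' turns anything that cannot be parsed into NaN instead of raising.
df["start"] = pd.to_numeric(df["start"], errors="coerce")

# Drop the rows where parsing failed (row 39 in this sample).
df = df.dropna(subset=["start"])

print(df)
```

After this, `df['start']` should have a float dtype, and the row that held `BREATHING` is gone. If the non-numeric rows need to be kept, skipping the `dropna` step leaves them as `NaN`.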