I have the following DataFrame df:

df =
date        time   val1
1/17/2018   18:00  20.0
1/17/2018   18:02  21.1
1/17/2018   18:10  23.2
1/17/2018   18:12  22.0
17/1/2018   18:12  22.1
17-Jan-2018 18:12  22.0
1/18/2018   60     22.1
aa          17:30  23.3
17/1/20188  18:00  19.0

A row should be deleted if either of these conditions holds:

  1. the date field does not match the format '%m/%d/%Y';
  2. the time field does not match the format "%H:%M".

Based on these two conditions, the last 5 rows of df should be deleted to produce a new, clean DataFrame.

How can I do it? Thanks.

Tatik

1 Answer

Here is one way: use pd.to_datetime with errors='coerce'. Any value that does not match the given format is returned as NaT, so notna() gives a boolean mask of the rows to keep.

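import pandas as pd

# Rows whose combined date and time do not match the format become NaT
s = pd.to_datetime(df.date + ' ' + df.time, format='%m/%d/%Y %H:%M', errors='coerce').notna()
df = df[s].copy()  # copy() avoids SettingWithCopyWarning on later assignments
df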
Out[212]: 
        date   time  val1
0  1/17/2018  18:00  20.0
1  1/17/2018  18:02  21.1
2  1/17/2018  18:10  23.2
3  1/17/2018  18:12  22.0
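
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. It assumes both columns hold plain strings (the question does not show the dtypes), and the names stamp, parsed and clean are only illustrative.

```python
import pandas as pd

# Sample data from the question, assuming both columns are plain strings
df = pd.DataFrame({
    "date": ["1/17/2018", "1/17/2018", "1/17/2018", "1/17/2018",
             "17/1/2018", "17-Jan-2018", "1/18/2018", "aa", "17/1/20188"],
    "time": ["18:00", "18:02", "18:10", "18:12",
             "18:12", "18:12", "60", "17:30", "18:00"],
    "val1": [20.0, 21.1, 23.2, 22.0, 22.1, 22.0, 22.1, 23.3, 19.0],
})

# Join date and time with a space so that one format string covers both fields
stamp = df["date"] + " " + df["time"]

# Values that do not match the format are coerced to NaT instead of raising
parsed = pd.to_datetime(stamp, format="%m/%d/%Y %H:%M", errors="coerce")

# Keep only the rows that parsed successfully
clean = df[parsed.notna()].copy()
print(clean)
```

The mask is built from the combined string, so a row is dropped when either field is malformed.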
BENY
  • What does `df[s].copy()` do? – Tatik Apr 09 '19 at 22:20
  • @Tatik https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – BENY Apr 09 '19 at 22:23
  • Also check out the 3rd answer in that link ^ – cs95 Apr 09 '19 at 22:25
  • I got the error `TypeError: data type " " not understood`. I use Python 3. – Tatik Apr 09 '19 at 22:30
  • `s=pd.to_datetime(df.date+df.time,format='%m/%d/%Y%H:%M',errors='coerce').notna()` @Tatik – BENY Apr 09 '19 at 22:35
  • Now I get `TypeError: unsupported operand type(s) for +: 'DatetimeIndex' and 'datetime.time'`. I also cannot print variables like this: `print("val"+val)`. It says that `+` cannot be used. I don't understand why this happens. – Tatik Apr 09 '19 at 22:44
  • `s=pd.to_datetime(df.date.astype(str)+df.time.astype(str),format='%m/%d/%Y%H:%M',errors='coerce').notna()` @Tatik – BENY Apr 09 '19 at 22:46
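
The errors in the comments above suggest the columns were not plain strings. One possible variation, sketched here under that assumption, is to cast each column to str and check the two formats separately. The helper name valid_rows and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd

def valid_rows(df, date_fmt="%m/%d/%Y", time_fmt="%H:%M"):
    """Return a copy of df with only the rows whose date and time match the formats.

    Hypothetical helper: assumes columns named 'date' and 'time'.
    """
    # astype(str) lets string concatenation and parsing work on non-string values
    date_ok = pd.to_datetime(df["date"].astype(str), format=date_fmt,
                             errors="coerce").notna()
    time_ok = pd.to_datetime(df["time"].astype(str), format=time_fmt,
                             errors="coerce").notna()
    # A row survives only if both fields parsed
    return df[date_ok & time_ok].copy()

# Small mixed-type example: the integer 60 is not a valid '%H:%M' time
sample = pd.DataFrame({
    "date": ["1/17/2018", "aa", "1/18/2018"],
    "time": ["18:00", "17:30", 60],
    "val1": [20.0, 23.3, 22.1],
})
result = valid_rows(sample)
print(result)
```

One caveat: if the time column holds datetime.time objects, str() renders them with seconds (for example 18:00:00), so time_fmt would need to be "%H:%M:%S" in that case.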