-1
 LastUpdate            TS_UPDATE
0  2020-02-02  2019-10-30 15:27:20
6  2020-02-02  2019-10-30 15:27:14
8  2020-02-02  2019-10-30 15:27:07
9  2020-02-02  2019-10-30 15:27:07
10 2020-02-02  2019-10-30 15:27:07
11 2020-02-02  2019-10-30 15:27:05
12 2020-02-02  2019-10-30 15:27:04
13 2020-02-02  2019-10-30 15:27:03
14 2020-02-02  2019-10-30 15:27:03
15 2020-02-02  2019-10-30 15:27:02

How can I check whether LastUpdate is newer than TS_UPDATE? Or, probably better: check that TS_UPDATE is older than LastUpdate, and if that isn't the case, drop the row.

Can anyone help me out?

Is it possible to do this with a boolean comparison (> / <)?

for row in df.index:
    if df["LastUpdate"][row] < df["TS_UPDATE"][row]:
        df = df.drop(row)  # drop row
yannickhau

3 Answers

1

The idiomatic way is to just keep the relevant rows:

result = df[df["LastUpdate"] >= df["TS_UPDATE"]]
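
For a self-contained illustration, here is a minimal sketch of that filter. The sample frame reuses three rows from the question plus one hypothetical row (index 99) whose TS_UPDATE is newer, added only so the filter has something to remove. Parsing both columns with pd.to_datetime first is an assumption about how the data was loaded.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "LastUpdate": ["2020-02-02", "2020-02-02", "2020-02-02", "2020-02-02"],
        "TS_UPDATE": [
            "2019-10-30 15:27:20",
            "2019-10-30 15:27:14",
            "2019-10-30 15:27:07",
            "2021-01-01 00:00:00",  # hypothetical row, not in the question
        ],
    },
    index=[0, 6, 8, 99],
)

# Assumption: the columns may have been read as strings, so parse them first.
df["LastUpdate"] = pd.to_datetime(df["LastUpdate"])
df["TS_UPDATE"] = pd.to_datetime(df["TS_UPDATE"])

# Boolean mask: True where LastUpdate is at least as new as TS_UPDATE.
mask = df["LastUpdate"] >= df["TS_UPDATE"]
result = df[mask]
print(result)
```

Only the hypothetical row 99 is filtered out; the original df is left unchanged because the mask selects a new frame.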
Serge Ballesta
0

You can use the query method:

df.query("LastUpdate>=TS_UPDATE")

Very compact, SQL-like syntax.
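
As a rough sketch of how this behaves, the example below runs query on a tiny frame. The two sample rows are made up for illustration (the second is hypothetical and has a newer TS_UPDATE), and the engine="python" argument is optional; it is passed here only so the sketch does not depend on numexpr being installed.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "LastUpdate": ["2020-02-02", "2020-02-02"],
        # second TS_UPDATE value is hypothetical, chosen to fail the condition
        "TS_UPDATE": ["2019-10-30 15:27:20", "2021-01-01 00:00:00"],
    }
)
df["LastUpdate"] = pd.to_datetime(df["LastUpdate"])
df["TS_UPDATE"] = pd.to_datetime(df["TS_UPDATE"])

# Column names are referenced directly inside the expression string.
kept = df.query("LastUpdate >= TS_UPDATE", engine="python")

# The same rows selected with an explicit boolean mask, for comparison.
kept_mask = df[df["LastUpdate"] >= df["TS_UPDATE"]]
print(kept)
```

Both selections should contain the same rows; query returns a new frame unless inplace=True is passed.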

If you insist on iteration:

for i, r in df.iterrows():
    if r.LastUpdate < r.TS_UPDATE:
        df.drop(i, inplace=True)
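
If the goal is specifically to drop rows rather than to select the ones to keep, one possible variation is to collect the offending index labels first and drop them in a single call. This is only a sketch: the name stale and the sample rows (including the hypothetical row with index 99) are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "LastUpdate": pd.to_datetime(["2020-02-02", "2020-02-02", "2020-02-02"]),
        "TS_UPDATE": pd.to_datetime(
            # last value is hypothetical, newer than LastUpdate
            ["2019-10-30 15:27:20", "2019-10-30 15:27:14", "2021-01-01 00:00:00"]
        ),
    },
    index=[0, 6, 99],
)

# Index labels of the rows where LastUpdate is older than TS_UPDATE.
stale = df.index[df["LastUpdate"] < df["TS_UPDATE"]]

# Drop all of them at once instead of one row per loop iteration.
df = df.drop(index=stale)
print(df)
```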

ipj
0

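If the data are strings, the > and < comparisons still give an answer, because strings are compared lexicographically (character by character). If all the data in your dataframe is formatted the way you have shown (year first, then month, then day), lexicographic order matches chronological order, so the comparison should work. Instead of the for loop you can just do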

new_df = df[df["LastUpdate"] > df["TS_UPDATE"]]

However, if you really want to compare the values as dates, you can use pandas' pd.to_datetime function. This converts the strings to datetime values, which support comparison and date arithmetic:

df['LUdt'] = pd.to_datetime(df['LastUpdate'])
df['TSdt'] = pd.to_datetime(df['TS_UPDATE'])
new_df = df[df['LUdt'] > df['TSdt']]
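
To illustrate why the conversion can matter, here is a small sketch with dates written day-first (DD.MM.YYYY). That format and the single sample row are assumptions chosen for the example, not taken from the question. The plain string comparison gives the wrong answer here, while the parsed datetimes compare correctly.

```python
import pandas as pd

# Hypothetical day-first strings: 2 Feb 2020 and 30 Oct 2019.
df = pd.DataFrame({"LastUpdate": ["02.02.2020"], "TS_UPDATE": ["30.10.2019"]})

# String comparison looks at characters: "0" sorts before "3",
# so LastUpdate appears to be the older value.
as_text = df["LastUpdate"] > df["TS_UPDATE"]

# Parsing with an explicit format compares real dates instead.
lu = pd.to_datetime(df["LastUpdate"], format="%d.%m.%Y")
ts = pd.to_datetime(df["TS_UPDATE"], format="%d.%m.%Y")
as_dates = lu > ts

print(bool(as_text.iloc[0]), bool(as_dates.iloc[0]))
```

For year-first strings like the ones in the question, both comparisons agree, so the conversion mainly protects against other formats.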
Teshan Shanuka J