I want to drop rows dated before May 1, 2018.

First, I converted the source date column to datetime:

df['DATE'] = pd.to_datetime(df['DATE'],format='%d/%m/%Y')
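
For reference, a minimal sketch of what this conversion does, assuming the raw DATE column holds day/month/year strings (the sample values below are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; the real column is assumed to contain
# strings such as "30/04/2018" (day first, then month, then year).
df = pd.DataFrame({"DATE": ["30/04/2018", "01/05/2018", "02/05/2018"]})

# Parse the strings into datetime values using the explicit format.
df["DATE"] = pd.to_datetime(df["DATE"], format="%d/%m/%Y")

# Once the column is datetime, it can be compared against a Timestamp.
print(df["DATE"] >= pd.Timestamp("2018-05-01"))
```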

How can I specify that the cut-off date is 2018-04-30?

As a third step, I need to combine this with a couple more conditions and then remove the matching rows from the raw data:

df2 = df[(df['REFERENCE'].str.contains("ABC") & (df['COUNTRY'] == 'countryname') & <<how can I reference the cut-off date and the DATE column that holds all the dates?>>) == False]

ANSWER - Thanks for all the comments.

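raw_df.drop(raw_df[raw_df['REFERENCE'].str.contains("prefix") & (raw_df['COUNTRY'] == 'countryname') & (raw_df['DATE'].lt('2018-05-01'))].index, inplace=True)

lt('2018-05-01') matches dates strictly before May 1, so rows dated May 1 itself are kept. Use ge('2018-05-01') instead if you want to match dates on or after the cut-off.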
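
An equivalent way to write this, as a sketch only: build the boolean mask first and keep the rows that do not match, rather than dropping in place. The column names follow the question; the sample rows and the names `cutoff` and `to_drop` are made up for illustration.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question.
raw_df = pd.DataFrame({
    "REFERENCE": ["ABC-1", "ABC-2", "XYZ-3", "ABC-4"],
    "COUNTRY": ["countryname", "countryname", "countryname", "other"],
    "DATE": pd.to_datetime(
        ["30/04/2018", "01/05/2018", "15/03/2018", "01/01/2018"],
        format="%d/%m/%Y",
    ),
})

cutoff = pd.Timestamp("2018-05-01")

# A row is dropped only when all three conditions hold.
to_drop = (
    raw_df["REFERENCE"].str.contains("ABC")
    & (raw_df["COUNTRY"] == "countryname")
    & (raw_df["DATE"] < cutoff)
)

# Keep everything that does not match the mask.
df2 = raw_df[~to_drop]
print(df2)
```

Negating the mask with `~` avoids the `== False` comparison and leaves `raw_df` untouched, which can make it easier to check what was removed.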
Ziggy

1 Answer

You can use:

df2 = df[df['DATE'].ge('2018-05-01')]

Example input:

        DATE
0 2018-05-02
1 2018-05-01
2 2018-04-30

Output:

        DATE
0 2018-05-02
1 2018-05-01
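
The example above can be reproduced with a short self-contained sketch. It assumes the DATE column is already datetime and compares against an explicit `pd.Timestamp` (a choice made here for clarity) rather than a string:

```python
import pandas as pd

# Same three dates as in the example input.
df = pd.DataFrame(
    {"DATE": pd.to_datetime(["2018-05-02", "2018-05-01", "2018-04-30"])}
)

# Keep rows dated on or after the cut-off.
cutoff = pd.Timestamp("2018-05-01")
df2 = df[df["DATE"] >= cutoff]
print(df2)
```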
mozway