
I have a dataframe like the following:

  Travel Date
0  2020-09-23
1  2020-09-24
2  2020-09-30
3         NaT
4  2015-10-15
5  2018-07-30
6         NaT
7  2019-09-25
8  2018-06-05

I want to check whether a given custom date is greater than the date in the 'Travel Date' column and write the result to a new column as Passed, but I want to ignore the rows containing NaT.

At the moment, however, the rows with NaT are also included in the comparison and marked as Passed.

  Travel Date  Detail
0  2020-09-23  Passed
1  2020-09-24  Passed
2  2020-09-30     NaN
3         NaT  Passed
4  2015-10-15  Passed
5  2018-07-30  Passed
6         NaT  Passed
7  2019-09-25  Passed
8  2018-06-05  Passed

I tried the following code, but it also includes the NaT rows in the comparison and marks them as Passed.

df1['Travel Date']= pd.to_datetime(df1['Travel Date'])
test = df1['Travel Date'] > '2020-09-29  12:00:00'
df1.loc[~test, "Detail"] = "Passed"
mck
Gokulnath Kumar

2 Answers


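Any comparison with NaT or NaN (other than !=) evaluates to False, so the NaT rows come out as False in test and therefore as True in ~test. You can just invert your condition instead of negating the result: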

df['Detail'] = np.where(df['Travel Date'] <= '2020-09-29  12:00:00', 'Passed', np.nan)

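Or similarly with .loc, which leaves the unmatched rows as real missing values (NaN), whereas np.where with 'Passed' and np.nan typically yields the string 'nan', as in the output below: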

df.loc[df['Travel Date'] <= '2020-09-29  12:00:00', 'Detail'] = 'Passed'

Output:

  Travel Date  Detail
0  2020-09-23  Passed
1  2020-09-24  Passed
2  2020-09-30     nan
3         NaT     nan
4  2015-10-15  Passed
5  2018-07-30  Passed
6         NaT     nan
7  2019-09-25  Passed
8  2018-06-05  Passed
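
To make the point about NaT comparisons concrete, here is a minimal sketch on a small made-up frame (the four sample rows and the names `cutoff`, `later`, `not_later` and `on_or_before` are illustrative assumptions, not from the question). It shows why negating the `>` mask picks up the NaT rows while the inverted `<=` condition does not.

```python
import pandas as pd

# Small illustrative frame; the third row becomes NaT after conversion.
df = pd.DataFrame({"Travel Date": ["2020-09-23", "2020-09-30", None, "2015-10-15"]})
df["Travel Date"] = pd.to_datetime(df["Travel Date"])

cutoff = pd.Timestamp("2020-09-29 12:00:00")

later = df["Travel Date"] > cutoff           # NaT row compares as False
not_later = ~later                           # ...so negation turns it into True
on_or_before = df["Travel Date"] <= cutoff   # NaT row compares as False here too

# Only rows with a real date on or before the cutoff are marked.
df.loc[on_or_before, "Detail"] = "Passed"
print(df)
```

Because the NaT row is False under both `>` and `<=`, choosing the condition that directly describes the rows to mark avoids the need for a negation.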
Quang Hoang
  • Can you also tell me how to perform a similar comparison against dates in several different columns? [Here is my query](https://stackoverflow.com/questions/64994520/comparing-date-with-multiple-columns-in-pandas) – Gokulnath Kumar Nov 25 '20 at 13:59

You can add a check for NaT:

df1['Travel Date'] = pd.to_datetime(df1['Travel Date'])
# Treat NaT rows like "later" dates so that ~test excludes them as well
test = (df1['Travel Date'] > '2020-09-29 12:00:00') | pd.isna(df1['Travel Date'])
df1.loc[~test, "Detail"] = "Passed"
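
An equivalent way to write the NaT check is to build the "has a date" mask first and combine it with the comparison. This is only a sketch on made-up sample rows; `cutoff`, `has_date` and `passed` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Illustrative data; the last row becomes NaT after conversion.
df1 = pd.DataFrame({"Travel Date": ["2020-09-24", "2020-09-30", None]})
df1["Travel Date"] = pd.to_datetime(df1["Travel Date"])

cutoff = pd.Timestamp("2020-09-29 12:00:00")

has_date = df1["Travel Date"].notna()                # False for NaT rows
passed = has_date & ~(df1["Travel Date"] > cutoff)   # real date, not after cutoff

df1.loc[passed, "Detail"] = "Passed"
print(df1)
```

Keeping the missing-value test as its own named mask makes it explicit that NaT rows are skipped on purpose rather than as a side effect of the comparison.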
mck