1

I have a dataframe with 6 columns and I wanted to use 5 columns of it (i.e FIRST TRAVEL, SECOND TRAVEL, THIRD TRAVEL) which contains date in it. From these 5 columns, I wanted to consider the greatest date and compare it against the given date "2020-09-25 00:00:00"

Following Condition must be met:

  • If the date is greater than the input date, then it is good. If not we will have to write as "Offer Expired" in a new column named RESULT.

,

  Customer Name        FIRST TRAVEL       SECOND TRAVEL        THIRD TRAVEL       FOURTH TRAVEL        FIFTH TRAVEL         RESULT
0         USER1                 NaT 2020-09-02 08:21:59                 NaT                 NaT                 NaT  Offer Expired
1         USER2                 NaT 2014-11-05 15:23:38                 NaT                 NaT                 NaT  Offer Expired
2         USER3                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
3         USER4                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
4         USER5                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
5         USER6                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
6         USER7                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
7         USER8                 NaT                 NaT                 NaT                 NaT                 NaT            NaN
8         USER9                 NaT 2020-09-02 10:07:11                 NaT                 NaT                 NaT  Offer Expired
9        USER10 2020-03-16 00:00:00                 NaT                 NaT                 NaT                 NaT  Offer Expired
10       USER11 2019-12-11 00:00:00                 NaT                 NaT                 NaT                 NaT  Offer Expired
11       USER12 2020-09-26 00:00:00 2020-04-14 00:00:00                 NaT                 NaT                 NaT            NaN
12       USER13 2020-04-20 00:00:00 2019-10-18 00:00:00                 NaT                 NaT                 NaT  Offer Expired
13       USER14 2020-02-21 00:00:00 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
14       USER15 2020-01-17 00:00:00 2019-10-17 00:00:00                 NaT                 NaT                 NaT  Offer Expired
15       USER16                 NaT 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
16       USER17                 NaT 2019-08-24 00:00:00                 NaT                 NaT                 NaT  Offer Expired
17       USER18                 NaT 2019-11-01 00:00:00                 NaT                 NaT                 NaT  Offer Expired
18       USER19                 NaT 2019-09-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
19       USER20                 NaT 2020-01-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
20       USER21                 NaT 2019-09-13 00:00:00                 NaT                 NaT                 NaT  Offer Expired
21       USER22                 NaT 2020-04-20 00:00:00                 NaT                 NaT                 NaT  Offer Expired
22       USER23                 NaT 2020-02-12 00:00:00                 NaT                 NaT                 NaT  Offer Expired
23       USER24                 NaT 2019-10-18 00:00:00                 NaT                 NaT                 NaT  Offer Expired
24       USER25 2020-09-06 22:09:22 2020-04-07 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
25       USER26 2020-09-06 22:09:22 2020-04-21 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
26       USER27                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
27       USER28                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
28       USER29 2020-09-06 22:09:22 2020-04-17 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
29       USER30 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
30       USER31 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
31       USER32 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
32       USER33 2020-09-06 22:09:22                 NaT                 NaT                 NaT 2020-06-03 19:45:36  Offer Expired
33       USER34 2020-09-06 22:09:22 2020-10-27 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36            NaN
34       USER35 2020-09-06 22:09:22 2019-06-18 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
35       USER36 2020-09-06 22:09:22 2020-04-15 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
36       USER37 2020-09-06 22:09:22 2020-09-04 15:29:45 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired
37       USER38 2020-09-06 22:09:22                 NaT                 NaT 2020-09-25 17:03:20 2020-06-03 19:45:36            NaN
38       USER39                 NaT                 NaT                 NaT 2020-09-04 17:03:20 2020-06-03 19:45:36  Offer Expired

Note: This has been simpler in Excel where we can use the below formula. However, I couldn't find a way to do this Pandas.

=IF(COUNTBLANK($B2:$F2)=5,"", IF(MAX($B2:$F2)>$H$1,"","Offer Expired"))

Any help is appreciated.

Gokulnath Kumar
  • 105
  • 1
  • 3
  • 17
  • I would start by applying the pandas melt function to get a new dataframe with just two columns "User" and "Travel Date" see the following [Pandas Melt with Multiple Value Vars](https://stackoverflow.com/questions/45066873/pandas-melt-with-multiple-value-vars) for approach. Then you can sort by dates and group by users and easily find the travel dates for each user and determine if they fall into your time frame. – itprorh66 Nov 25 '20 at 00:55

3 Answers3

2

This will do the trick (comments inline)

import numpy as np
import pandas as pd

# I'm assuming all the relevant columns are already converted:

dt = pd.to_datetime("2020-09-25 00:00:00") 

# you need to indicate somehow the columns to compare - using regex on column names:

dftravels = df.filter(regex=".* TRAVEL$", axis=1)

# NaT and any logical 2 argument operation on it always will evaluate to False
# hence you check only ones where there's not a single date after dt and where there's at least one date overall

df["Result"] = np.where(~dftravels.gt(dt).any(axis=1) & dftravels.any(axis=1), "Offer expired", "")

Grzegorz Skibinski
  • 12,624
  • 2
  • 11
  • 34
2
import pandas as pd

# Date to compare with
my_date = pd.to_datetime("2020-09-25 00:00:00")

# Columns to search in
columns = [
  "FIRST TRAVEL", "SECOND TRAVEL", "THIRD TRAVEL",
  "FOURTH TRAVEL", "FIFTH TRAVEL"
]

# Function to find if the offer expired
def offer_expired(row):
  # Returns True if the offer expires
  date_found = False
  expired = True
  for column in columns:
    # Valid date
    if not pandas.isnull(row[column]):
      date_found = True
      # Expired and date newer than given date
      if row[column] >= my_date and expired:
        expired = False
  return expired and date_found

df["RESULT"] = df.apply(lambda row: "Offer Expired" if offer_expired(row) else "", axis=1)
François B.
  • 1,096
  • 7
  • 19
0

Best way you can translate excel expressions is to use np.where():

given_date = pd.to_datetime("2020-09-25 00:00:00")
columns = df.columns.str.endswith('TRAVEL') # select as per you want you just need a list of columns to go with
df.RESULT = np.where(df.loc[:,columns].max(axis = 1) < given_date, 'Offer Valid', 'Offer Expired')

Also try to read the expression from left to right: where selected columns of df have row-wise max of dates less than given date then offer is valid else offer is expired.

It translates exactly to your question! 1 line, fully functional and highly intuitive! Pandas is fun!

Hamza
  • 5,373
  • 3
  • 28
  • 43