3

I have a dataset here - https://docs.google.com/spreadsheets/d/e/2PACX-1vSBxKr2RNSNBt1WLWyqnoBKFPI2YvACVqZpqmmdIR5buI2lF3I76Ta3rIoiirHkxin6guH_ywGsPzc1/pub?gid=1429325005&single=true&output=csv

I want to make a feature "warning" as Y if the "date" of that "location" falls between the "risk_from" and "risk_to" dates of that location. If not, "warning" has to be a N.

Like this:

location    case     date       risk_from   risk_to warning
A                   2/1/2021                            N
A         Confirmed 2/2/2021    2/2/2021    2/4/2021    Y
A                   2/3/2021                            Y
A         Confirmed 2/4/2021    2/4/2021    2/6/2021    Y
A                   2/5/2021                            Y
A                   2/6/2021                            Y
A                   2/7/2021                            N
B         Confirmed 2/8/2021    2/8/2021    2/10/2021   Y
B                   2/9/2021                            Y
B                   2/10/2021                           Y
B                   2/11/2021                           N
B                   2/12/2021                           N

I tried to find the range of risk_from and risk_to dates and making them into a set. After that I tried to check each and every current date. If they are in the set then the warning is "Y", if not then the warning is "N". I can find the range between the dates using this answer here but I can't achieve what I want. So looking for expert's advice.

JSVJ
  • 500
  • 3
  • 14

1 Answers1

2

Use parse_dates argument while reading the CSV file. Fill in the missing risk column values and then compare them with the date column.

risk_cols = ['risk_from', 'risk_to']
df = pd.read_csv('Book1 - Book1.csv', parse_dates=['date'] + risk_cols)

df[risk_cols] = df.groupby('location')[risk_cols].apply(lambda x: x.ffill().bfill())

df['warning_out'] = np.where(df.date.between(df.risk_from, df.risk_to), 'Y', 'N')

Output

   location       case       date  risk_from    risk_to warning warning_out
0         A        NaN 2021-02-01 2021-02-02 2021-02-04       N           N
1         A  Confirmed 2021-02-02 2021-02-02 2021-02-04       Y           Y
2         A        NaN 2021-02-03 2021-02-02 2021-02-04       Y           Y
3         A  Confirmed 2021-02-04 2021-02-04 2021-02-06       Y           Y
4         A        NaN 2021-02-05 2021-02-04 2021-02-06       Y           Y
5         A        NaN 2021-02-06 2021-02-04 2021-02-06       Y           Y
6         A        NaN 2021-02-07 2021-02-04 2021-02-06       N           N
7         B  Confirmed 2021-02-08 2021-02-08 2021-02-10       Y           Y
8         B        NaN 2021-02-09 2021-02-08 2021-02-10       Y           Y
9         B        NaN 2021-02-10 2021-02-08 2021-02-10       Y           Y
10        B        NaN 2021-02-11 2021-02-08 2021-02-10       N           N
11        B        NaN 2021-02-12 2021-02-08 2021-02-10       N           N
Vishnudev Krishnadas
  • 10,679
  • 2
  • 23
  • 55