I am trying to follow the otherwise excellent solution provided in the thread pandas-counting-and-summing-specific-conditions, but the code only ever outputs nan values, and with sum (not count), gives a future warning.
Basically, for each row in my df, I want to count how many dates in a column are within a range of +/- 1 days of other dates in the same column.
If I were doing it in excel, the following muti-conditional sumproduct or countifs are possible:
= SUMPRODUCT(--(AN2>=$AN$2:$AN$35000-1),--(AN2<=$AN$2:$AN$35000+1)),
or
=countifs($AN$2:$AN$35000,">="&AN2-1,$AN$2:$AN$35000,"<="&AN2+1)
In python, trying the approach in the linked thread, I believe the code would be:
import pandas as pd
import datetime
df = pd.DataFrame({'datet': [pd.to_datetime("2020-03-04 00:00:00"), pd.to_datetime("2020-03-05 00:00:00"),\
pd.to_datetime("2020-03-09 00:00:00"), pd.to_datetime("2020-03-10 00:00:00"),\
pd.to_datetime("2020-03-11 00:00:00"), pd.to_datetime("2020-03-12 00:00:00")]})
df["caseIntensity"] = df[(df['datet'] <= df['datet'] + datetime.timedelta(days=1)) &\
(df['datet'] >= df['datet'] - datetime.timedelta(days=1))].sum()
The output should be: 2, 2, 2, 3, 3, 2. Instead it is wholemeal nan!
Is it correct to assume that because I'm testing conditions, it doesn't matter if I sum or count? If I need to sum, I get a future warning about invalid columns (the columns are valid), which I don't understand. But mostly, my question is why am I only getting nan?