OK, so this is a tricky question to describe.
I am putting together a compliance report for a study in which the team needs to know whether, at least once per month, each user performed a particular task on at least 7 days out of an 8-day period.
So I need a way to:
- For a given user and a given month, search the month for an 8-day period where at least 7 of those 8 days were marked "TRUE"
- Return a 1 if the requirement was met, 0 if the requirement was not met.
Here is an example of the data structure:
import pandas as pd

ids = 1
req_met = ['TRUE', 'TRUE', 'FALSE', 'FALSE', 'TRUE', 'TRUE', 'TRUE', 'FALSE', 'TRUE', 'TRUE',
           'TRUE', 'TRUE', 'FALSE', 'FALSE', 'FALSE', 'FALSE', 'TRUE', 'TRUE', 'TRUE', 'FALSE']
date = ['2018-01-01', '2018-01-03', '2018-01-04', '2018-01-05', '2018-01-06',
        '2018-01-07', '2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11',
        '2018-01-12', '2018-01-13', '2018-01-14', '2018-01-15', '2018-01-16',
        '2018-01-17', '2018-01-18', '2018-01-19', '2018-01-21', '2018-01-23']
df = pd.DataFrame({'id': ids,
                   'Date': date,
                   'req_met': req_met})
print(df)
    id        Date  req_met
0    1  2018-01-01     TRUE
1    1  2018-01-03     TRUE
2    1  2018-01-04    FALSE
3    1  2018-01-05    FALSE
4    1  2018-01-06     TRUE
5    1  2018-01-07     TRUE
6    1  2018-01-08     TRUE
7    1  2018-01-09    FALSE
8    1  2018-01-10     TRUE
9    1  2018-01-11     TRUE
10   1  2018-01-12     TRUE
11   1  2018-01-13     TRUE
12   1  2018-01-14    FALSE
13   1  2018-01-15    FALSE
14   1  2018-01-16    FALSE
15   1  2018-01-17    FALSE
16   1  2018-01-18     TRUE
17   1  2018-01-19     TRUE
18   1  2018-01-21     TRUE
19   1  2018-01-23    FALSE
For this user, the answer returned would be 1, because there is an 8-day period in which 7 of the days were 'TRUE' (2018-01-06 to 2018-01-13). Note that the dates are not always consecutive (for example, 2018-01-02 is missing), which is an added complication.
My desired output is the most efficient function that takes this data and returns either 1 (requirement met) or 0 (requirement not met).
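To make the requirement concrete, here is a rough sketch of the kind of check I have in mind. It is not necessarily the most efficient approach. It assumes one row per user per date, treats a missing date as "not met", and only considers windows that fall inside a single calendar month. The names `requirement_met` and `compliance_by_month` are hypothetical, made up for illustration.

```python
import pandas as pd


def requirement_met(dates, flags, window_days=8, min_true=7):
    """Return 1 if some window of `window_days` calendar days contains
    at least `min_true` TRUE days, otherwise 0.

    Assumes at most one row per date; missing dates count as not met.
    """
    met = pd.Series(
        [1.0 if str(f).upper() == "TRUE" else 0.0 for f in flags],
        index=pd.to_datetime(list(dates)),
        dtype=float,
    ).sort_index()
    # Time-based rolling window: "8D" covers each date plus the 7 calendar
    # days before it, so gaps in the dates are handled by the index itself.
    counts = met.rolling(f"{window_days}D").sum()
    return int((counts >= min_true).any())


def compliance_by_month(df):
    """Hypothetical helper: one 0/1 result per (id, month).

    Assumes the columns are named 'id', 'Date' and 'req_met' as in the
    example, and that a qualifying window must lie within one month.
    """
    months = pd.to_datetime(df["Date"]).dt.to_period("M")
    return {
        (user, str(month)): requirement_met(g["Date"], g["req_met"])
        for (user, month), g in df.groupby(["id", months])
    }
```

With the example data above, `requirement_met(df['Date'], df['req_met'])` gives 1, because the window ending on 2018-01-13 contains 7 'TRUE' days.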
Thanks in advance for any assistance.