Pandas how to get rows with consecutive dates and sales more than 1000?

Question

I have a data frame called df:

Date        Sales
01/01/2020    812
02/01/2020    981
03/01/2020    923
04/01/2020   1033
05/01/2020    988
...           ...

How can I get the first occurrence of 7 consecutive days with sales above 1000?

This is what I am doing to find the rows where sales is above 1000:

In  [221]:  df.loc[df["sales"] >= 1000]
Out [221]: 
Date        Sales
04/01/2020   1033
08/01/2020   1008
09/01/2020   1091
17/01/2020   1080
18/01/2020   1121
19/01/2020   1098
...           ...

Cainã Max Couto-Silva · Accepted Answer · 2020-12-04T17:56:30.673

5

You can assign a unique identifier per consecutive days, group by them, and return the first value per group (with a previous filter of values > 1000):

df = df.query('Sales > 1000').copy()
df['grp_date'] = df.Date.diff().dt.days.fillna(1).ne(1).cumsum()
df.groupby('grp_date').head(7).reset_index(drop=True)

where you can change the value of head parameter to the first n rows from consecutive days.

Note: you may need to use pd.to_datetime(df.Date, format='%d/%m/%Y') to convert dates from strings to pandas datetime, and sort them.

edited Dec 04 '20 at 17:56

answered Dec 04 '20 at 16:11

Cainã Max Couto-Silva

4,839
1
11
35

Nevermind, it's working. Updated to get all columns in output, and drop the temporary one. Best! – Cainã Max Couto-Silva Dec 04 '20 at 17:10

score 0 · Answer 2 · answered Dec 04 '20 at 17:34

0

Couldn't you just sort by date and grab head 7?

df = df.sort_values('Date')
df.loc[df["sales"] >= 1000].head(7)

If you need the original maybe make a copy first

answered Dec 04 '20 at 17:34

fthomson

773
3
9

Pandas how to get rows with consecutive dates and sales more than 1000?

2 Answers2

Linked