Pandas: Find Users with 2 Interactions In 4 Days

Question

I have a Pandas DataFrame, df, with the following columns:

user_id, date_of_visit

I want to find all user_ids who visited at least twice in any 4 day window. So if I had:

user_id, date_of_visit

1, 1/4/15

1, 1/6/15

2, 1/4/15

2, 1/12/15

2, 1/23/15

Then my function would return [1]

I could just go through each row, but I'm wondering if there's a better way to leverage Pandas. Maybe something with:

df.groupby('user_id')...?

Thanks!

Can you make your example [reproducible](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)? — cmaher, Apr 24 '18 at 16:27

score 1 · Accepted Answer · answered Apr 24 '18 at 16:31

1

You can do with rolling within filter

#df.date_of_visit=pd.to_datetime(df.date_of_visit)

df.groupby('user_id').filter(lambda x : x.set_index('date_of_visit').rolling('4d').count().gt(1).any()).user_id.unique()
Out[525]: array([1], dtype=int64)

answered Apr 24 '18 at 16:31

BENY

317,841
20
164
234

Thanks for the help. I used your approach but get `TypeError: filter function returned a Series, but expected a scalar bool` . Happy to provide a reproducible example if helpful – anon_swe Apr 24 '18 at 19:35

Pandas: Find Users with 2 Interactions In 4 Days

1 Answers1