Pandas: find repeated problems for the same customer (a problem counts as repeated only if it recurs within 30 days with the same code)
- Does this chain? I.e., is March 1, March 15, April 8 all a repeat, or do we flag the first March 1, label March 15 as a repeat of that event, but then April 8 (which is > 30 days from March 1) is now its own event? – ALollz Jan 15 '21 at 00:17
- **[No Screenshots](https://meta.stackoverflow.com/questions/303812/)** of code or data. Always provide a [mre] with code, **data, errors, current output, and expected output**, as **[formatted text](https://stackoverflow.com/help/formatting)**. If relevant, plot images are okay. Please see [How to ask a good question](https://stackoverflow.com/help/how-to-ask). Provide data with [How to provide a reproducible copy of your DataFrame using `df.head(15).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246), then **[edit] your question**, and paste the clipboard into a code block. – Trenton McKinney Jan 15 '21 at 00:35
1 Answer
Let's group by customer ID and problem code and take the difference between consecutive dates within each group. Convert the time delta to days and check whether its absolute value is less than or equal to 30.
However, pay close attention to the comments posted above.
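```python
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])  # coerce Date to datetime
# Keep rows whose gap to the previous row in the same (CT_ID, Problem_code) group is at most 30 days
df[abs(df.groupby(['CT_ID', 'Problem_code'])['Date'].diff().dt.days).le(30)]
```

```
  CT_ID Problem_code                Date
3   XO1       code_1 2021-01-03 11:35:00
5   XO3       code_4 2020-09-20 09:35:00
8   XO3       code_4 2020-10-10 11:35:00
```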
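On the chaining question raised in the comments, here is a rough sketch of how the two readings differ. The sample data is made up (it reuses the March 1 / March 15 / April 8 dates from the comment), the column names are taken from the code above, and `flag_anchored`, `repeat_chained` and `repeat_anchored` are hypothetical names introduced only for illustration. It also assumes the rows are sorted by date within each group, which the sketch does explicitly.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer's code.
df = pd.DataFrame({
    "CT_ID": ["XO1", "XO1", "XO1", "XO2"],
    "Problem_code": ["code_1", "code_1", "code_1", "code_1"],
    "Date": pd.to_datetime(
        ["2021-03-01", "2021-03-15", "2021-04-08", "2021-03-10"]
    ),
})

keys = ["CT_ID", "Problem_code"]
window = pd.Timedelta(days=30)

# Sort so that diff() really compares each row with the previous event.
df = df.sort_values(keys + ["Date"]).reset_index(drop=True)

# Chained reading: a row is a repeat if it is within 30 days of the
# previous row in the same group (the first row's NaT compares as False).
df["repeat_chained"] = df.groupby(keys)["Date"].diff().le(window)


def flag_anchored(dates, window=window):
    """Flag dates within `window` of the current anchor event.

    A date outside the window is not a repeat and becomes the new anchor.
    """
    flags, anchor = [], None
    for d in dates:
        is_repeat = anchor is not None and d - anchor <= window
        if not is_repeat:
            anchor = d
        flags.append(is_repeat)
    return pd.Series(flags, index=dates.index)


# Anchored reading: compare each row with the first event of its window.
parts = [flag_anchored(g) for _, g in df.groupby(keys)["Date"]]
df["repeat_anchored"] = pd.concat(parts)

print(df)
```

With this data the chained column flags both March 15 and April 8 for `XO1`, while the anchored column flags only March 15, because April 8 is more than 30 days after March 1 and starts a new window. Which one is right depends on how the asker defines a repeat.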

wwnde