-2

Pandas: I need to find repeated problems for the same customer. Note: a problem is considered repeated only if it occurred again within 30 days with the same code.


  • Does this chain? I.e. are March 1, March 15, April 8 all repeats, or do we flag the first March 1, label March 15 as a repeat of that event, but then April 8 (which is > 30 days from March 1) is now its own event? – ALollz Jan 15 '21 at 00:17
  • 2
    **[No Screenshots](https://meta.stackoverflow.com/questions/303812/)** of code or data. Always provide a [mre] with code, **data, errors, current output, and expected output**, as **[formatted text](https://stackoverflow.com/help/formatting)**. If relevant, plot images are okay. Please see [How to ask a good question](https://stackoverflow.com/help/how-to-ask). Provide data with [How to provide a reproducible copy of your DataFrame using `df.head(15).to_clipboard(sep=',')`](https://stackoverflow.com/questions/52413246), then **[edit] your question**, and paste the clipboard into a code block. – Trenton McKinney Jan 15 '21 at 00:35

1 Answer

1

Let's group by Customer ID and Problem code and take the consecutive differences in dates within each group. Convert each time delta into days and check whether its absolute value is less than or equal to 30.

However, pay close attention to the comments posted above.

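import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])  # coerce Date to datetime

# Days between consecutive rows within each (customer, problem code) group
days_between = df.groupby(['CT_ID', 'Problem_code'])['Date'].diff().dt.days
df[days_between.abs().le(30)]  # keep rows at most 30 days from the previous one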


  CT_ID Problem_code                Date
3   XO1       code_1 2021-01-03 11:35:00
5   XO3       code_4 2020-09-20 09:35:00
8   XO3       code_4 2020-10-10 11:35:00
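To make the chaining question from the comments concrete, here is a minimal sketch on hypothetical data (the rows below are invented for illustration; only the column names come from the answer above). It sorts by date first and compares against a `pd.Timedelta` rather than whole days, which is a slight variation on the one-liner above. `repeat_chained` compares each row with the previous row in its group, while `repeat_anchored` is one possible reading of the alternative, where a row more than 30 days after the start of the current event begins a new event.

```python
import pandas as pd

# Hypothetical sample data using the dates from the comment above.
df = pd.DataFrame({
    "CT_ID": ["XO1", "XO1", "XO1", "XO2"],
    "Problem_code": ["code_1", "code_1", "code_1", "code_1"],
    "Date": ["2021-03-01", "2021-03-15", "2021-04-08", "2021-03-05"],
})
df["Date"] = pd.to_datetime(df["Date"])

keys = ["CT_ID", "Problem_code"]
window = pd.Timedelta(days=30)

# Sort so that "previous row" means "previous occurrence in time".
df = df.sort_values(keys + ["Date"]).reset_index(drop=True)

# Chained reading: repeat if within 30 days of the previous occurrence.
# The first row of each group has a NaT difference, which compares as False.
df["repeat_chained"] = df.groupby(keys)["Date"].diff().le(window)

# Anchored reading (an assumption about what the asker may want):
# repeat if within 30 days of the row that started the current event.
repeat_anchored = pd.Series(False, index=df.index)
for _, dates in df.groupby(keys)["Date"]:
    anchor = None
    for idx, d in dates.items():
        if anchor is not None and d - anchor <= window:
            repeat_anchored.loc[idx] = True  # still inside the current event
        else:
            anchor = d  # this row starts a new event
df["repeat_anchored"] = repeat_anchored

print(df)
```

With these dates the two readings disagree only on April 8: it is 24 days after March 15 (a repeat when chained) but 38 days after March 1 (a new event when anchored). Which one is right depends on how the asker answers the comment.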
wwnde