I have a table of sales; Account,product code,value,pcs etc.
I want to quickly calculate the number of maximum consecutive months each unique account number has ordered once or more.
I do have a method but its slow as it iterates through each row, is there a better method?
Here is my data:
first_purchase_df.head()
+-----------+------------+----------+------------+--------+---------+------------+-----------------------------+------------+------+-------+---------+----------+--------+-------------------+----------+
| SA_DACCNT | SA_TRDATE | SA_TRREF | SA_TRVALUE | SA_QTY | SA_COST | SA_PRODUCT | SA_DESC | Month_year | Year | Month | CN_REF | CN_CATAG | Margin | FirstPurchaseDate | UserType |
+-----------+------------+----------+------------+--------+---------+------------+-----------------------------+------------+------+-------+---------+----------+--------+-------------------+----------+
| GO63 | 2009-01-02 | 11587 | 0.980 | 1.000 | 0.580 | R613S/6 | BLACK BUFFALO GRAIN R613S/6 | 01-2009 | 2009 | 1 | R613S/6 | ZZZZ | 0.400 | 01-2009 | New |
| GO63 | 2009-01-02 | 11587 | 2.500 | 1.000 | 0.000 | POST3 | POSTAGE | 01-2009 | 2009 | 1 | POST3 | POST | 2.500 | 01-2009 | New |
| GO63 | 2009-01-02 | 11587 | 2.500 | 1.000 | 0.000 | POST3 | POSTAGE | 01-2009 | 2009 | 1 | POST3 | POST | 2.500 | 01-2009 | New |
+-----------+------------+----------+------------+--------+---------+------------+-----------------------------+------------+------+-------+---------+----------+--------+-------------------+----------+
I have groupped by month-yeardate:
retention = first_purchase_df.groupby(['Month_year','SA_DACCNT'])['Margin'].sum().astype(int).reset_index()
retention.head()
+------------+-----------+--------+
| Month_year | SA_DACCNT | Margin |
+------------+-----------+--------+
| 01-2009 | ABB1 | 199 |
| 01-2009 | ABB3 | 75 |
| 01-2009 | ACK1 | 49 |
| 01-2009 | ACR2 | 79 |
| 01-2009 | ADO1 | 210 |
+------------+-----------+--------+
I then used cross tab to put into months with a binary classifier if they have ordered or not.
+------------+-----------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
| Month_year | SA_DACCNT | 01-2009 | 01-2010 | 01-2011 | 01-2012 | 01-2013 | 01-2014 | 01-2015 | 01-2016 | 01-2017 | 01-2018 | 01-2019 | 02-2009 | 02-2010 | 02-2011 | 02-2012 | 02-2013 | 02-2014 | 02-2015 | 02-2016 | 02-2017 | 02-2018 | 02-2019 | 03-2009 | 03-2010 | 03-2011 | 03-2012 | 03-2013 | 03-2014 | 03-2015 | 03-2016 | 03-2017 | 03-2018 | 03-2019 | 04-2009 | 04-2010 | 04-2011 | 04-2012 | 04-2013 | 04-2014 | 04-2015 | 04-2016 | 04-2017 | 04-2018 | 04-2019 | 05-2009 | 05-2010 | 05-2011 | 05-2012 | 05-2013 | 05-2014 | 05-2015 | 05-2016 | 05-2017 | 05-2018 | 05-2019 | 06-2009 | 06-2010 | 06-2011 | 06-2012 | 06-2013 | 06-2014 | 06-2015 | 06-2016 | 06-2017 | 06-2018 | 06-2019 | 07-2009 | 07-2010 | 07-2011 | 07-2012 | 07-2013 | 07-2014 | 07-2015 | 07-2016 | 07-2017 | 07-2018 | 07-2019 | 08-2009 | 08-2010 | 08-2011 | 08-2012 | 08-2013 | 08-2014 | 08-2015 | 08-2016 | 08-2017 | 08-2018 | 08-2019 | 09-2009 | 09-2010 | 09-2011 | 09-2012 | 09-2013 | 09-2014 | 09-2015 | 09-2016 | 09-2017 | 09-2018 | 09-2019 | 10-2009 | 10-2010 | 10-2011 | 10-2012 | 10-2013 | 10-2014 | 10-2015 | 10-2016 | 10-2017 | 10-2018 | 10-2019 | 11-2009 | 11-2010 | 11-2011 | 11-2012 | 11-2013 | 11-2014 | 11-2015 | 11-2016 | 11-2017 | 11-2018 | 12-2009 | 12-2010 | 12-2011 | 12-2012 | 12-2013 | 12-2014 | 12-2015 | 12-2016 | 12-2017 | 12-2018 |
+------------+-----------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
| CSU1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 1171 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | |
| 1183 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| 1184 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | |
| 1740 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | |
| 1773 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
+------------+-----------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
Then finally wrote this function to count the months with consecutive ordering.
def count_consec(vert,columns):
max_val = 0
cur_max_val = 0
total_val = 0
current_val = 0
for x in columns:
if vert[x] == 1 and current_val == 1:
current_val = 1
cur_max_val += 1
total_val += 1
if cur_max_val > max_val:
max_val = cur_max_val
elif vert[x] == 1:
cur_max_val +=1
current_val = 1
total_val += 1
else:
current_val = 0
cur_mav_val = 0
return max_val+1
df_retention_count = pd.crosstab(retention['SA_DACCNT'], retention['Month_year']).reset_index()
columns = df_retention_count.columns
df_retention_count["max_month"] = df_retention_count.apply(lambda x: count_consec(x,columns),axis=1)