1

I'm trying to add a "Monthly orders" column which calculates how many transactions a customer with specific id had within a specific CohortDate.

Basically, it's a COUNTIFS function where RANGES are all IDS and all CohortDates equal to ID and CohortDate of any given row.

Any help is much appreciated.

import pandas as pd
import numpy as np

df = pd.DataFrame({'order_id': [75054,75057,75059,75061,75066],
                   'customer_id': [101692,101694,101734,101692,101694],
                   'CohortDate': ['2016-05','2016-05','2016-05','2016-05','2016-06'] 
                  })

The result I would aim to get is the following:

order_id    customer_id    CohortDate    Monthly_orders

75054    101692    '2016-05'    2

75057    101694    '2016-05'    1

75059    101734    '2016-05'    1

75061    101692    '2016-05'    2

75066    101694    '2016-06'    1
tohach
  • 63
  • 6
  • 1
    Please have a look at [*How to make good reproducible pandas examples*](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Then edit the question and provide a sample of your data and the expected output (in text format). – Alexandre B. Aug 06 '19 at 21:45
  • what's the specific logic? – Umar.H Aug 06 '19 at 22:06
  • In excel the logic for the second row would be as followed =COUNTIFS(C:C;C2;B:B;B2) Column C is Cohort Date Column B is CustomerID – tohach Aug 06 '19 at 22:12
  • 1
    your output doesn't match the logic, row 5 is 2 on my excel, the above would be `df.groupby(['customer_id','CohortDate'])['customer_id'].transform('count')` – Umar.H Aug 06 '19 at 22:21
  • my bad, corrected it. thank you for the answer! – tohach Aug 06 '19 at 22:26

1 Answers1

2

to group by certain variables, we can use the transform which applies the groupby accross a series rather than returning a new dataframe.

df.groupby(['customer_id','CohortDate'])['customer_id'].transform('count')

this returns the count across the original data frame.

order_id    customer_id CohortDate  count
0   75054   101692  2016-05 2
1   75057   101694  2016-05 1
2   75059   101734  2016-05 1
3   75061   101692  2016-05 2
4   75066   101694  2016-06 1
Umar.H
  • 22,559
  • 7
  • 39
  • 74
tohach
  • 63
  • 6