0

I want to request assistance in deriving the number of transactions for a specific customer for the month at the point of transaction. The objective is to iterate over a dataframe and add a column that shows the number of transactions for a specific customer.

Here is the sample Dataframe I used:

# Load the sample DataFrame
df1 = pd.DataFrame({'Customer_No': ['1861', 
                                       '1860', 
                                       '1861', 
                                       '1861', 
                                       '1861', 
                                       '1860', 
                                       '1861', 
                                       '1861', 
                                       '1860'],
                   'Start_Month': [  '2021-11-01',
                                    '2021-11-01',
                                    '2021-11-01',
                                    '2021-11-01', 
                                    '2021-11-01', 
                                    '2021-12-01', 
                                    '2021-12-01', 
                                    '2021-12-01', 
                                    '2021-11-01'],
                   'Datetime_Created_On': ['2021-11-03 07:27:46.397', 
                                    '2021-11-02 13:20:46.397', 
                                    '2021-11-05 14:15:46.397', 
                                    '2021-11-10 06:22:46.397', 
                                    '2021-11-25 20:56:46.397', 
                                    '2021-12-11 15:34:46.397', 
                                    '2021-12-19 07:43:46.397', 
                                    '2021-12-29 23:21:46.397', 
                                    '2021-11-30 03:14:46.397']
                  })

Here is my attempt to try and achieve my objective:

def get_month_trx_count(df1, start_date, end_date, customer):

    mask = (df1['Datetime_Created_On'] >= start_date) & (df1['Datetime_Created_On'] <= end_date)
    df1 = df1.loc[mask]

    # Group transactions by a unique identifier
    grouped = df1.groupby('Customer_No')

    # Count the number of transactions in each group
    trx_count = grouped[grouped['Customer_No']==customer].count()

    return trx_count


# Filter the DataFrame to only include transactions between two dates
start_date = "2021-01-01"
end_date = "2021-12-31"
customer = '1861'

# Call the function to return count
df['Customer_TRX_Month'] = get_month_trx_count(df1, start_date, end_date, customer)

The desired output should look like this: enter image description here

Any assistance to get this achieved will be appreciated.

N K
  • 147
  • 7
  • 1
    `df1['Customer_TRX_Month'] = df1.sort_values(by='Datetime_Created_On').groupby(['Customer_No', 'Start_Month']).cumcount().add(1)` – mozway Feb 03 '23 at 07:31

0 Answers0