I am working with the lifetimes library to build a customer lifetime value model. The library comes with a method called conditional_expected_number_of_purchases_up_to_time that lets you predict the number of purchases for each customer in your data set over a specified time period.
Here is the dataframe I am working with:
    import pandas as pd

    df = pd.DataFrame([['aaa@email.com', 6.0, 112.0, 139.0], ['bbb@email.com', 11.0, 130.0, 130.0]], columns=['email', 'frequency', 'recency', 'T'])
Each row in the dataframe represents an individual customer. To predict the number of expected purchases for each customer over the next 4 periods, I would execute the following code:
    t = 4
    df['est_purchases'] = mbgf.conditional_expected_number_of_purchases_up_to_time(t, df['frequency'], df['recency'], df['T'])
What I would like to do now is, for each customer (row) in the dataframe, approximate the total number of purchases remaining over the rest of that customer's lifetime. Let's call this quantity Residual Customer Purchases (RCP).
To do this, I have defined two functions. The first calculates the incremental RCP between two consecutive time periods (t-1 and t). The second approximates the total RCP by increasing t step by step until the incremental RCP falls below a specified tolerance level:
    ## Function to calculate incremental RCP
    def RCP(row):
        dif = (mbgf.conditional_expected_number_of_purchases_up_to_time(t,
                   row['frequency'], row['recency'], row['T'])
               - mbgf.conditional_expected_number_of_purchases_up_to_time((t-1),
                   row['frequency'], row['recency'], row['T']))
        return dif

    ## Create column for incremental RCP
    df['m_RCP'] = df.apply(RCP, axis = 1)
    ## Function to approximate total RCP
    def approximate(fn, model, rfm, t=1, eps_tol=1e-6, eps=0, **kwargs):
        eps = 0
        cf = 0
        while True:
            cf += df.apply(fn, axis = 1)
            if(cf - eps < eps_tol):
                break
            eps = cf; t+=1
        return cf

    ## Create column for total RCP
    df['t_RCP'] = df.apply(approximate(RCP, model = mbgf, rfm = df), axis = 1)
The first function is working as expected. But when I try to execute the second function (approximate), I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
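For reference, the same message can be reproduced without the lifetimes library at all. The snippet below is only a minimal illustration with made-up numbers: it shows that using an elementwise comparison on a pandas Series directly in an `if` raises this error, while reducing the comparison with `.all()` or `.any()` yields a single boolean.

```python
import pandas as pd

# Made-up values standing in for per-customer increments
s = pd.Series([0.5, 2.0])

# An elementwise comparison returns a boolean Series, not a single bool,
# so using it directly as an `if` condition raises ValueError.
try:
    if s < 1.0:
        pass
    message = None
except ValueError as exc:
    message = str(exc)

print(message)

# Reducing the boolean Series gives one unambiguous truth value
all_below = bool((s < 1.0).all())   # True only if every element is below 1.0
any_below = bool((s < 1.0).any())   # True if at least one element is below 1.0
```

In the code above, `cf` is built from `df.apply(fn, axis = 1)`, which returns a Series, so `cf - eps < eps_tol` is a comparison of this kind.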
I want the approximate function to iterate the RCP function for a single row until the RCP value no longer increases, and do this one by one for each row in the dataframe.
What am I doing wrong and what should I be doing instead?
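One possible direction, sketched here under stated assumptions rather than as a definitive fix: run the loop inside a function that receives a single row, so that every comparison involves scalars, and pass the function itself (not the result of calling it) to `df.apply`. In this sketch `ToyModel` is a hypothetical stand-in so the example runs without a fitted model, `approximate_total_rcp` is a hypothetical name, and `max_t` is an assumed safety cap in case the increments shrink very slowly. With a real fitted model, the idea would be to pass `mbgf` instead, assuming its method accepts scalar inputs.

```python
import pandas as pd


class ToyModel:
    """Hypothetical stand-in for a fitted model; NOT part of lifetimes."""

    def conditional_expected_number_of_purchases_up_to_time(self, t, frequency, recency, T):
        # Made-up saturating curve whose increments halve with each period
        rate = frequency / T
        return rate * (1 - 0.5 ** t) / (1 - 0.5)


def approximate_total_rcp(row, model, eps_tol=1e-6, max_t=10_000):
    """Sum per-period increments for ONE row until they drop below eps_tol."""
    total = 0.0
    t = 1
    while t <= max_t:  # assumed cap so the loop always terminates
        increment = (
            model.conditional_expected_number_of_purchases_up_to_time(
                t, row['frequency'], row['recency'], row['T'])
            - model.conditional_expected_number_of_purchases_up_to_time(
                t - 1, row['frequency'], row['recency'], row['T'])
        )
        total += increment
        if increment < eps_tol:  # scalar comparison, so no ambiguity
            break
        t += 1
    return total


df = pd.DataFrame(
    [['aaa@email.com', 6.0, 112.0, 139.0], ['bbb@email.com', 11.0, 130.0, 130.0]],
    columns=['email', 'frequency', 'recency', 'T'],
)

model = ToyModel()

# Pass the function object; extra keyword arguments are forwarded to it per row
df['t_RCP'] = df.apply(approximate_total_rcp, axis=1, model=model)
print(df[['email', 't_RCP']])
```

Note that `df.apply(approximate(RCP, model = mbgf, rfm = df), axis = 1)` in the question calls `approximate` once, immediately, and hands its return value to `apply`, whereas the sketch hands over the function and lets `apply` call it once per row.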