
I'm trying to fit a BG/NBD model using the lifetimes library. My analysis follows this example, but with my own data: https://towardsdatascience.com/whats-a-customer-worth-8daf183f8a4f

I get the error shown below, and after reading 50+ Stack Overflow posts without finding an answer, I'd like to ask my own question: what am I doing wrong? :(

Thanks in advance! :)

I tried changing the dtype of every column in my dataframe, but the error stayed the same.

df2 = df

df2.head()

   person_id effective_date  accounting_sales_total
0     219333     2018-08-04                 1049.89
1     333219     2018-12-21                 4738.97
2     344405     2018-07-16                  253.99
3     455599     2017-07-14                 2199.96
4     766665     2017-08-15                 1245.00

from lifetimes.utils import calibration_and_holdout_data

summary_cal_holdout = calibration_and_holdout_data(df2, 'person_id', 'effective_date',
                                        calibration_period_end='2017-12-31',
                                        observation_period_end='2018-12-31')

print(summary_cal_holdout.head())
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-85-cdcb400098dc> in <module>()
      7 summary_cal_holdout = calibration_and_holdout_data(df2, 'person_id', 'effective_date',
      8                                         calibration_period_end='2017-12-31',
----> 9                                         observation_period_end='2018-12-31')
     10 
     11 print(summary_cal_holdout.head())

/usr/local/envs/py3env/lib/python3.5/site-packages/lifetimes/utils.py in calibration_and_holdout_data(transactions, customer_id_col, datetime_col, calibration_period_end, observation_period_end, freq, datetime_format, monetary_value_col)
    122     combined_data.fillna(0, inplace=True)
    123 
--> 124     delta_time = (to_period(observation_period_end) - to_period(calibration_period_end)).n
    125     combined_data["duration_holdout"] = delta_time
    126 

AttributeError: 'int' object has no attribute 'n'
superdell

1 Answer


Your code actually runs fine as written on the sample rows you posted :)

import pandas as pd

# Rebuild the five sample rows from the question
data = {'person_id': [219333, 333219, 344405, 455599, 766665],
        'effective_date': ['2018-08-04', '2018-12-21', '2018-07-16', '2017-07-14', '2017-08-15'],
        'accounting_sales_total': [1049.89, 4738.97, 253.99, 2199.96, 1245.00]}
df2 = pd.DataFrame(data)


from lifetimes.utils import calibration_and_holdout_data

summary_cal_holdout = calibration_and_holdout_data(df2, 'person_id', 'effective_date',
                                        calibration_period_end='2017-12-31',
                                        observation_period_end='2018-12-31')
print(summary_cal_holdout.head())

Returns:

           frequency_cal  recency_cal  T_cal  frequency_holdout  \
person_id                                                         
455599               0.0          0.0  170.0                0.0   
766665               0.0          0.0  138.0                0.0   

           duration_holdout  
person_id                    
455599                  365  
766665                  365  
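The numbers in that output can be checked by hand. Only the two customers whose purchase falls on or before the calibration cutoff show up, and with the default daily frequency T_cal appears to be the number of days from a customer's first purchase to calibration_period_end, while duration_holdout is the number of days between the two period ends. The sketch below uses only the standard library and is not how lifetimes computes these values internally; the helper name days_between is made up for illustration.

```python
from datetime import date


def days_between(start, end):
    """Whole days from ISO date string `start` to ISO date string `end`."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


calibration_end = "2017-12-31"
observation_end = "2018-12-31"

# First (and only) purchase of the two customers seen before the cutoff,
# copied from the sample data in the question.
first_purchase = {455599: "2017-07-14", 766665: "2017-08-15"}

t_cal = {pid: days_between(d, calibration_end) for pid, d in first_purchase.items()}
duration_holdout = days_between(calibration_end, observation_end)

print(t_cal)             # {455599: 170, 766665: 138}
print(duration_holdout)  # 365
```

The line that fails in the traceback is the one producing duration_holdout, so the error occurs before any of your own columns are involved.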

Since the code itself works, your issue is most likely a package version problem. Try upgrading lifetimes:

pip install lifetimes --upgrade
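Before or after upgrading, it may help to confirm which versions are actually installed in the environment your notebook uses. One plausible cause (an assumption, not something the traceback proves) is an older pandas in which subtracting two Period objects returns a plain int instead of an offset object with an .n attribute, which would match the 'int' object has no attribute 'n' message. The sketch below assumes Python 3.8+ for importlib.metadata; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("lifetimes", "pandas"):
    print(name, installed_version(name) or "not installed")

# Optional check, only run when pandas is importable.
try:
    import pandas as pd
except ImportError:
    pd = None

if pd is not None:
    delta = pd.Period("2018-12-31", freq="D") - pd.Period("2017-12-31", freq="D")
    # If this prints "int False", the installed pandas returns a plain integer
    # here, which would explain the AttributeError raised inside lifetimes.
    print(type(delta).__name__, hasattr(delta, "n"))
```

If the last line reports an int, upgrading pandas alongside lifetimes is worth trying.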
prp