I have a dataframe as below

customerid                                   term   age
08a858899538ddb8e015390510b321f0830199897     30     24
18a858959537a097401537a4e316e25f730196361     60     72
a8589c253ace09b0153af6ba58f1f313019822366     45     38

I am creating an entity as below using featuretools

es = es.entity_from_dataframe(entity_id='cust', dataframe=df, index='customerid')

but I get the error:

AssertionError: Index is not unique on dataframe (Entity cust)

even though customerid is supposed to be the unique identifier.

Max Kanter
Ian Okeyo
  • Is it possible that `customerid` is not unique? You can check if that is the case with `len(es) == es.customerid.nunique()` (if it's False, then you have repeated customerids). – dataista Nov 20 '18 at 14:15
  • @JulianPeller is right. You should run that check on the dataframe, not the entityset, though, so the code would be `len(df) == df.customerid.nunique()` – Max Kanter Nov 20 '18 at 14:45
  • Oh, you are right. My mistake. – dataista Nov 20 '18 at 14:48
  • This error also pops up when the dataframe has no ID column in the first place. In that case, the solution is to pass a new, arbitrary index name with `index='...'` together with `make_index=True`. – Uzay Macar Jul 11 '19 at 08:05
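
The check suggested in the comments can be sketched in plain pandas as follows. This is only a minimal illustration: the dataframe below is made-up data with the same columns as in the question (one `customerid` is repeated on purpose), and dropping duplicates is just one possible fix that assumes keeping the first row per customer is acceptable for your data. The index name `cust_index` in the commented-out call is hypothetical.

```python
import pandas as pd

# Made-up data with the question's columns; "a1" is deliberately repeated
df = pd.DataFrame({
    "customerid": ["a1", "b2", "a1"],
    "term": [30, 60, 45],
    "age": [24, 72, 38],
})

# The check from the comments: True only if every customerid occurs exactly once
is_unique = len(df) == df["customerid"].nunique()

# List every row whose customerid occurs more than once
duplicates = df[df["customerid"].duplicated(keep=False)]

# One possible fix (an assumption about the data): keep the first row per customer
deduped = df.drop_duplicates(subset="customerid", keep="first")

print(is_unique)
print(duplicates)
print(len(deduped))

# Alternative from the last comment, if no column can serve as an ID
# (assumes the pre-1.0 featuretools API used in the question):
# es = es.entity_from_dataframe(entity_id='cust', dataframe=df,
#                               index='cust_index', make_index=True)
```

Note that `nunique()` ignores missing values by default, so the check also comes out False if `customerid` contains NaN, even when no non-missing ID is repeated.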
