for i in df.columns:
    if df[i].dtype == 'object' and df[i].nunique() > 20:
        df = df.drop(columns=[i])

The for loop above works as expected, but when I put it inside a function it doesn't work. For example, with data = Churn_Modelling, the loop above drops "Surname", but the function drop_object_nunique(df) below doesn't remove "Surname".

def drop_object_nunique(data):
    for i in data.columns:
        if data[i].dtype == 'object' and data[i].nunique() > 20:
            data = data.drop(columns=[i])

2 Answers

0

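If I understand you correctly, this comes down to how Python passes arguments and to the scope of your DataFrame. The function receives a reference to the same object, but assigning to the parameter name data inside the function only rebinds that local name; the variable in the calling code still refers to the original DataFrame.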

Refer to this answer: https://stackoverflow.com/a/38925257/15017770
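
The difference between rebinding a local name and modifying the object can be seen in a small sketch. The toy DataFrame and the helper names rebind_only and mutate_in_place are made up for illustration and are not from the question.

```python
import pandas as pd

def rebind_only(data):
    # drop() returns a new DataFrame; this only rebinds the local name `data`,
    # so the caller's DataFrame is left unchanged.
    data = data.drop(columns=["b"])

def mutate_in_place(data):
    # inplace=True modifies the object that the caller also refers to.
    data.drop(columns=["b"], inplace=True)

# Hypothetical toy data, not the Churn_Modelling dataset.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

rebind_only(df)
cols_after_rebind = list(df.columns)    # "b" is still present

mutate_in_place(df)
cols_after_mutation = list(df.columns)  # "b" has been removed
```

Returning the new DataFrame and reassigning it at the call site is usually easier to follow than relying on in-place modification.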

0

I'm not that familiar with pandas, but from your code, it appears that drop returns a modified copy rather than changing the object in place. Your function therefore has to return the result, and the caller has to assign it:

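def drop_object_nunique(data):
    for i in data.columns:
        if data[i].dtype == 'object' and data[i].nunique() > 20:
            # drop() returns a new DataFrame; rebind the local name to it
            data = data.drop(columns=[i])
    # hand the result back, since the caller's variable is not updated
    return data

data = drop_object_nunique(data)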
Frank Yellin
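
As a possible variation on the function above, the columns to drop can be collected first and removed in a single call. This is only a sketch: the max_unique parameter and the sample data are assumptions added for illustration, and the columns are created with an explicit object dtype so that the dtype check matches.

```python
import pandas as pd

def drop_object_nunique(data, max_unique=20):
    # Collect object-dtype columns with more than `max_unique` distinct values.
    to_drop = [
        col for col in data.columns
        if data[col].dtype == "object" and data[col].nunique() > max_unique
    ]
    # Return a new DataFrame; the caller decides what to do with it.
    return data.drop(columns=to_drop)

# Hypothetical sample data: 30 distinct surnames, 3 distinct countries.
sample = pd.DataFrame({
    "Surname": pd.Series([f"name_{n}" for n in range(30)], dtype="object"),
    "Geography": pd.Series(["France", "Spain", "Germany"] * 10, dtype="object"),
    "Age": list(range(30)),
})

cleaned = drop_object_nunique(sample)
```

Here cleaned keeps "Geography" and "Age", while sample itself is left untouched.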