1

My question is simple and has two parts:

  • I have two LabelEncoders with exactly the same .classes_ attribute, but when I compare them with ==, they are reported as different. How can I check that they are equal?

  • I also want to check whether two scikit-learn models (already fitted RandomForest instances) are equal. The == check does not seem to work there either.

Venkatachalam
LCMa
  • 1
    You would probably want to subclass `LabelEncoder` and override its `__eq__` method. For how to do this, see https://stackoverflow.com/questions/390250/elegant-ways-to-support-equivalence-equality-in-python-classes – kumarchandresh Nov 23 '20 at 08:37
  • Thanks for your response. But I am also wondering why the base __eq__ of LabelEncoder does not return True when the classes are the same: is some other element taken into account when checking equality? – LCMa Nov 23 '20 at 08:41
  • 1
    scikit-learn's convention is to instantiate a model and then call the fit() and transform() methods on it. So two different model objects are not equal; this is by design. You are free to override this behavior, but it is not suitable for a library. – kumarchandresh Nov 23 '20 at 08:48

2 Answers

1

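If you compare two encoders with ==, it returns False because LabelEncoder does not define __eq__, so Python falls back to comparing object identity, and the two encoders are distinct objects.

You are better off overriding the __eq__ method of the class.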
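As a small illustration of the point above, here is a minimal sketch showing the identity-based default and a direct comparison of the fitted classes_ arrays without any subclassing. The variable names and the toy labels are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy labels, chosen only for illustration.
labels = ["yes", "no", "no", "yes"]

le1 = LabelEncoder().fit(labels)
le2 = LabelEncoder().fit(labels)

# Default == falls back to object identity.
same_object = le1 == le1          # True: same instance
different_objects = le1 == le2    # False: two distinct instances

# Comparing the learned state directly works without overriding __eq__.
same_classes = np.array_equal(le1.classes_, le2.classes_)

print(same_object, different_objects, same_classes)
```

Comparing classes_ directly is enough when you only need a one-off check; overriding __eq__ is more convenient when the encoders are compared in many places.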

illian01
1

As mentioned in the comments, we need to create our own condition for __eq__. Here is my version:

from sklearn.preprocessing import LabelEncoder
import numpy as np
import pandas as pd

target = pd.Series(np.random.choice(['yes', 'no'], (20,)))

class MyLabelEncoder(LabelEncoder):
    def __eq__(self, other):
        # Only compare against other label encoders.
        if not isinstance(other, LabelEncoder):
            return NotImplemented
        # Two fitted encoders are equal when their classes_ arrays match.
        return np.array_equal(other.classes_, self.classes_)

le1 = MyLabelEncoder().fit(target)
le2 = MyLabelEncoder().fit(target)
le1 == le2
# True

For DecisionTree, there is already a solution provided here. You could extend it for RandomForestClassifier.
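One possible way to extend the idea to a fitted RandomForestClassifier is sketched below. It is not the linked solution, only an assumption about what such an extension could look like: two forests are treated as equal when their constructor parameters match and every pair of trees has identical structure arrays. The helper names trees_equal and forests_equal, the TREE_ARRAYS tuple, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Attributes of the fitted low-level tree (estimator.tree_) compared here.
TREE_ARRAYS = ("children_left", "children_right", "feature", "threshold", "value")

def trees_equal(t1, t2):
    """Hypothetical helper: compare two fitted tree_ objects array by array."""
    if t1.node_count != t2.node_count:
        return False
    return all(np.array_equal(getattr(t1, name), getattr(t2, name))
               for name in TREE_ARRAYS)

def forests_equal(rf1, rf2):
    """Hypothetical helper: same parameters and pairwise identical trees."""
    if rf1.get_params() != rf2.get_params():
        return False
    if len(rf1.estimators_) != len(rf2.estimators_):
        return False
    return all(trees_equal(a.tree_, b.tree_)
               for a, b in zip(rf1.estimators_, rf2.estimators_))

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_other, y_other = make_classification(n_samples=200, n_features=5, random_state=1)

rf_a = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
rf_b = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
rf_c = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_other, y_other)

print(forests_equal(rf_a, rf_b))  # same data, same seed
print(forests_equal(rf_a, rf_c))  # same parameters, different data
```

This check is strict: two forests trained with different seeds will generally compare as unequal even if they predict similarly, so it answers "are these the same fitted model?" rather than "do these models behave alike?".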

Venkatachalam