I'm subclassing RandomForestClassifier (from sklearn.ensemble), and I'm trying to print my new estimator:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + RandomForestClassifier.__str__(self) 

This gives foo_my_rf().

I also tried:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + super(RandomForestClassifier, self).__str__() 

with the same result. What I expect is something pretty, like sklearn's default behaviour:

>>> a = RandomForestClassifier()
>>> print a
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
        max_depth=None, max_features='auto', max_leaf_nodes=None,
        min_samples_leaf=1, min_samples_split=2,
        min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
        oob_score=False, random_state=None, verbose=0,
        warm_start=False)
>>>

This is also the result when I use print a.__str__().

What am I missing? Thanks.

Related: How do I change the string representation of a Python class?

ihadanny
  • Evidently the parent class `__str__` implementation is the name of the class. You are calling it correctly. – jonrsharpe Apr 26 '16 at 08:29
  • @jonrsharpe - oops, edited the question to make clear what I am looking for. – ihadanny Apr 26 '16 at 08:36
  • Have you tried looking at the `__repr__` instead? – jonrsharpe Apr 26 '16 at 08:37
  • @jonrsharpe - yes, same result. I must be missing something with objects vs. classes, in python I always do :) – ihadanny Apr 26 '16 at 08:40
  • Not necessarily - does a base `RandomForestClassifier` give the result you're looking for when `str`/`repr`d? If not, you'll have to write it all yourself. – jonrsharpe Apr 26 '16 at 08:41
  • What does `str` produce if you *don't* override `__str__`, and what is the expected output you want to achieve? And what's the significance of the added code with the massive amount of parameters? – 5gon12eder Apr 26 '16 at 08:43
  • @5gon12eder - if I don't override, it's just `my_rf()`. If I use the `__str__` of the base `RandomForestClassifier` I get the nice representation with all the parameters, which is what I want – ihadanny Apr 26 '16 at 08:49
  • I see, I didn't realize the code you were showing was supposed to be *output*. I have edited your post to make this clearer. – 5gon12eder Apr 26 '16 at 08:59
  • Shouldn't it be `super(my_rf, self).__str__`? By specifying `super(RandomForestClassifier, self)`, you are effectively **skipping** `RandomForestClassifier`'s implementation of `__str__`. – user4815162342 Apr 26 '16 at 09:13

1 Answer

In RandomForestClassifier, both __repr__ and __str__ look up the class name of the instance they are called on (self), so calling the parent's implementation from my_rf still prints my_rf. To get the superclass name in the output, you should reference it directly.
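To see why calling the parent's implementation does not change the printed name, here is a minimal sketch that needs no scikit-learn. `ToyEstimator`, `MyToy` and `MyToyFixed` are hypothetical stand-ins written for illustration only, and the sketch assumes Python 3; it is not how sklearn builds its representation, it only imitates the "name comes from the instance's class" behaviour.

```python
class ToyEstimator(object):
    """Hypothetical stand-in for an sklearn-style base class (not sklearn code)."""

    def __init__(self, depth=3):
        self.depth = depth

    def __repr__(self):
        # The name is taken from the instance's class, not from the class
        # that happens to define __repr__.
        return "%s(depth=%r)" % (type(self).__name__, self.depth)


class MyToy(ToyEstimator):
    def __str__(self):
        # Calling the parent implementation explicitly still reports "MyToy",
        # because self is a MyToy instance.
        return "foo_" + ToyEstimator.__repr__(self)


class MyToyFixed(ToyEstimator):
    def __str__(self):
        # Keep the parameter list, but swap in the parent's name for
        # everything before the first "(".
        params = ToyEstimator.__repr__(self).split("(", 1)[1]
        return "foo_" + ToyEstimator.__name__ + "(" + params


plain = str(MyToy())
fixed = str(MyToyFixed())
print(plain)
print(fixed)
```

The second variant uses the same split-on-the-first-parenthesis trick as the answer; it relies on the representation always having the form `Name(...)`.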

Update: This is how you can get your desired output, though I don't see why you would want it. There is a reason why RandomForestClassifier's __str__ and __repr__ return the actual name of the class: that way you can eval the string to restore the object. Anyway,

In [1]: from sklearn.ensemble import RandomForestClassifier
In [2]: class my_rf(RandomForestClassifier):
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]

In [3]: forest = my_rf()
In [4]: print forest
foo_RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None,
   max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,
   min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10,
   n_jobs=1, oob_score=False, random_state=None, verbose=0,
   warm_start=False)

Update 2: You get no parameters when you override __init__ because the superclass's __str__ and __repr__ find the parameters by inspecting the signature of __init__. If your __init__ declares no named parameters, there is nothing for them to print. You can see this clearly by running this code:

In [5]: class my_rf(RandomForestClassifier):
    def __init__(self, *args, **kwargs):
        RandomForestClassifier.__init__(self, *args, **kwargs)
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]
In [6]: forest = my_rf()
In [7]: print forest
...
RuntimeError: scikit-learn estimators should always specify their parameters in the signature of their __init__ (no varargs). <class '__main__.my_rf'> with constructor (<self>, *args, **kwargs) doesn't  follow this convention.
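The signature-based lookup can also be imitated without scikit-learn. The sketch below is a simplified, hypothetical model (`ToyBase`, `Forest` and the three subclasses are invented for illustration, and it assumes Python 3 for `inspect.signature`); it is not sklearn's actual code. It shows the three cases from this thread: an `__init__` with an empty signature prints no parameters, an `__init__` that declares its parameters prints them, and a varargs `__init__` is rejected.

```python
import inspect


class ToyBase(object):
    """Hypothetical simplification of signature-based parameter discovery."""

    @classmethod
    def _param_names(cls):
        # Read parameter names from the signature of the class's own __init__.
        names = []
        for name, p in inspect.signature(cls.__init__).parameters.items():
            if name == "self":
                continue
            if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
                raise RuntimeError(
                    "parameters must be named explicitly in __init__ (no varargs)"
                )
            names.append(name)
        return sorted(names)

    def __repr__(self):
        args = ", ".join(
            "%s=%r" % (n, getattr(self, n)) for n in self._param_names()
        )
        return "%s(%s)" % (type(self).__name__, args)


class Forest(ToyBase):
    def __init__(self, n_estimators=10, n_jobs=1):
        self.n_estimators = n_estimators
        self.n_jobs = n_jobs


class HiddenParams(Forest):
    # Empty signature: the base class has nothing to discover.
    def __init__(self):
        Forest.__init__(self, n_estimators=25, n_jobs=12)


class ExplicitParams(Forest):
    # New defaults, still declared by name in the signature.
    def __init__(self, n_estimators=25, n_jobs=12):
        Forest.__init__(self, n_estimators=n_estimators, n_jobs=n_jobs)


class VarArgs(Forest):
    def __init__(self, *args, **kwargs):
        Forest.__init__(self, *args, **kwargs)


hidden = repr(HiddenParams())
explicit = repr(ExplicitParams())
try:
    repr(VarArgs())
    varargs_error = None
except RuntimeError as exc:
    varargs_error = str(exc)

print(hidden)
print(explicit)
print(varargs_error)
```

If the same idea carries over to a real estimator subclass, the practical consequence is to change defaults by redeclaring the parameters in the subclass's `__init__` signature rather than hard-coding them in the body.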
Eli Korvigo
  • While this is not what the OP has asked for, it might be what they actually want. – 5gon12eder Apr 26 '16 at 08:33
  • oops, edited the question to make clear what I am looking for. – ihadanny Apr 26 '16 at 08:36
  • @EliKorvigo, sorry - this gives `foo_RandomForestClassifier()` and NOT the correct representation with all the estimator's parameters. Perhaps this is something specific to `sklearn`... – ihadanny Apr 26 '16 at 08:52
  • @ihadanny It clearly does give you the correct representation with all the estimator's parameters as shown in my example. – Eli Korvigo Apr 26 '16 at 08:54
  • @EliKorvigo - you are right. In my case it does not work because I did another thing: I overrode `__init__` as well: `def __init__(self): RandomForestClassifier.__init__(self, n_estimators=25, n_jobs=12, oob_score=False, max_features='sqrt', min_samples_leaf=1)` – ihadanny Apr 26 '16 at 08:58
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/110229/discussion-between-ihadanny-and-eli-korvigo). – ihadanny Apr 26 '16 at 08:59