
I saw both transformer and estimator were mentioned in the sklearn documentation.

Is there any difference between these two words?

Son

3 Answers


The basic difference is that a:

  • Transformer transforms the input data (X) in some way.
  • Estimator predicts a new value (or values) (y) from the input data (X).

Both the Transformer and Estimator should have a fit() method which can be used to train them (they learn some characteristics of the data). The signature is:

fit(X, y)

fit() stores the learnt parameters inside the object and returns the estimator itself (self), which is why calls like model.fit(X, y).predict(X) can be chained.

Here X represents the samples (feature vectors) and y is the target vector (which may have single or multiple values per corresponding sample in X). Note that y can be optional in some transformers where it's not needed, but it's mandatory for most estimators (supervised estimators). Look at StandardScaler for example: it only needs the initial data X to find the mean and std of the data (it learns the characteristics of X; y is not needed).

Each Transformer should have a transform(X) method which, like fit(), takes the input X and returns a new, transformed version of X (which generally has the same number of samples but may or may not have the same number of features).
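As a sketch of that fit/transform pattern (assuming scikit-learn and NumPy are installed; the data here is made up):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
scaler.fit(X)                 # learns the per-feature mean and std; returns the scaler itself
print(scaler.mean_)           # [ 2. 20.]

X_scaled = scaler.transform(X)
print(X_scaled.shape)         # (3, 2) -- same number of samples (and here, of features)
```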

On the other hand, an Estimator should have a predict(X) method which outputs the predicted value of y for the given X.
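And the fit/predict pattern for a supervised estimator (a toy example with made-up data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])         # y is mandatory for a supervised estimator

clf = LogisticRegression()
clf.fit(X, y)                      # learns the decision boundary from (X, y)
print(clf.predict([[2.5]]))        # [1] -- predicted y for a new sample
```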

There are some classes in scikit-learn which implement both transform() and predict(), like KMeans; in that case carefully reading the documentation should resolve your doubts.
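For example, KMeans exposes both methods (a small sketch with made-up points; the exact cluster labels depend on initialisation):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.0, 0.0], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(km.predict(X))           # a cluster label for each sample
print(km.transform(X).shape)   # (4, 2): distance from each sample to each of the 2 centers
```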

Vivek Kumar
  • Your answer is helpful. But I still have a question: is estimators = transformers + predictors? – blacksheep Nov 06 '19 at 06:03
  • @blacksheep What do you mean by predictors? – Vivek Kumar Nov 06 '19 at 07:53
  • estimators that are capable of making predictions – blacksheep Nov 06 '19 at 13:02
  • so is estimators = transformers + predictors? buddy? – blacksheep Nov 15 '19 at 06:02
  • @VivekKumar Very intuitive description, thanks for your answer! Is there any concrete design decision based on which some classes have both these methods, as you mention? Do these classes have a specific name because they're a mix of transformer and estimator? Or do you think it is a bit vague to have that design decision? – Rajdeep Biswas Jun 09 '20 at 17:20
  • @RajdeepBiswas, no, there is no specific name for them. It's not vague. The reason is that the particular class can also be used to transform data and also be used to predict/estimate a target based on the data. For example, `KMeans` can transform the data into a cluster-distance matrix (euclidean distance of each point in X to each cluster center), and also be used to assign a cluster to each point in X (clustering). – Vivek Kumar Jun 09 '20 at 17:41
  • @RajdeepBiswas If you want to identify all such classes where you have transform() as well as predict(), see [my answer here](https://stackoverflow.com/a/41853264/3374996). You would need to additionally filter for `TransformerMixin` here and change `ClassifierMixin` to any of (`ClassifierMixin`, `RegressorMixin`, `ClusterMixin`) – Vivek Kumar Jun 09 '20 at 17:48
  • @VivekKumar makes perfect sense. Could you provide some more examples of such classes? I would like to dive into the documentation to learn for myself how each of their methods are different. EDIT: I finished typing this comment JUST as you posted your last comment, so thanks, I have my answer :) – Rajdeep Biswas Jun 09 '20 at 17:49
  • @RajdeepBiswas Let me know if you have any more questions. More preferably, add a question with appropriate details – Vivek Kumar Jun 09 '20 at 17:49
  • @VivekKumar I do have another question, Vivek. Since my account isn't permitted to post questions so I will just ask you here. [Here](https://scikit-learn.org/stable/data_transforms.html), transformers are listed as ones that are JUST used for data preprocessing. But to my understanding, anything that has a transform() method should be a transformer too, right? So why are things like KMeans and PLSRegression missing from that list? Note: I did not find any separate page listing transformers that belong to training class(es). – Rajdeep Biswas Jun 09 '20 at 17:57
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215609/discussion-between-rajdeep-biswas-and-vivek-kumar). – Rajdeep Biswas Jun 09 '20 at 18:04
  • 2
    @VivekKumar .. I just answered the question after exploring sklearn code. ou are almost correct. But, not entirely – rawwar Nov 29 '20 at 16:39
  • @InAFlash Yes, I agree completely. +1 for answering from scikit classes point of view. In my answer I was not going for API or classes but from a general discussion present in the scikit documentation. – Vivek Kumar Nov 30 '20 at 16:13

A Transformer is a type of Estimator that implements a transform method.

Let me support that statement with examples I have come across in sklearn implementation.

  1. Class sklearn.preprocessing.FunctionTransformer:

This inherits from two other classes, TransformerMixin and BaseEstimator.

  2. Class sklearn.preprocessing.PowerTransformer:

This also inherits from TransformerMixin and BaseEstimator.
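Those inheritance claims can be checked directly (a quick sketch, assuming scikit-learn is installed):

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import FunctionTransformer, PowerTransformer

print(issubclass(FunctionTransformer, TransformerMixin))  # True
print(issubclass(FunctionTransformer, BaseEstimator))     # True
print(issubclass(PowerTransformer, TransformerMixin))     # True
print(issubclass(PowerTransformer, BaseEstimator))        # True
```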

From what I understand, Estimators just take data, do some processing, and store data based on the logic implemented in their fit method.

Note: Estimators aren't used to predict values directly; the base Estimator class doesn't even have a predict method.

Before I give more explanation to the above statement, let me tell you about Mixin Classes.

Mixin Class: These are classes that implement the mix-in design pattern, which Wikipedia explains well. To summarise, these are classes containing methods that can be reused in many different classes: you write them once and simply inherit them wherever needed (a form of composition).

In sklearn there are many mixin classes, to name a few: ClassifierMixin, RegressorMixin, TransformerMixin.

Here, TransformerMixin is the class that's inherited by every Transformer used in sklearn. TransformerMixin provides essentially one method that is reusable in every transformer, and that is fit_transform.

All transformers inherit two classes: BaseEstimator (which provides get_params and set_params) and TransformerMixin (which provides fit_transform). And each transformer defines its own fit and transform methods based on its functionality.
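To illustrate (MeanCenterer is a made-up class name for this answer; only fit and transform are written by hand):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MeanCenterer(BaseEstimator, TransformerMixin):
    """Toy transformer: subtracts the per-feature mean learnt during fit."""

    def fit(self, X, y=None):              # y is optional, as in most transformers
        self.mean_ = np.asarray(X).mean(axis=0)
        return self                        # fit conventionally returns self

    def transform(self, X):
        return np.asarray(X) - self.mean_

X = np.array([[1.0, 4.0],
              [3.0, 8.0]])
# fit_transform is not defined above -- it comes from TransformerMixin
print(MeanCenterer().fit_transform(X))     # [[-1. -2.]
                                           #  [ 1.  2.]]
```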

I guess that gives an answer to your question. Now, let me come back to the statement I made regarding Estimators and prediction.

Every model class has its own predict method that does the prediction.

Consider LinearRegression, KNeighborsClassifier, or any other model class. They all have a predict method declared in them. This is what is used for prediction, not the base Estimator class.
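A quick duck-typing check of that claim (assuming scikit-learn is installed):

```python
from sklearn.base import BaseEstimator
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

print(hasattr(LinearRegression, "predict"))  # True  -- a model class
print(hasattr(StandardScaler, "predict"))    # False -- a transformer
print(hasattr(BaseEstimator, "predict"))     # False -- the base class defines no predict
```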

rawwar

The sklearn usage is perhaps a little unintuitive, but "estimator" doesn't mean anything very specific: basically everything is an estimator.

From the sklearn glossary:

estimator:

An object which manages the estimation and decoding of a model...

Estimators must provide a fit method, and should provide set_params and get_params, although these are usually provided by inheritance from base.BaseEstimator.

transformer:

An estimator supporting transform and/or fit_transform...

As in @VivekKumar's answer, I think there's a tendency to use the word estimator for what sklearn instead calls a "predictor":

An estimator supporting predict and/or fit_predict. This encompasses classifier, regressor, outlier detector and clusterer...
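These glossary definitions are duck-typed, so they can be checked mechanically; `roles` below is a hypothetical helper written for this answer:

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

def roles(est_cls):
    """Classify a class using the glossary's duck-typed definitions."""
    r = []
    if hasattr(est_cls, "fit"):
        r.append("estimator")
    if hasattr(est_cls, "transform") or hasattr(est_cls, "fit_transform"):
        r.append("transformer")
    if hasattr(est_cls, "predict") or hasattr(est_cls, "fit_predict"):
        r.append("predictor")
    return r

print(roles(StandardScaler))    # ['estimator', 'transformer']
print(roles(LinearRegression))  # ['estimator', 'predictor']
print(roles(KMeans))            # ['estimator', 'transformer', 'predictor']
```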

Ben Reiniger
  • fit_transform is supported with the help of a mixin class. – rawwar Nov 30 '20 at 16:50
  • 1
    @InAFlash, yes, from a code-base perspective most estimators and transformers are established by using the mixins `BaseEstimator` and `TransformerMixin` (and I upvoted your answer for that). But I think the glossary gives a slightly more broad definition, and clarifies their distinction between "estimator" and "predictor". – Ben Reiniger Nov 30 '20 at 17:29