1

Please explain the difference between scikit-learn's ColumnTransformer and make_column_transformer. Also, when should each one be used?

Akshay Sehgal
Rajeev R

2 Answers

4

There is no major difference between the two; both give the same result. As you can see in the docs, ColumnTransformer takes a list of tuples that each include a name, while make_column_transformer takes tuples without a name. The name is helpful when you use GridSearchCV or RandomizedSearchCV: the estimator passed to these can be a nested pipeline of transformers plus a classifier or regressor, and when you build the param_grid for it you refer to each step by the name given in its tuple. The StackOverflow question on nested pipelines and ColumnTransformer in GridSearchCV shows how the naming helps. Generally, I use make_column_transformer if I don't need GridSearchCV.
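As a minimal sketch of the point above, the snippet below builds the same preprocessing both ways and then tunes a pipeline with GridSearchCV. The toy data, the column names (`age`, `city`) and the step names (`prep`, `num`, `cat`, `clf`) are assumptions made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data; column names and values are hypothetical.
X = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63, 38, 44],
    "city": ["a", "b", "a", "c", "b", "c", "a", "b"],
})
y = [0, 1, 0, 1, 0, 1, 0, 1]

# ColumnTransformer: each tuple is (name, transformer, columns).
named = ColumnTransformer(
    [
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    sparse_threshold=0,  # force a dense array so the outputs are easy to compare
)

# make_column_transformer: each tuple is (transformer, columns); no names.
auto = make_column_transformer(
    (StandardScaler(), ["age"]),
    (OneHotEncoder(handle_unknown="ignore"), ["city"]),
    sparse_threshold=0,
)

# Both objects apply the same transformations to the same columns.
same_output = np.allclose(named.fit_transform(X), auto.fit_transform(X))

# The explicit names become part of the parameter keys used in param_grid:
# <pipeline step>__<transformer name>__<parameter>.
pipe = Pipeline([("prep", named), ("clf", LogisticRegression())])
param_grid = {
    "prep__num__with_mean": [True, False],
    "clf__C": [0.1, 1.0],
}
search = GridSearchCV(pipe, param_grid, cv=2)
search.fit(X, y)
print(search.best_params_)
```

With make_column_transformer the same keys still exist, but they use the generated names (for example `prep__standardscaler__with_mean`), which is why explicit names tend to be more convenient once a parameter grid is involved.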

sklearn docs

stackoverflow question

Alex Metsai
  • 1,837
  • 5
  • 12
  • 24
Manish Moond
  • 66
  • 1
  • 6
3

This is well described in the scikit-learn API reference for make_column_transformer:

This is a shorthand for the ColumnTransformer constructor; it does not require, and does not permit, naming the transformers. Instead, they will be given names automatically based on their types. It also does not allow weighting with transformer_weights.
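A short sketch of what the quoted passage means in practice, assuming a tiny made-up DataFrame with columns `a` and `b`: it prints the automatically generated names and shows `transformer_weights` on the ColumnTransformer constructor, where the names `first` and `second` are chosen only for this example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.preprocessing import StandardScaler

# Hypothetical data used only to illustrate the naming behaviour.
X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# Names are derived from the lowercased class names; when the same type
# appears more than once, a numeric suffix keeps them unique.
auto = make_column_transformer(
    (StandardScaler(), ["a"]),
    (StandardScaler(), ["b"]),
)
auto_names = [name for name, _, _ in auto.transformers]
print(auto_names)

# transformer_weights is keyed by transformer name and is only available
# on the ColumnTransformer constructor.
weighted = ColumnTransformer(
    [
        ("first", StandardScaler(), ["a"]),
        ("second", StandardScaler(), ["b"]),
    ],
    transformer_weights={"second": 2.0},
)

out_auto = auto.fit_transform(X)
out_weighted = weighted.fit_transform(X)
# The "second" block is multiplied by its weight; "first" is unchanged.
print(out_weighted)
```

So if you need to choose the names yourself or weight the outputs of individual transformers, the ColumnTransformer constructor is the one to use.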

LazyEval
  • 769
  • 1
  • 8
  • 22