Highest Voted 'scikit-learn-pipeline' Questions

43

votes

4 answers

Invalid parameter for sklearn estimator pipeline

I am implementing an example from the O'Reilly book "Introduction to Machine Learning with Python", using Python 2.7 and sklearn 0.16. The code I am using: pipe = make_pipeline(TfidfVectorizer(), LogisticRegression()) param_grid =…

asked Jan 27 '17 at 16:55

sudo_coffee

888
1
12
26

41

votes

4 answers

return coefficients from Pipeline object in sklearn

I've fit a Pipeline object with RandomizedSearchCV: pipe_sgd = Pipeline([('scl', StandardScaler()), ('clf', SGDClassifier(n_jobs=-1))]) param_dist_sgd = {'clf__loss': ['log'], 'clf__penalty': [None, 'l1', 'l2',…

python machine-learning scikit-learn cross-validation scikit-learn-pipeline

asked May 08 '17 at 19:56

spies006

2,867
2
19
28

25

votes

2 answers

Is it possible to toggle a certain step in sklearn pipeline?

I wonder if we can set up an "optional" step in sklearn.pipeline. For example, for a classification problem, I may want to try an ExtraTreesClassifier with AND without a PCA transformation ahead of it. In practice, it might be a pipeline with an…

python machine-learning scikit-learn scikit-learn-pipeline

asked Oct 09 '13 at 03:34

dolaameng

1,397
2
17
24

10

votes

3 answers

How to gridsearch over transform arguments within a pipeline in scikit-learn

My goal is to use one model to select the most important variables and another model to use those variables to make predictions. In the example below I am using two instances of RandomForestClassifier, but the second model could be any other…

python machine-learning scikit-learn scikit-learn-pipeline

asked Apr 19 '14 at 20:13

Jason Sanchez

477
2
6
19

5

votes

2 answers

How to create pandas output for custom transformers?

scikit-learn 1.2.0 brought many changes, including pandas output support for all transformers, but how can I use it in a custom transformer? Here is my custom transformer, which is a standard scaler: from sklearn.base import…

python machine-learning scikit-learn scikit-learn-pipeline

asked Jan 06 '23 at 03:14

Armando Bridena

237
3
10

5

votes

2 answers

Sklearn Pipeline: How to build for kmeans, clustering text?

I have text as shown: list1 = ["My name is xyz", "My name is pqr", "I work in abc"] The above will be the training set for clustering text using kmeans. list2 = ["My name is xyz", "I work in abc"] The above is my test set. I have built a vectorizer…

python machine-learning scikit-learn k-means scikit-learn-pipeline

asked Nov 13 '14 at 07:01

user1452759

8,810
15
42
58

4

votes

1 answer

How can I check the changes made by Scikit-Learn Pipeline?

This is a very straightforward question, but I couldn't find the answer anywhere. I tried Google, TDS, Analytics Vidhya, StackOverflow, etc. So here's the thing: I'm using Scikit-Learn Pipelines, but I wanted to see how my data was treated by the…

python scikit-learn pipeline scikit-learn-pipeline

asked Aug 14 '21 at 13:12

Yuxxxxxx

203
1
5

3

votes

0 answers

How to use a different feature set for each estimator in a multi-estimator sklearn pipeline

Below is an example sklearn pipeline. There are two sklearn StackingClassifiers: stackingclassifier1 with base classifier as RandomForestClassifier & stackingclassifier2 as Meta Learner. stackingclassifier2 with base classifier as…

python scikit-learn scikit-learn-pipeline

asked Nov 21 '22 at 15:04

Jyoti Hassanandani

91
5

3

votes

1 answer

SimpleImputer object has no attribute _fit_dtype

I have a trained scikit-learn model pipeline (including a SimpleImputer) that I'm trying to put into production. However, I get the following error when running it in the production environment. SimpleImputer object has no attribute _fit_dtype How…

python scikit-learn dev-to-production scikit-learn-pipeline

asked Nov 17 '22 at 23:04

Jakob

663
7
25

3

votes

1 answer

How can I get features names when there is a preprocessor before feature selection?

I tried checking some posts like this, this and this but I still couldn't find what I need. These are the transformations I'm doing: cat_transformer = Pipeline(steps=[("encoder", TargetEncoder())]) num_transformer = Pipeline( steps=[ …

python machine-learning scikit-learn feature-selection scikit-learn-pipeline

asked Nov 08 '22 at 13:15

dsbr__0

241
1
3

3

votes

0 answers

How to fit Sklearn Pipeline on Catboost Classifier with Embedding features

I have a Catboost Classifier that predicts on some embedding features, and AFAIK these embedding features can only be specified through Pools (meaning I have to create a pool and then pass the pool for the Catboost classifier's .fit method in order…

scikit-learn embedding catboost scikit-learn-pipeline

asked Jul 04 '22 at 10:47

Edouard Malet

51
1

3

votes

1 answer

How to train an sklearn pipeline in AWS?

Working within a Sagemaker Jupyter Notebook I have an XGBoost pipeline which transforms my data and also runs some feature selection: steps_xgb = [('scaler', MinMaxScaler()), ('feature_reduction', SelectKBest(mutual_info_classif)), …

amazon-web-services scikit-learn xgboost amazon-sagemaker scikit-learn-pipeline

asked Mar 24 '22 at 23:59

quantumofnolace

125
7

2

votes

2 answers

How can I use sklearn's make_column_selector to select all valid datetime columns?

I want to select columns based on their datetime data types. My DataFrame has for example columns with types np.dtype('datetime64[ns]'), np.datetime64 and 'datetime64[ns, UTC]'. Is there a generic way to select all columns with a datetime…

python datetime scikit-learn scikit-learn-pipeline

asked Aug 28 '23 at 16:40

JAdel

1,309
1
7
24

2

votes

2 answers

Error finding attribute `feature_names_in_` that exists in docs

I'm getting the error AttributeError: 'LogisticRegression' object has no attribute 'feature_names_in_' even though that attribute is written in the docs. I'm on scikit-learn version 1.0.2. I created an object LogisticRegression and I am trying to…

pandas numpy scikit-learn logistic-regression scikit-learn-pipeline

asked Dec 12 '22 at 16:57

sanderlin2013

31
6

2

votes

1 answer

How to preserve column names in scikit-learn ColumnTransformer?

I'm creating some pipelines using scikit-learn, but I'm having some trouble keeping the variable names as the original names, and not in the transformer_name__feature_name format. This is the scenario: I have a set of transformers, both custom and…

python scikit-learn scikit-learn-pipeline

asked Nov 21 '22 at 20:27

Rodrigo A

657
7
23
