Questions tagged [grid-search]

In machine learning, grid search is an exhaustive search over a predefined grid of candidate values, fitting and evaluating the model once per combination, to find the best parameter(s)/hyperparameter(s) of a model, e.g. mtry for random forest; alpha, beta, lambda for glm; or C, kernel and gamma for SVM.
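
As a rough illustration of the idea in the tag description, here is a minimal, library-free sketch of a grid search. The scoring function `toy_score` is a hypothetical stand-in for "train a model with these hyperparameters and return its validation score"; the helper name `grid_search` and the candidate values for `C` and `gamma` are assumptions chosen for the example, not any library's API.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates the full Cartesian grid of candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma):
    """Hypothetical stand-in for a model's validation score (higher is better)."""
    return -((C - 10) ** 2) - (gamma - 0.1) ** 2


param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]}
best_params, best_score = grid_search(toy_score, param_grid)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (here 4 × 3 = 12 evaluations), which is why grids are usually kept coarse.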

865 questions
57
votes
6 answers

What is the difference between cross-validation and grid search?

In simple words, what is the difference between cross-validation and grid search? How does grid search work? Should I first do cross-validation and then a grid search?
Linda
  • 2,375
  • 4
  • 30
  • 33
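
One way to picture the distinction asked about above: cross-validation scores a single hyperparameter setting, while grid search is the outer loop that tries many settings and typically uses cross-validation as its scoring step. The sketch below assumes toy 1-D data and a hand-written nearest-neighbour regressor; every function name here is hypothetical and this is not how scikit-learn implements GridSearchCV.

```python
from itertools import product
from statistics import mean


def k_fold_splits(n, k):
    """Yield (train, test) index lists; sample i goes to fold i % k."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


def knn_predict(train_x, train_y, x, n_neighbors):
    """Average the targets of the n_neighbors training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return mean(train_y[i] for i in order[:n_neighbors])


def cross_val_score(xs, ys, n_neighbors, k=4):
    """Cross-validation: estimate how well ONE setting generalizes."""
    fold_errors = []
    for train, test in k_fold_splits(len(xs), k):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        errors = [(knn_predict(tx, ty, xs[i], n_neighbors) - ys[i]) ** 2 for i in test]
        fold_errors.append(mean(errors))
    return -mean(fold_errors)  # negated mean squared error, higher is better


def grid_search_cv(xs, ys, param_grid, k=4):
    """Grid search: try every candidate setting, scoring each with cross-validation."""
    names = list(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((cross_val_score(xs, ys, k=k, **params), params))
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_params, best_score, results


# Toy data (an assumption for illustration): a noisy line.
xs = list(range(20))
ys = [2 * x + (1 if x % 2 else -1) for x in xs]
param_grid = {"n_neighbors": [1, 3, 5]}
best_params, best_score, results = grid_search_cv(xs, ys, param_grid)
print(best_params, best_score)
```

So the two are not done one after the other: each grid point gets its own cross-validation run, and the grid point with the best averaged score wins.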
56
votes
2 answers

Sklearn How to Save a Model Created From a Pipeline and GridSearchCV Using Joblib or Pickle?

After identifying the best parameters using a pipeline and GridSearchCV, how do I pickle/joblib this process to re-use later? I see how to do this when it's a single classifier... import joblib joblib.dump(clf, 'filename.pkl') But how do I save…
Jarad
  • 17,409
  • 19
  • 95
  • 154
43
votes
4 answers

Invalid parameter for sklearn estimator pipeline

I am implementing an example from the O'Reilly book "Introduction to Machine Learning with Python", using Python 2.7 and sklearn 0.16. The code I am using: pipe = make_pipeline(TfidfVectorizer(), LogisticRegression()) param_grid =…
sudo_coffee
  • 888
  • 1
  • 12
  • 26
42
votes
10 answers

How to graph grid scores from GridSearchCV?

I am looking for a way to graph grid_scores_ from GridSearchCV in sklearn. In this example I am trying to grid search for best gamma and C parameters for an SVR algorithm. My code looks as follows: C_range = 10.0 ** np.arange(-4, 4) …
kroonike
  • 1,109
  • 2
  • 13
  • 20
40
votes
4 answers

Use sklearn's GridSearchCV with a pipeline, preprocessing just once

I'm using scikit-learn to tune a model's hyperparameters. I'm using a pipeline to chain the preprocessing with the estimator. A simple version of my problem would look like this: import numpy as np from sklearn.model_selection import…
Marc Garcia
  • 3,287
  • 2
  • 28
  • 37
35
votes
1 answer

Using SMOTE with GridSearchCV in scikit-learn

I'm dealing with an imbalanced dataset and want to do a grid search to tune my model's parameters using scikit-learn's GridSearchCV. To oversample the data, I want to use SMOTE, and I know I can include that as a stage of a pipeline and pass it to…
33
votes
3 answers

Early stopping with Keras and sklearn GridSearchCV cross-validation

I wish to implement early stopping with Keras and sklearn's GridSearchCV. The working code example below is modified from How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras. The data set may be downloaded from here. The…
33
votes
5 answers

Is there an easy way to grid search without cross-validation in Python?

There is a very helpful class, GridSearchCV, in scikit-learn to do grid search and cross-validation, but I don't want to do cross-validation. I want to do grid search without cross-validation and use the whole data set to train. To be more specific, I…
ykensuke9
  • 714
  • 2
  • 7
  • 15
32
votes
2 answers

Using GridSearchCV with AdaBoost and DecisionTreeClassifier

I am attempting to tune an AdaBoost Classifier ("ABT") using a DecisionTreeClassifier ("DTC") as the base_estimator. I would like to tune both ABT and DTC parameters simultaneously, but am not sure how to accomplish this - pipeline shouldn't work,…
GPB
  • 2,395
  • 8
  • 26
  • 36
30
votes
5 answers

Try multiple estimators in one grid search

Is there a way to grid-search multiple estimators at a time in sklearn or any other library? For example, can we pass SVM and Random Forest in one grid search?
tj89
  • 3,953
  • 2
  • 12
  • 12
30
votes
3 answers

Random Forest with GridSearchCV - Error on param_grid

I'm trying to create a Random Forest model with GridSearchCV but am getting an error pertaining to param_grid: "ValueError: Invalid parameter max_features for estimator Pipeline. Check the list of available parameters with…
OAK
  • 2,994
  • 9
  • 36
  • 49
29
votes
3 answers

Memory leak using GridSearchCV

Problem: My situation appears to be a memory leak when running GridSearchCV. This happens when I run with 1 or 32 concurrent workers (n_jobs=-1). Previously I have run this loads of times with no trouble on Ubuntu 16.04, but recently upgraded to…
negfrequency
  • 1,801
  • 3
  • 18
  • 30
27
votes
2 answers

How to elegantly pass sklearn's GridSearchCV best parameters to another model?

I have found a set of best hyperparameters for my KNN estimator with Grid Search CV: >>> knn_gridsearch_model.best_params_ {'algorithm': 'auto', 'metric': 'manhattan', 'n_neighbors': 3} So far, so good. I want to train my final estimator with these…
Hendrik
  • 1,158
  • 4
  • 15
  • 30
25
votes
2 answers

Does GridSearchCV perform cross-validation?

I'm currently working on a problem which compares the performance of three different machine learning algorithms on the same data set. I divided the data set into 70/30 training/testing sets and then performed grid search for the best parameters of each…
23
votes
1 answer

ImportError: No module named grid_search, learning_curve

Problem with scikit-learn: I can't use learning_curve of sklearn and sklearn.grid_search. When I do import sklearn (it works) from sklearn.cluster import bicluster (it works). I tried reinstalling scikit-learn, but the same issue remains. I am using…
Andy Hui
  • 420
  • 2
  • 6
  • 17