Questions tagged [scikit-optimize]

Questions regarding the use of scikit-optimize, a Python package for model-based sequential optimization of costly black-box functions.

47 questions
5
votes
1 answer

XGBoost and scikit-optimize: BayesSearchCV and XGBRegressor are incompatible - why?

I have a very large dataset (7 million rows, 54 features) that I would like to fit a regression model to using XGBoost. To train the best possible model, I want to use BayesSearchCV from scikit-optimize to run the fit repeatedly for different…
4
votes
1 answer

What is the kappa variable (BayesianOptimization)?

I read some posts and tutorials about BayesianOptimization, but I never saw an explanation of the kappa variable. What is the kappa variable? How can it help us? How can its value influence the BayesianOptimization process?
Boom
  • 1,145
  • 18
  • 44
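
For context on the question above: in a confidence-bound acquisition function, kappa typically weights the surrogate model's predicted uncertainty against its predicted mean. The following is a minimal sketch in plain Python, assuming a minimization setting. The function names and the candidate predictions are hypothetical and do not come from any library.

```python
def lower_confidence_bound(mean, std, kappa):
    """Lower confidence bound for minimization: smaller is more promising."""
    return mean - kappa * std


# Hypothetical surrogate predictions (mean, std) at three candidate points.
candidates = {
    "well_explored": (1.0, 0.1),
    "uncertain": (1.5, 1.0),
    "poor_but_known": (3.0, 0.05),
}


def pick_next(kappa):
    """Return the candidate with the lowest confidence bound for this kappa."""
    return min(
        candidates,
        key=lambda name: lower_confidence_bound(*candidates[name], kappa),
    )


print(pick_next(0.0))  # kappa = 0 ignores uncertainty (pure exploitation)
print(pick_next(2.0))  # a larger kappa rewards uncertain regions (exploration)
```

In this sketch a small kappa favors points the model already believes are good, while a large kappa favors points where the model is unsure. In scikit-optimize, `gp_minimize` exposes a `kappa` argument that is used with `acq_func="LCB"`; other packages may define or scale kappa differently.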
4
votes
2 answers

lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0)

There are a couple of other questions similar to this, but I couldn't find a solution that seems to fit. I am using LightGBM with Scikit-Optimize BayesSearchCV. full_pipeline = skl.Pipeline(steps=[('preprocessor', pre_processor), …
Lucy
  • 179
  • 1
  • 4
  • 14
3
votes
4 answers

TypeError inside the `scikit-optimize` package

When I use scikit-optimize version 0.7.4 to optimize a scikit-learn 0.23 model: rf = BayesSearchCV( RandomForestClassifier( min_samples_leaf=0.01, oob_score=True ), { 'n_estimators': Integer(30, 200), …
2
votes
2 answers

How to fix the 'numpy.int' attribute error when using skopt.BayesSearchCV in scikit-learn?

When I run the following code from the official documentation, it raises an error. Minimal example from skopt import BayesSearchCV from sklearn.datasets import load_digits from sklearn.svm import SVC from sklearn.model_selection import…
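
For context on the question above: NumPy deprecated the `np.int` alias in 1.20 and removed it in 1.24, so older library code that still references it fails with an attribute error. One commonly used stopgap is to restore the alias before importing the affected package. The sketch below shows the idea on a stand-in module so that it runs without NumPy installed. The helper name `restore_alias` and the `fake_numpy` module are hypothetical.

```python
import types


def restore_alias(module, name, replacement):
    """Add `name` to `module` only if it is missing; return True if added."""
    if hasattr(module, name):
        return False
    setattr(module, name, replacement)
    return True


# Stand-in module so the sketch runs without NumPy installed.
fake_numpy = types.ModuleType("fake_numpy")
added = restore_alias(fake_numpy, "int", int)
print(added, fake_numpy.int is int)
```

With a real installation, the equivalent would be `import numpy` followed by `restore_alias(numpy, "int", int)` before `import skopt`. This is a workaround rather than a fix; upgrading scikit-optimize or pinning an older NumPy release are the cleaner options where they are available.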
2
votes
1 answer

Train and test data setup for sklearn

I'm creating a classification model to predict the outcome of a sports event (win/loss) and am running into a data setup conundrum. Currently the data is set up as follows: example_data = [team_a_feat_1, team_a_feat_2...team_b_feat_1, team_b_feat_2...…
2
votes
1 answer

How are the test scores in cv_results_ and best_score_ calculated in scikit-optimize?

I'm using BayesSearchCV from scikit-optimize to optimise an XGBoost model to fit some data I have. While the model fits fine, I am puzzled by the scores provided in the diagnostic information and am unable to replicate them. Here's an example script…
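
For context on the question above: in scikit-learn's search estimators, whose conventions BayesSearchCV is understood to follow, `best_score_` is the mean cross-validated test score of the best candidate, not a score on the full training set. The sketch below reproduces that arithmetic in plain Python. The per-fold scores are hypothetical, and the variable names only mirror the attribute names for readability.

```python
from statistics import mean

# Hypothetical per-fold test scores for three sampled parameter settings.
fold_scores = [
    [0.80, 0.82, 0.78],
    [0.85, 0.83, 0.87],
    [0.90, 0.70, 0.74],
]

# One mean per candidate, analogous to cv_results_["mean_test_score"].
mean_test_score = [mean(scores) for scores in fold_scores]

# The best candidate is the one with the highest mean test score.
best_index = max(range(len(mean_test_score)), key=mean_test_score.__getitem__)
best_score = mean_test_score[best_index]

print(best_index, best_score)
```

Because each fold score comes from a model trained on only part of the data, scoring the refitted final estimator on the same data will generally not reproduce these numbers.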
2
votes
0 answers

Running multiprocessing Pool.map multiple times in one program ends up blocking

I am trying to optimize a function that is relatively expensive to evaluate. The function operates across a series of data points, and can be evaluated in parallel. Each data point evaluation requires access to global data, so I am using ctype and…
1
vote
0 answers

Skorch NeuralNetRegressor and GridSearchCV - Custom Parameters

I have the following model defined, that I would like to apply Hyperparameter tuning to. I want to use GridSearchCV and change the number of layers etc. class Regressor(nn.Module): def __init__(self, n_layers=3, n_features=10,…
1
vote
0 answers

Subclass EarlyStopper in scikit-optimize

I can't figure out how to subclass EarlyStopper to use it as a callback in scikit-optimize (gp_minimize), based on the documentation. How should I approach subclassing it? Documentation:…
Henri
  • 1,077
  • 10
  • 24
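
For context on the question above: as far as the scikit-optimize source suggests, `EarlyStopper.__call__` delegates to a `_criterion(result)` method, and the optimization loop stops when a callback returns `True`. The sketch below mirrors that shape in a standalone class so that it runs without scikit-optimize installed. The class name `StopBelowThreshold`, the threshold, and the stand-in result objects are hypothetical.

```python
from types import SimpleNamespace


class StopBelowThreshold:
    """Callback that requests a stop once the best objective value is low enough."""

    def __init__(self, threshold):
        self.threshold = threshold

    def _criterion(self, result):
        # `result.func_vals` is assumed to hold every objective value seen so far.
        return min(result.func_vals) <= self.threshold

    def __call__(self, result):
        return self._criterion(result)


# Stand-in result objects so the sketch runs without scikit-optimize installed.
stopper = StopBelowThreshold(threshold=0.1)
keep_going = stopper(SimpleNamespace(func_vals=[0.9, 0.4]))
stop_now = stopper(SimpleNamespace(func_vals=[0.9, 0.4, 0.05]))
print(keep_going, stop_now)
```

With scikit-optimize installed, the class would presumably inherit from `skopt.callbacks.EarlyStopper`, keep only `__init__` and `_criterion`, and be passed as `callback=[stopper]` to `gp_minimize`.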
1
vote
1 answer

Output from noisy optimization in skopt.gp_minimize

When using skopt.gp_minimize on a noisy dataset with unknown variance - is the returned minimum the x-values found for one specific sample of data or the minimum of the surrogate function? And either way - is it possible to specify which one is…
1
vote
0 answers

How to run scikit-optimize's gp_minimize in parallel?

I am unable to make skopt.gp_minimize run on multiple cores. According to the documentation, the parameter n_jobs should set the number of cores. However, setting n_jobs>1 seems to have no effect. Here is a minimal example that reproduces the…
Botond
  • 2,640
  • 6
  • 28
  • 44
1
vote
0 answers

How to restart BayesSearchCV from a checkpoint

I am performing hyperparameter search over a large space using skopt.BayesSearchCV. I am running it on a machine that restarts at 7pm every day. I want to be able to save the state of the BayesSearchCV so I can restart the search from where I left…
jet457
  • 211
  • 2
  • 5
1
vote
0 answers

Can the ask function in scikit-optimize be parallelized?

I'm using the skopt (scikit-optimize) package with the ask-tell syntax. I'm using Python 3.7 on a Windows machine. The ask function call takes a long time (first call ~1 minute, then increases 1 minute for each iteration, so ultimately as much as…
1
vote
0 answers

Pipeline that scales, then tunes hyperparameters in a nested RFECV. What am I doing wrong?

I'm trying to build a basic ML pipeline that will select features while tuning hyperparameters at the same time. The code is below. #pipeline for full feature selection - hyperparametertuning starttime = timeit.default_timer() scaler =…