What is the best way to perform hyperparameter optimization for a PyTorch model? Should I implement e.g. random search myself? Use scikit-learn? Or is there anything else I am not aware of?
4 Answers
Many researchers use Ray Tune. It's a scalable hyperparameter tuning framework, built specifically for deep learning. You can easily use it with any deep learning framework (the integration is just a couple of lines of code, shown below), and it provides most state-of-the-art algorithms, including HyperBand, Population-Based Training, Bayesian Optimization, and BOHB.
```python
import torch.optim as optim
from ray import tune
from ray.tune.examples.mnist_pytorch import get_data_loaders, ConvNet, train, test

def train_mnist(config):
    train_loader, test_loader = get_data_loaders()
    model = ConvNet()
    optimizer = optim.SGD(model.parameters(), lr=config["lr"])
    for i in range(10):
        train(model, optimizer, train_loader)
        acc = test(model, test_loader)
        tune.report(mean_accuracy=acc)

analysis = tune.run(
    train_mnist, config={"lr": tune.grid_search([0.001, 0.01, 0.1])})

print("Best config: ", analysis.get_best_config(metric="mean_accuracy"))

# Get a dataframe for analyzing trial results.
df = analysis.dataframe()
```
[Disclaimer: I contribute actively to this project!]

- I tried that example but got an error, which I posted here: https://stackoverflow.com/questions/62371787/an-error-while-using-tune-for-hyper-parameters-a-deep-learning-model. Could you please help me with that? – LamaMo Jun 14 '20 at 11:21
Here is what I found:

Some more young projects:

- hypersearch: limited to FC layers only.
- skorch: just grid search available.
- Auto-PyTorch

UPDATE, something new:

Also, I found a useful table in a post by @Richard Liaw.

- Looks like you've linked to the wrong repo for HyperOpt. [This](https://github.com/hyperopt/hyperopt) is the correct URL. – Krishna Penukonda Jun 08 '20 at 19:35
- There is also [SHERPA](https://github.com/sherpa-ai/sherpa), which also has a nice table of comparisons. – drevicko Aug 10 '20 at 04:10
You can use Bayesian optimization (full disclosure: I've contributed to this package) or Hyperband. Both of these methods attempt to automate the hyperparameter tuning stage. Hyperband is supposedly the state of the art in this space, and it's the only parameter-free method I've heard of other than random search. You can also look into using reinforcement learning to learn the optimal hyperparameters if you prefer.

The simplest parameter-free way to do black-box optimisation is random search, and it will explore high-dimensional spaces faster than a grid search. There are papers on this, but the tl;dr is that with random search you get different values on every dimension each time, while with grid search you don't.
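That difference is easy to demonstrate on a toy objective where only one of two hyperparameters matters (a hypothetical objective, purely for illustration): with the same budget of 9 trials, a 3×3 grid tries only 3 distinct values of the important dimension, while random search tries 9.

```python
import random

# Hypothetical objective: only `lr` matters, `momentum` is irrelevant.
# Higher is better; the optimum is at lr = 0.3.
def objective(lr, momentum):
    return -(lr - 0.3) ** 2

def grid_search(n_per_dim):
    # n_per_dim**2 trials, but only n_per_dim distinct lr values tried
    grid = [i / (n_per_dim - 1) for i in range(n_per_dim)]
    return max(objective(lr, m) for lr in grid for m in grid)

def random_search(n_trials, seed=0):
    # n_trials trials, and n_trials distinct lr values tried
    rng = random.Random(seed)
    return max(objective(rng.random(), rng.random()) for _ in range(n_trials))

best_grid = grid_search(3)    # 9 trials, lr sampled only at {0.0, 0.5, 1.0}
best_rand = random_search(9)  # 9 trials, 9 distinct lr values
```

With this seed, random search lands much closer to the optimum than the grid's best point, even though both spend exactly 9 evaluations.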
Bayesian optimisation has good theoretical guarantees (despite the approximations), and implementations like Spearmint can wrap any script you have; there are hyperparameters, but users don't see them in practice. Hyperband got a lot of attention by showing faster convergence than naive Bayesian optimisation. It was able to do this by running different networks for different numbers of iterations, which Bayesian optimisation doesn't support natively. While it is possible to do better with a Bayesian optimisation algorithm that takes this into account, such as FABOLAS, in practice Hyperband is so simple that you're probably better off using it and watching it to tune the search space at intervals.
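For intuition about "different networks for different numbers of iterations": Hyperband's inner loop, successive halving, fits in a few lines. This is only a sketch; `evaluate` below is a hypothetical stand-in for "validation score after `budget` epochs" (monotone in budget and in a config's latent quality), not a real training run.

```python
import random

def evaluate(quality, budget):
    # Hypothetical proxy for validation score after `budget` epochs:
    # improves with budget, and better configs score higher.
    return quality * (1 - 1 / (budget + 1))

def successive_halving(configs, min_budget=1, eta=2):
    budget = min_budget
    while len(configs) > 1:
        # score every surviving config at the current (small) budget
        scores = {c: evaluate(c, budget) for c in configs}
        # keep the top 1/eta fraction, then grow the budget by eta
        keep = max(1, len(configs) // eta)
        configs = sorted(configs, key=scores.get, reverse=True)[:keep]
        budget *= eta
    return configs[0]

rng = random.Random(42)
configs = [rng.random() for _ in range(16)]  # latent "quality" per config
best = successive_halving(configs)
```

Most of the total compute goes to the few survivors that reach large budgets, which is why Hyperband can discard bad configurations cheaply. Hyperband proper repeats this with several trade-offs between the number of configs and the starting budget.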
