
I am working on a model trained on the MNIST dataset. I am using the torch.optim.Adam optimizer and have been experimenting with tuning the hyperparameters. After running a lot of tests, I have found a combination of hyperparameters that gives 90% accuracy. However, since I am new to this, I feel there might be a more efficient way to find the optimal values. The brute-force approach relies on trial and error, and I was wondering whether there is a particular strategy for finding these values. An example of the code being used:

import time

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

if __name__ == '__main__':
    end = time.time()
    model_ft = Net().to(device)  # Net and device are defined elsewhere
    print(model_ft.network)
    criterion = nn.CrossEntropyLoss()

    optimizer_ft = optim.Adam(model_ft.parameters(), lr=1e-3)

    # Halve the learning rate every 9 epochs
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=9, gamma=0.5)

    history, accuracy = train_test(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                                   num_epochs=15)
  

Here I would like to find the optimal values of:

  1. Learning Rate
  2. Step Size
  3. Gamma
  4. Number of Epochs

Any help is much appreciated!
AloneTogether
JANVI SHARMA

1 Answer

It seems a similar question has already been answered in depth.

In short, however, you can use something called grid search: you specify the values you want to try for each hyperparameter, and grid search evaluates every combination of them. This link shows how to do it with PyTorch.
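As a minimal sketch of the idea (the search space below is hypothetical, and `evaluate` is a dummy stand-in for the question's `train_test`, so the script runs without a GPU or dataset):

```python
import itertools

# Hypothetical candidate values for the four hyperparameters in the question.
param_grid = {
    "lr": [1e-2, 1e-3, 1e-4],
    "step_size": [5, 9],
    "gamma": [0.5, 0.1],
    "num_epochs": [10, 15],
}

def evaluate(params):
    """Stand-in for train_test(): train with these hyperparameters and
    return test accuracy. This placeholder just rewards lr near 1e-3."""
    return 1.0 / (1.0 + abs(params["lr"] - 1e-3))

best_acc, best_params = -1.0, None
keys = list(param_grid)
# Try every combination (3 * 2 * 2 * 2 = 24 runs) and keep the best.
for values in itertools.product(*(param_grid[k] for k in keys)):
    params = dict(zip(keys, values))
    acc = evaluate(params)
    if acc > best_acc:
        best_acc, best_params = acc, params

print(best_params)
```

In practice you would replace `evaluate` with a real training run, which is why grid search gets expensive quickly: the number of runs is the product of the number of candidates per hyperparameter.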

The following Medium post goes into more depth on other methods and packages to try, but I think you should start with a simple grid search.
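One common alternative among those methods is random search, which samples hyperparameter combinations from ranges instead of enumerating a fixed grid. A sketch under the same assumption as above (dummy `evaluate` in place of real training, made-up ranges):

```python
import random

random.seed(0)  # reproducible sampling for the sketch

def sample_params():
    """Draw one random hyperparameter combination from hypothetical ranges."""
    return {
        "lr": 10 ** random.uniform(-4, -2),   # log-uniform learning rate
        "step_size": random.randint(3, 12),
        "gamma": random.uniform(0.1, 0.9),
        "num_epochs": random.randint(5, 20),
    }

def evaluate(params):
    # Stand-in for a real training run; rewards lr near 1e-3.
    return 1.0 / (1.0 + abs(params["lr"] - 1e-3))

# Evaluate a fixed budget of 20 random combinations and keep the best.
best = max((sample_params() for _ in range(20)), key=evaluate)
print(best)
```

The budget is fixed up front (20 runs here) regardless of how many hyperparameters you search over, which is why random search tends to scale better than a full grid.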

J.vR
  • Thanks for your answer. For grid search, from what I understand, you first have to specify the candidate hyperparameter values, train the model on each combination, and then pick the best one, right? Is there a way that doesn't require us to input guessed hyperparameter values, where the function converges to the optimal values on its own? – JANVI SHARMA Dec 15 '21 at 04:43