
I want to do hyperparameter tuning for a neural net built with Keras. For this project I manage my config.yaml files with Hydra, use MLflow to store the metrics and parameters from the optimization, and use Ray to parallelize the computation of the optimization.

This is the first time I have worked with these tools, so I did a bit of research and tried some easy examples to get familiar with them.

  1. I tried this example to get familiar with Ray Tune and the Keras autologger. Everything works fine!
  2. Then I tried this example to get familiar with Ray Tune and the MLFlowLoggerCallback. Everything works!
  3. I implemented the MLFlowLoggerCallback within the first example. This also works well!

In all these examples the param_space is represented by a dictionary like the following:

    param_space={
        "threads": 2,
        "lr": tune.uniform(0.001, 0.1),
        "momentum": tune.uniform(0.1, 0.9),
        "hidden": tune.randint(32, 512),
    }
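
For context, this is roughly how the working tuner setup from those examples looks (simplified; train_fn is a placeholder for my training function, and the exact import path of the MLflow callback depends on the Ray version):

    from ray import air, tune
    from ray.air.integrations.mlflow import MLflowLoggerCallback

    def train_fn(config):
        # build and fit the Keras model from config["lr"], config["hidden"], ...
        # and report the metrics back to Tune
        ...

    tuner = tune.Tuner(
        train_fn,
        param_space={
            "threads": 2,
            "lr": tune.uniform(0.001, 0.1),
            "momentum": tune.uniform(0.1, 0.9),
            "hidden": tune.randint(32, 512),
        },
        tune_config=tune.TuneConfig(num_samples=10),
        run_config=air.RunConfig(
            callbacks=[MLflowLoggerCallback(experiment_name="model_1")]
        ),
    )
    results = tuner.fit()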

So far so good! But because I have organized my project configuration with Hydra and some yaml files, I want to use Hydra within Ray Tune as well. The config structure looks like:

    config
    ├── data
    │   └── ...
    ├── ml
    │   ├── model_1_static.yaml
    │   └── model_1_tune.yaml
    └── config.yaml

The relevant yaml files for this task are in the ml folder. model_1_tune.yaml is structured as follows:

    EPOCHS: 20
    LOSS: mse
    MODEL_DENSE:
      NUM_LAYERS: tune.randint(2, 5)
      LEARNING_RATE: tune.uniform(0.001, 0.1)
      ACTIVATION: ...
      ...

So here is the additional challenge: the content has to be evaluated before it is passed to the tuning process, because I want to define all parameters (incl. the randint, uniform, ...) within one file.
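
My idea would be a small helper that walks the loaded config and replaces strings like tune.randint(2, 5) with the actual search-space objects before the dict is handed to Ray. A rough sketch (the names build_param_space and _TUNE_FNS are made up, and it only covers simple numeric arguments):

    import re

    from ray import tune

    # map of the function names allowed in the yaml to the real tune API
    _TUNE_FNS = {
        "randint": tune.randint,
        "uniform": tune.uniform,
        "loguniform": tune.loguniform,
    }
    _PATTERN = re.compile(r"tune\.(\w+)\((.*)\)")

    def _num(text):
        # parse "2" as int and "0.001" as float
        return float(text) if "." in text else int(text)

    def _maybe_eval(value):
        # turn a "tune.xxx(...)" string into a search-space object, else pass it through
        if isinstance(value, str):
            match = _PATTERN.fullmatch(value.strip())
            if match and match.group(1) in _TUNE_FNS:
                args = [_num(arg) for arg in match.group(2).split(",")]
                return _TUNE_FNS[match.group(1)](*args)
        return value

    def build_param_space(node):
        # recurse over the nested dicts from OmegaConf.to_container(cfg)
        if isinstance(node, dict):
            return {key: build_param_space(val) for key, val in node.items()}
        return _maybe_eval(node)

This feels a bit hand-rolled, though.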

On the other hand, I have the model_1_static.yaml to define the parameters for the model if I do not do a parameter optimization. Thanks to Hydra, it is super easy to change the filename within the config.yaml so that model_1_static.yaml is used instead. This file looks as follows:

    EPOCHS: 20
    LOSS: mse
    MODEL_DENSE:
      NUM_LAYERS: 3
      LEARNING_RATE: 0.003
      ACTIVATION: ...
      ...
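
For completeness, the switch in config.yaml is just the usual Hydra defaults list, roughly like this (a sketch; the group names are assumed from the folder layout above):

    defaults:
      - _self_
      - ml: model_1_static   # change to model_1_tune for a tuning run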

To build the model I wrote a little class that takes the cfg and builds the desired model. It looks like this (minimal example):

    from tensorflow.keras import layers

    class ModelCNN:
        def __init__(self, cfg):
            # read the (static or sampled) hyperparameters from the Hydra config
            self.EPOCHS = cfg.ml.EPOCHS
            self.NUM_LAYERS_DENSE = cfg.ml.MODEL_DENSE.NUM_LAYERS
            self.ACTIVATION_DENSE = cfg.ml.MODEL_DENSE.ACTIVATION

        def build_model(self):
            ...
            dense = layers.Dense(64, activation=self.ACTIVATION_DENSE)
            x = dense(inputs)
            ...

So my preferred solution would be to use only this one class to 1) do hyperparameter optimization and 2) build a simple model. But it only works for building a simple model (when I pass model_1_static.yaml). If I want to do hyperparameter optimization with Ray Tune and MLflow (like in the examples given above), it does not work, because the OmegaConf dicts from Hydra are transformed into normal Python dicts within Ray Tune. So I cannot address the values in the dict with cfg.ml.MODEL_DENSE.NUM_LAYERS anymore. Also, the tune strings within the .yaml must be evaluated.
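
The closest workaround I can think of (not sure it is idiomatic) is to hand Ray a plain dict and re-wrap the sampled config inside the trainable, so the dotted access in ModelCNN keeps working; train_fn is again just a placeholder:

    from omegaconf import OmegaConf

    def train_fn(config):
        # Ray passes the trainable a plain dict with the sampled values;
        # wrapping it again restores access like cfg.ml.MODEL_DENSE.NUM_LAYERS
        cfg = OmegaConf.create(config)
        model = ModelCNN(cfg)
        model.build_model()
        ...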

Maybe I am blind in this case ... Is there a simple solution to this problem? Maybe a good example that combines the three tools Ray Tune, MLflow, and Hydra successfully?

Thanks for your help! Best regards, Patrick
