
I have a main config file, let's say config.yaml:

num_layers: 4
embedding_size: 512
learning_rate: 0.2
max_steps: 200000

I'd like to be able to override this, on the command-line, with another file, like say big_model.yaml, which I'd use conceptually like:

python my_script.py --override big_model.yaml

and big_model.yaml might look like:

num_layers: 8
embedding_size: 1024

I'd like to be able to override with an arbitrary number of such files, each one taking priority over the last. Let's say I also have fast_learn.yaml:

learning_rate: 2.0

And so I'd then want to conceptually do something like:

python my_script.py --override big_model.yaml --override fast_learn.yaml

What is the easiest/most standard way to do this in Hydra (or perhaps in OmegaConf)?

(Note that I'd ideally like these override files to be standard YAML files that override the earlier YAML files; though if I have to use the override DSL instead, I can do that, if that's the easiest/best/most standard way.)

Hugh Perkins
  • You may be interested in my answer to [this](https://stackoverflow.com/q/67715171) question. – Jasha Jun 08 '21 at 08:49

2 Answers


It sounds like package overrides might be a good solution for you.

The documentation can be found here: https://hydra.cc/docs/next/advanced/overriding_packages

An example application can be found here: https://github.com/facebookresearch/hydra/tree/master/examples/advanced/package_overrides

Using that example application, you can achieve the override by doing something like:

$ python simple.py db=postgresql db.pass=helloworld
db:
  driver: postgresql
  user: postgre_user
  pass: helloworld
  timeout: 10

Jieru Hu
  • Ok. I've seen the concept of 'config groups', where one can choose eg a specific dataset yaml file, or a specific database yaml file. Is there a way of mixing and matching arbitrary yaml files, without eg creating a folder/group for each of those files? – Hugh Perkins Oct 29 '20 at 16:06
  • You can use this: https://hydra.cc/docs/tutorials/basic/your_first_app/defaults#non-config-group-defaults But this can only be specified in a defaults list (in a file). You can also override the config name via the command line with --config-name, which will allow you to select different default lists. – Omry Yadan Oct 29 '20 at 17:51
  • @OmryYadan OK, can the values specified in the defaults list override values in earlier config files? Or can those only form new child nodes in the config hierarchy, and thus won't override earlier values? – Hugh Perkins Oct 29 '20 at 20:41
  • Read about the defaults list in the docs. the defaults list is not overriding config values. – Omry Yadan Oct 29 '20 at 20:54
  • @HughPerkins, re-reading your question and my answer - I think this is best addressed in a chat. The content of elements composed via the defaults list can definitely be used to override config values. – Omry Yadan Nov 18 '20 at 18:11
  • @HughPerkins Did you ever resolve whether your original question could be addressed using package overriding? I believe what I'm asking here: https://stackoverflow.com/questions/67715171/fb-hydra-how-to-get-inner-configurations-to-inherit-outer-configuration-fields is very similar to your problem, and it seems to me that package overriding is the only possibility, I think groups don't really address this. – Mike May 29 '21 at 23:07
  • I switched to using omegaconf, which hydra runs on top of, directly instead, in the end. – Hugh Perkins May 31 '21 at 00:09

Refer to the basic tutorial and read about config groups.

You can create arbitrary config groups and select one option from each (as of Hydra 1.0, config group options are mutually exclusive). You will need two config groups here: one can be model, with normal, small, and big options, and another can be trainer, with maybe normal and fast options.

Config groups can also override things in other config groups. You can also always append to the defaults list from the command line, so you can add additional config groups that are only used on the command line. An example of that is an 'experiment' config group. You can use it as:

$ python train.py +experiment=exp1

In config groups like this, which override things across the entire config, you should use the global package (read more about packages in the docs):

# @package _global_
num_layers: 8
embedding_size: 1024
learning_rate: 2.0
Omry Yadan
  • Thanks! Will have a try :) – Hugh Perkins Oct 30 '20 at 15:18
  • I'd like these files to be fairly arbitrary files on the whole. Like, imagine I have some default configuration for training a model. Then I have a bunch of experiments where I change just certain values. I don't want to create a config group for each experiment. Nor do I want to copy and paste the entire giant config. I'd like to just be able to point to experiment-specific yaml files on the commandline, which will override the main config file. – Hugh Perkins Nov 09 '20 at 08:55
  • maybe I should use omegaconf directly for this? – Hugh Perkins Nov 09 '20 at 08:56
  • I suggested that you create ONE config group for all experiments. Files in it can override specific values in the base config. – Omry Yadan Nov 10 '20 at 01:58
  • Ah. Makes sense. For now, I've ended up using argparse to read in a list of default configs, additional configs, and manual overrides, then use omegaconf to load the config files, load the manual overrides, and merge these. This works ok. It also avoids issues with working directory changing, and hydra.yaml being saved into hydra. I'm using hydra now uniquely for instantiation. – Hugh Perkins Nov 11 '20 at 02:11
  • @OmryYadan Is the `experiment` config group declared through the directory structure? If so, why is there a `+` sign? Isn't that only for introducing new arguments? If it is not declared and is only added on the command line, how can Hydra know that `exp1` is the name of a config group and not just a string? Is there maybe a complete example of this approach I could take a look at? – talz Aug 16 '22 at 12:14
  • + is needed if experiment is not mentioned in the defaults list (which is typical for this use case). https://hydra.cc/docs/patterns/configuring_experiments/ – Omry Yadan Aug 16 '22 at 17:28