According to the Keras Tuner examples here and here, if you want to use hyperparameters to define the number of layers and each layer's units in a deep learning model, you do something like this:
```python
for i in range(hp.Int('num_layers', 1, 10)):
    model.add(layers.Dense(units=hp.Int('unit_' + str(i), 32, 512, 32)))
```
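For context, here is a minimal, self-contained version of that pattern (a sketch only; the 784-feature input, 10-class softmax output, optimizer, and loss are placeholder choices of mine, not part of the question):

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(hp):
    # Placeholder architecture: input shape, output layer, optimizer,
    # and loss are illustrative assumptions, not from the question.
    model = keras.Sequential()
    model.add(keras.Input(shape=(784,)))
    for i in range(hp.Int('num_layers', 1, 10)):
        model.add(layers.Dense(units=hp.Int('unit_' + str(i), 32, 512, step=32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```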
However, as others have noted here and here, after the oracle has seen a model with `num_layers = 10` it will always assign a value to `unit_0` through `unit_9`, even when `num_layers` is less than 10.
In the case that `num_layers = 1`, for example, only `unit_0` will be used to build the model, but `unit_1` through `unit_9` will still be defined and active in the hyperparameters.
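One way to observe this is to run a few trials against the sketch above with dummy data and then print the accumulated search space (the data shapes and tuner settings here are illustrative assumptions):

```python
import numpy as np
import keras_tuner as kt

# Dummy data just to let the search run; shapes match the sketch above.
x = np.random.rand(200, 784).astype('float32')
y = np.random.randint(0, 10, size=(200,))

tuner = kt.RandomSearch(build_model,
                        objective='val_accuracy',
                        max_trials=5,
                        overwrite=True,
                        directory='tuning_demo')
tuner.search(x, y, validation_split=0.2, epochs=1)

# Once any trial has sampled a deep model, a unit_i entry exists for every
# depth reached so far, and shallower trials still receive values for all
# of them.
tuner.search_space_summary()
```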
Does the oracle "know" that `unit_1` through `unit_9` weren't actually used to build the model (and therefore disregard their influence on that trial's results)? Or does it assume `unit_1` through `unit_9` are being used because they have been defined (and because calling `hp.get('unit_9')`, for example, will return a value)?
In the latter case the oracle is using misinformation to drive the tuning process. At best it will take longer to converge; at worst it will converge on the wrong solution because it assigns relevance to hyperparameters that were never used.
Should the model actually be defined using conditional scopes, like this?
```python
num_layers = hp.Int('num_layers', 1, 10)
for i in range(num_layers):
    with hp.conditional_scope('num_layers', list(range(i + 1, 10 + 1))):
        model.add(layers.Dense(units=hp.Int('unit_' + str(i), 32, 512, 32)))
```
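Dropped into the same placeholder `build_model` as the sketch above (again, everything around the loop is my own illustrative scaffolding), the conditional-scope version would look like this:

```python
def build_model(hp):
    # Same placeholder architecture as before; only the layer loop changes.
    model = keras.Sequential()
    model.add(keras.Input(shape=(784,)))
    num_layers = hp.Int('num_layers', 1, 10)
    for i in range(num_layers):
        # unit_i is only active while num_layers is at least i + 1.
        with hp.conditional_scope('num_layers', list(range(i + 1, 10 + 1))):
            model.add(layers.Dense(units=hp.Int('unit_' + str(i), 32, 512, step=32),
                                   activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```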
When defining the model like this, if `num_layers < 10`, calling `hp.get('unit_9')` raises `ValueError: Conditional parameter unit_9 is not currently active`, as expected.
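For what it's worth, that error makes it easy to probe which unit hyperparameters a trial actually used; here is a small helper sketch relying only on the `hp.get` behavior described above (`active_units` is a hypothetical name of mine):

```python
def active_units(hp, max_layers=10):
    # Collect the unit counts that are active under the current num_layers
    # value; with conditional scopes, the first inactive unit_i raises
    # ValueError, so everything past the model's actual depth is skipped.
    units = []
    for i in range(max_layers):
        try:
            units.append(hp.get('unit_' + str(i)))
        except ValueError:
            break
    return units
```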