I trained multiple models with different configurations for a custom hyperparameter search. I use pytorch_lightning and its logging (TensorBoardLogger). When the training script runs after Task.init(), ClearML auto-creates a Task and connects the logger output to the server.
For each training stage (train, val and test) I log the following scalars at each epoch: loss, acc and iou. When I have multiple configurations, e.g. networkA and networkB, the first training logs its values to loss, acc and iou, but the second logs to networkB:loss, networkB:acc and networkB:iou. This makes the values incomparable.
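For illustration, a minimal sketch of that kind of logging, assuming standard self.log() calls in a LightningModule; the network, metrics and tag names here are placeholders, not my actual code:

import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(32, 2)  # placeholder network

    def forward(self, x):
        return self.net(x)

    def _shared_step(self, batch, stage):
        x, y = batch
        logits = self(x)
        loss = F.cross_entropy(logits, y)
        acc = (logits.argmax(dim=1) == y).float().mean()
        iou = acc  # placeholder; a real IoU metric would go here
        # one scalar per stage and metric, aggregated per epoch
        self.log(f"{stage}/loss", loss, on_epoch=True)
        self.log(f"{stage}/acc", acc, on_epoch=True)
        self.log(f"{stage}/iou", iou, on_epoch=True)
        return loss

    def training_step(self, batch, batch_idx):
        return self._shared_step(batch, "train")

    def validation_step(self, batch, batch_idx):
        self._shared_step(batch, "val")

    def test_step(self, batch, batch_idx):
        self._shared_step(batch, "test")

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())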
My training loop with Task initialization looks like this:
from clearml import Task

names = ['networkA', 'networkB']

for name in names:
    task = Task.init(project_name="NetworkProject", task_name=name)
    pl_train(name)
    task.close()
The method pl_train is a wrapper around the whole training with PyTorch Lightning. No ClearML code is inside this method.
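Roughly, pl_train corresponds to something like the following (a sketch with dummy data, reusing the LitModel placeholder from above; the real wrapper differs):

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

def pl_train(name):
    # dummy data so the sketch is self-contained
    data = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))
    loader = DataLoader(data, batch_size=16)
    trainer = Trainer(
        max_epochs=2,
        logger=TensorBoardLogger("tb_logs", name=name),
    )
    model = LitModel()
    trainer.fit(model, loader, loader)  # train + val
    trainer.test(model, loader)         # test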
Do you have any hint on how to properly run such a loop in one script with completely separated tasks?
Edit: The ClearML version was 0.17.4. The issue is fixed in the main branch.
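For anyone stuck on 0.17.4, one possible workaround (an untested sketch, not an official ClearML recommendation) is to run each training in its own process, so every Task.init starts from a completely fresh interpreter state:

import multiprocessing as mp

def run_one(name):
    from clearml import Task  # import inside the child process
    task = Task.init(project_name="NetworkProject", task_name=name)
    pl_train(name)  # assumes pl_train is defined at module level
    task.close()

if __name__ == "__main__":
    mp.set_start_method("spawn")  # fresh interpreter per child
    for name in ['networkA', 'networkB']:
        p = mp.Process(target=run_one, args=(name,))
        p.start()
        p.join()  # run sequentially, one task per process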