File exists: '/opt/ml/input/data/log_dir/story-visualization-0519-v2-768x480-20x12-lr5e-05'
Traceback (most recent call last):
  File "/opt/ml/code/deepspeed_tools/abstract_trainer_deepspeed.py", line 249, in <module>
    model_engine.save_checkpoint(config.model_dir, epoch, client_state=client_sd)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 2717, in save_checkpoint
    os.makedirs(save_dir, exist_ok=True)
  File "/opt/conda/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
I thought that passing exist_ok=True to os.makedirs meant it would never raise a "file exists" exception, but I still get this error from os.makedirs(save_dir, exist_ok=True). Any suggestions?
I'm not sure whether this happens because my code is running in multiple processes.
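For reference, here is a minimal sketch of one case where os.makedirs still raises even with exist_ok=True: the target path already exists but is not a directory (for example a regular file, or a symlink whose target is missing). Whether that is what happens in my job is only a guess, since the traceback does not confirm it. The helper names and the "checkpoint" file name are hypothetical and used only for illustration.

```python
import os
import tempfile


def makedirs_on_existing_file():
    """Call os.makedirs(..., exist_ok=True) on a path that is a regular file.

    Returns the name of the exception type raised, or None if nothing is raised.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical path standing in for save_dir.
        target = os.path.join(tmp, "checkpoint")
        # Create a regular file where a directory is expected.
        with open(target, "w") as f:
            f.write("")
        try:
            os.makedirs(target, exist_ok=True)
        except FileExistsError as exc:
            # exist_ok=True only suppresses the error when the existing
            # path is a directory, so this branch is taken here.
            return type(exc).__name__
    return None


def describe_path(path):
    """Classify a path to help diagnose why makedirs might fail on it."""
    if os.path.islink(path) and not os.path.exists(path):
        return "broken symlink"
    if os.path.isdir(path):
        return "directory"
    if os.path.exists(path):
        return "non-directory"
    return "missing"


if __name__ == "__main__":
    print(makedirs_on_existing_file())
    print(describe_path(tempfile.gettempdir()))
```

Running describe_path on the path reported in the error message, from inside the training container, should show whether it is a real directory, a file, or a dangling symlink.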