I have a problem: I get the error RuntimeError: CUDA out of memory.

I have already gone through the following solutions: RuntimeError: CUDA out of memory. How setting max_split_size_mb?, Pytorch RuntimeError: CUDA out of memory with a huge amount of free memory, and How to solve RuntimeError: CUDA out of memory?. However, none of them helped. How can I solve this error?
I also tried
!export 'PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:4000'
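As far as I understand, !export in a notebook only runs in a subshell, so I am not sure it even reaches the Python process. I assume the in-Python equivalent would be something like this sketch, set before torch touches the GPU (the 4000 is just the value I tried above):

import os
# PYTORCH_CUDA_ALLOC_CONF is read when the CUDA caching allocator initializes,
# so it has to be set before the first tensor is placed on the GPU
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4000"
import torch  # imported only afterwards, to be safe

Either way, the traceback below says it tried to allocate 35.60 GiB while only 30.62 GiB were free, so I suspect the single allocation is simply too big and max_split_size_mb alone cannot fix it.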
Is there an option to somehow split this implementation so that it fits into memory? (I put a rough sketch of what I mean by "splitting" at the end of this post, after the traceback.) Here is the loop that fails:
import torch
from torch import autocast
from tqdm import tqdm

with autocast(config.DEVICE):
    for i, t in tqdm(enumerate(scheduler.timesteps)):
        # duplicate the latents for classifier-free guidance (unconditional + text)
        latent_model_input = torch.cat([latents] * 2)
        sigma = scheduler.sigmas[i]
        latent_model_input = latent_model_input / ((sigma**2 + 1) ** 0.5)

        with torch.no_grad():
            # this unet call is where the OOM below is raised
            noise_pred = unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample

        noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
        noise_pred = noise_pred_uncond + config.GUIDANCE_SCALE * (noise_pred_text - noise_pred_uncond)
        latents = scheduler.step(noise_pred, i, latents).prev_sample
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-50-c064e16dfce6> in <module>
7
8 with torch.no_grad():
----> 9 noise_pred = unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
10
11 noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
/usr/local/lib/python3.7/dist-packages/torch/functional.py in einsum(*args)
358 return einsum(equation, *_operands)
359
--> 360 return _VF.einsum(equation, operands) # type: ignore[attr-defined]
361
362
RuntimeError: CUDA out of memory. Tried to allocate 35.60 GiB (GPU 0; 39.59 GiB total capacity; 7.00 GiB already allocated; 30.62 GiB free; 7.24 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
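To clarify what I mean by "splitting": I was thinking of something along these lines, i.e. feeding the unconditional and text-conditioned halves of the classifier-free guidance batch through the unet one at a time instead of as a batch of 2. This is only a sketch of the idea, using the same names as in the loop above and assuming text_embeddings is the usual torch.cat([uncond_embeddings, prompt_embeddings]) from the text-encoding step:

# sketch of the body of the loop above, with the guidance batch split in two
uncond_emb, cond_emb = text_embeddings.chunk(2)
scaled_latents = latents / ((sigma**2 + 1) ** 0.5)
with torch.no_grad():
    # only one half's activations are alive at any point in time
    noise_pred_uncond = unet(scaled_latents, t, encoder_hidden_states=uncond_emb).sample
    noise_pred_text = unet(scaled_latents, t, encoder_hidden_states=cond_emb).sample
noise_pred = noise_pred_uncond + config.GUIDANCE_SCALE * (noise_pred_text - noise_pred_uncond)
latents = scheduler.step(noise_pred, i, latents).prev_sample

Would splitting it like that actually bring down the peak allocation of the einsum in the attention layers, or is there a better way to chunk this?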