I am trying to follow along with Jeremy Howard's fast.ai course. I am on Part 2 of the course, Lecture 9A: https://www.youtube.com/watch?v=0_BBRNYInx8&t=0s
The code I am trying to run is below (it's the code in the second cell of this notebook): https://github.com/fastai/diffusion-nbs/blob/master/Stable%20Diffusion%20Deep%20Dive.ipynb
The imports:
from base64 import b64encode
import numpy
import torch
from diffusers import AutoencoderKL, LMSDiscreteScheduler, UNet2DConditionModel
from huggingface_hub import notebook_login
# For video display:
from IPython.display import HTML
from matplotlib import pyplot as plt
from PIL import Image
from torch import autocast
from torchvision import transforms as tfms
from tqdm.auto import tqdm
from transformers import CLIPTextModel, CLIPTokenizer, logging
torch.manual_seed(1)
# if not (Path.home()/'.huggingface'/'token').exists(): notebook_login()
# Suppress some unnecessary warnings when loading the CLIPTextModel
logging.set_verbosity_error()
# Set device
print(torch.cuda.is_available())
torch_device = "cuda" if torch.cuda.is_available() else "cpu"
The actual code:
# Load the autoencoder model which will be used to decode the latents into image space.
vae = AutoencoderKL.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="vae")
# Load the tokenizer and text encoder to tokenize and encode the text.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
# The UNet model for generating the latents.
unet = UNet2DConditionModel.from_pretrained("CompVis/stable-diffusion-v1-4", subfolder="unet")
# The noise scheduler
scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
# To the GPU we go!
vae = vae.to(torch_device)
text_encoder = text_encoder.to(torch_device)
unet = unet.to(torch_device);
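For context, the rough fp32 footprint of each model can be estimated from its parameter count (my own sanity check, not part of the notebook; it assumes roughly 4 bytes per fp32 parameter):

# Rough size estimate: ~4 bytes per fp32 parameter (estimate only)
for name, model in [("vae", vae), ("text_encoder", text_encoder), ("unet", unet)]:
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M params ~ {n_params * 4 / 1e9:.2f} GB in fp32")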
The Problem: CUDA out of memory error
On the last line, unet = unet.to(torch_device), I encounter the CUDA out of memory error.
I have a 2 GB Nvidia graphics card (I know this is too little memory, so I tried a few things to fix it).
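To confirm how little memory is actually available, something like this can be used (assuming a single CUDA device at index 0):

# Report total and currently allocated GPU memory (single-device assumption)
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1e9:.2f} GB total")
print(f"currently allocated: {torch.cuda.memory_allocated(0) / 1e9:.2f} GB")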
The solution I tried (which did not work):
In the code below, I used set_attention_slice and moved vae and text_encoder to "cpu", so that only the UNet goes to the GPU.
## To prevent CUDA out of memory. Change 1024 to any number
slice_size = unet.config.attention_head_dim // 1024
unet.set_attention_slice(slice_size)
# To the GPU we go!
vae = vae.to("cpu")  # was: vae.to(torch_device)
text_encoder = text_encoder.to("cpu")  # was: text_encoder.to(torch_device)
unet = unet.to(torch_device).to_fp16  # .to_fp16 is my own addition
This did not work.
Is there anything else I can try?
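For reference, this is roughly how I think the half-precision move would be written with standard PyTorch calls, since to_fp16 does not seem to be a module method; .half() (or passing torch_dtype=torch.float16 to from_pretrained) is what I have seen used. I am not sure whether even this fits in 2 GB:

# Sketch only: convert the UNet to fp16 before moving it to the GPU
unet = unet.half().to(torch_device)
# Keep the VAE and text encoder on the CPU, as in my attempt above
vae = vae.to("cpu")
text_encoder = text_encoder.to("cpu")
# Free any cached blocks left over from earlier failed attempts
torch.cuda.empty_cache()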