I have basic question about about model weights and checkpoints.
When training a model, each layer in the model graph calls kernel executed on the GPU. These weights remain on the GPU for forward pass and backward pass. Once the weights are updated during backward pass, where are all the updated weights stored. Are they moved back to CPU memory? when does this move happen ?
when checkpointing is done, do we get weights from CPU memory ?
Can someone explain the whole execution flow ?