
In my training loop, I load a batch of data onto the CPU and then transfer it to the GPU:

from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True,
                          num_workers=4, pin_memory=True)

for inputs, labels in train_loader:
    # Host-to-device copy repeated for every batch
    inputs, labels = inputs.to(device), labels.to(device)

This way of loading data is very time-consuming. Is there any way to load the data directly onto the GPU and skip the per-batch transfer step?

Khiem Le

3 Answers


@PeterJulian first of all, thanks for the reply. As far as I know, there is no single-line command for loading a whole dataset onto the GPU. Actually, in my reply I meant using `.to(device)` in the `__init__` of the dataset. There are some examples in the link that I shared previously, and I have left an example dataset below. Hope both the examples in the link and the code below help.

import torch
from torch.utils.data import Dataset

class SampleDataset(Dataset):
    def __init__(self, device='cuda'):
        super(SampleDataset, self).__init__()
        self.data = torch.ones(1000)
        # One-time copy: the whole dataset lives in GPU memory
        self.data = self.data.to(device)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        # Elements returned here are already on the GPU
        return self.data[i]
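
A hypothetical usage sketch: wrap the dataset in a `DataLoader` with `num_workers=0`, since CUDA tensors cannot be shared with forked worker processes (and `pin_memory` only applies to CPU tensors):

from torch.utils.data import DataLoader

# num_workers must stay 0: CUDA tensors cannot be moved across
# forked DataLoader worker processes
loader = DataLoader(SampleDataset(), batch_size=128, shuffle=True,
                    num_workers=0)

for batch in loader:
    # batch is already on the GPU; no per-batch .to(device) needed
    pass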
yilmazdoga

You can load all the data into one tensor and then move it to GPU memory (assuming you have enough memory). When you need a batch, take it from the tensor that is already in GPU memory. Hope it helps.
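
For example, a minimal sketch (the tensors, shapes, and names here are hypothetical stand-ins for your actual dataset):

import torch

device = torch.device('cuda')

# Hypothetical stand-ins: stack the whole dataset into two tensors
train_x = torch.randn(50000, 3, 32, 32)   # all inputs
train_y = torch.randint(0, 10, (50000,))  # all labels

# One host-to-device copy instead of one per batch
train_x, train_y = train_x.to(device), train_y.to(device)

# Slice mini-batches directly from GPU memory
batch_size = 128
for i in range(0, train_x.size(0), batch_size):
    inputs = train_x[i:i + batch_size]
    labels = train_y[i:i + batch_size]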

yilmazdoga
  • @PeterJulian After preparing the tensor which contains your data, you can move it to the GPU using `your_data = your_data.to(device)`. You can find some examples and details [here](https://discuss.pytorch.org/t/load-entire-dataset-on-gpu/79165). – yilmazdoga Aug 27 '21 at 07:17
  • Thanks, I know that you can load tensors to the device with that. I meant: is there any command to load the whole dataset to the GPU so that you do not have to call `to(device)` on every batch? I am not sure how expensive that is, but it is always a CPU-to-GPU copy, and it may be noticeable, especially on smaller networks/datasets. – Peter Julian Aug 29 '21 at 10:44

You can't just load the training data directly into a tensor for many deep learning problems. You often need the DataLoader to generate freshly augmented data while training is ongoing, using the otherwise free worker processes.

That is the advantage of the DataLoader. By setting your transform arguments strategically, your training data differs slightly every epoch, which assists with regularization; see the sketch below.
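
For instance, a minimal sketch (assuming a torchvision image dataset; the specific transforms are illustrative):

import torchvision.transforms as T
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader

# Random transforms yield a slightly different view of each image
# every epoch, which acts as a regularizer
train_transform = T.Compose([
    T.RandomHorizontalFlip(),
    T.RandomCrop(32, padding=4),
    T.ToTensor(),
])

train_dataset = CIFAR10(root='./data', train=True, download=True,
                        transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True,
                          num_workers=4, pin_memory=True)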

Of course, if comprehensive data augmentation during training isn't needed and you have enough memory to load the entire static training set as one tensor object, then that would definitely fix your problem: just call `tensor_object.to(device)` once, but you lose the benefits of DataLoaders.

gway