I am a little confused about offline vs. online data augmentation, or more precisely, about how each one is implemented.

I have already read these two posts:

Online vs Offline data augmentation

Data Augmentation in PyTorch

My dummy implementation looks like this:

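from pathlib import Path

import torch
import torchvision.transforms as T
from IPython.display import display  # assumption: display() is IPython's (built in inside Jupyter)
from PIL import Image


class Dataset(torch.utils.data.Dataset):
    def __init__(self, data_name, transform):
        self.data_name = data_name  # list of image file paths
        self.transform = transform  # callable applied on every access

    def __len__(self):
        # One sample per file path
        return len(self.data_name)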

    def __getitem__(self, index):

        data = self.data_name[index]
        img = Image.open(Path(data))

        if self.transform:
            img = self.transform(img)

        return img

my_transformations = T.Compose([
    T.RandomGrayscale(0.5),
    T.RandomCrop(150),
    T.Resize(400)
])

# Raw strings so the backslashes are not treated as escape sequences
names = [r"test\dog2.png", r"test\dog3.png"]
dataset = Dataset(names, my_transformations)

# Display the augmented images, just for my own testing

for img in dataset:
    display(img)

The way I see it, this is online augmentation: I am not adding anything to the dataset stored on my disk, but rather loading each picture at runtime, augmenting it, and returning it from the dataset. I am not 100% sure of this, though, since I have the original pictures on disk and now the same number of pictures, but augmented, in the Dataset.
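To make that reading concrete, here is a minimal, library-free sketch of what I mean by "augment at runtime". It is not my real code: images are plain lists of pixel rows, and `random_flip` and `OnlineDataset` are hypothetical names standing in for a random torchvision transform and my Dataset class. The point is only that the transform runs inside `__getitem__`, so the dataset never grows and the same index can return a different variant each time.

```python
import random


def random_flip(img, rng, p=0.5):
    """Toy random transform: flip every row left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img


class OnlineDataset:
    """Stores only the originals; the transform is applied on every access."""

    def __init__(self, images, transform, seed=0):
        self.images = images
        self.transform = transform
        self.rng = random.Random(seed)  # seeded so the sketch is reproducible

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # Augmentation happens here, at access time; nothing is saved
        return self.transform(self.images[index], self.rng)


originals = [[[1, 2, 3]], [[4, 5, 6]]]  # two 1x3 "images"
online = OnlineDataset(originals, random_flip)

# Request the first image many times, as would happen over many epochs,
# and collect the distinct versions of its first row that come back
seen = {tuple(online[0][0]) for _ in range(50)}
```

Under these assumptions the dataset length stays at 2, while `seen` ends up holding both the original and the flipped row, which is what I would expect from online augmentation.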

Then I came across this picture, which confuses me, because according to it I am doing offline augmentation. So if I am actually doing offline augmentation here, how would online augmentation be implemented?

Source: https://www.analyticsvidhya.com/blog/2021/06/offline-data-augmentation-for-multiple-images/
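For contrast, this is how I currently understand offline augmentation, again as a hedged, library-free sketch rather than anything taken from the linked article. The augmented copies are generated once, written to storage, and the dataset then just reads those fixed files. The names `augment_offline` and `OfflineDataset`, the JSON file format, and the temporary output directory are all assumptions made for illustration.

```python
import json
import random
import tempfile
from pathlib import Path


def random_flip(img, rng, p=0.5):
    """Toy random transform: flip every row left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return img


def augment_offline(images, out_dir, copies_per_image, seed=0):
    """Write each original plus its augmented copies to out_dir, once, up front."""
    rng = random.Random(seed)
    out_dir = Path(out_dir)
    paths = []
    for i, img in enumerate(images):
        variants = [img] + [random_flip(img, rng) for _ in range(copies_per_image)]
        for j, variant in enumerate(variants):
            path = out_dir / f"img{i}_{j}.json"
            path.write_text(json.dumps(variant))
            paths.append(path)
    return paths


class OfflineDataset:
    """Reads pre-generated files; no transform is applied at access time."""

    def __init__(self, paths):
        self.paths = list(paths)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        return json.loads(self.paths[index].read_text())


originals = [[[1, 2, 3]], [[4, 5, 6]]]  # two 1x3 "images"
out_dir = tempfile.mkdtemp()            # stand-in for a folder on my disk
paths = augment_offline(originals, out_dir, copies_per_image=2)
offline = OfflineDataset(paths)
```

Here the stored dataset grows from 2 to 6 items, and asking for the same index twice always returns the same image. That is the difference I think the two terms describe, but I would like confirmation.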
