So I am a little confused about offline vs. online data augmentation, or more precisely, about how each one is implemented.
I already checked these two posts:
Online vs Offline data augmentation
My dummy implementation looks like this:
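```python
from pathlib import Path

import torch
import torchvision.transforms as T
from IPython.display import display  # assuming a Jupyter/IPython environment
from PIL import Image


class Dataset(torch.utils.data.Dataset):
    def __init__(self, data_name, transform):
        self.data_name = data_name  # list of image file paths
        self.transform = transform

    def __len__(self):
        # One sample per file path (was hard-coded to 1)
        return len(self.data_name)

    def __getitem__(self, index):
        data = self.data_name[index]
        img = Image.open(Path(data))
        # Random transforms are re-sampled every time an item is accessed
        if self.transform:
            img = self.transform(img)
        return img


my_transformations = T.Compose([
    T.RandomGrayscale(0.5),
    T.RandomCrop(150),
    T.Resize(400),
])

# Raw strings so the backslashes are not treated as escape sequences
names = [r"test\dog2.png", r"test\dog3.png"]
dataset = Dataset(names, my_transformations)

# Display just for me to see for testing issues
for img in dataset:
    display(img)
```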
The way I see it, this is online augmentation, since I am not adding to the dataset stored on my disk, but rather take each picture at runtime, augment it, and feed it into the dataset. (I am not 100% sure of this, since I still have the original pictures on disk and now the same number of pictures, but augmented, in the Dataset.)
Then I came across this picture, which confuses me, because according to it I am doing offline augmentation. So if I am actually doing offline augmentation here, how would online augmentation be implemented?
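For contrast, this is how I would picture offline augmentation, as a rough sketch and not something taken from either post: the augmented copies are written to disk once, before training, so the stored dataset actually grows. The function name `augment_to_disk`, the `copies_per_image` parameter, and the output naming scheme are all hypothetical choices of mine.

```python
from pathlib import Path

from PIL import Image


def augment_to_disk(paths, transform, out_dir, copies_per_image=3):
    """Save `copies_per_image` augmented versions of each image as new files.

    `transform` is any callable that maps a PIL image to a PIL image,
    for example a torchvision `Compose` of PIL-compatible transforms.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for path in map(Path, paths):
        with Image.open(path) as img:
            img = img.convert("RGB")  # load pixel data before the file closes
            for i in range(copies_per_image):
                # Hypothetical naming scheme: <original stem>_aug<i>.png
                out_path = out_dir / f"{path.stem}_aug{i}.png"
                transform(img).save(out_path)
                written.append(out_path)
    return written
```

With my code above this would be called as something like `augment_to_disk(names, my_transformations, "test_augmented")`, where `"test_augmented"` is just a placeholder directory. Training would then read the original and the saved files without any random transform.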
Source: https://www.analyticsvidhya.com/blog/2021/06/offline-data-augmentation-for-multiple-images/