
I have a situation where I need to use ImageFolder with the albumentations library to do the augmentations in PyTorch - a custom dataloader is not an option.

To this end, I am stumped: I am not able to get ImageFolder to work with albumentations. I have tried something along these lines:

import numpy as np
import albumentations as A
from torchvision import datasets

class Transforms:
    def __init__(self, transforms: A.Compose):
        self.transforms = transforms

    def __call__(self, img, *args, **kwargs):
        # albumentations works on numpy arrays, so convert the PIL image first
        return self.transforms(image=np.array(img))['image']

and then:

    trainset = datasets.ImageFolder(traindir, transform=Transforms(transforms=A.Resize(32, 32)))

where traindir is a directory with images. However, I get a weird error:

RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[1024, 32, 32, 3] to have 3 channels, but got 32 channels instead

and I can't seem to find a reproducible example that makes a simple augmentation pipeline work with ImageFolder.

UPDATE: On the recommendation of @Shai, I have now done this:

from albumentations.pytorch import ToTensorV2

class Transforms:
    def __init__(self):
        self.transforms = A.Compose([A.Resize(224, 224), ToTensorV2()])

    def __call__(self, img, *args, **kwargs):
        return self.transforms(image=np.array(img))['image']

trainset = datasets.ImageFolder(traindir, transform=Transforms())

but this throws:

    self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.FloatTensor) should be the same
  • it seems like you are missing the final `ToTensor()` transform that permutes the dimensions from `h`x`w`x`c` to `c`x`h`x`w` – Shai Sep 12 '21 at 11:42
  • hmm I am not sure how I can pass the `ToTensor()` to this :( if I pass a list I get an error saying list is not callable. :( – AJW Sep 12 '21 at 11:57
  • you should "compose" the `ToTensor()` after the resize transformation. – Shai Sep 12 '21 at 12:02
  • @Shai: I have done as you suggested, but it throws another error :( I have posted it as an update to the question. – AJW Sep 12 '21 at 12:08

2 Answers


You need to use the ToTensorV2 transformation as the final one:

trainset = datasets.ImageFolder(traindir, transform=Transforms(transforms=A.Compose([A.Resize(32, 32), ToTensorV2()])))
  • With this ^ approach I get the same error as I posted in the update: `RuntimeError: Input type (torch.cuda.ByteTensor) and weight type (torch.cuda.FloatTensor) should be the same` – AJW Sep 12 '21 at 12:12
  • @AJW we got the dimensions right, now we need the data type. How about `A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))` before `ToTensorV2`? – Shai Sep 12 '21 at 12:17
  • I have one follow-up question: when albumentations is used via ImageFolder, is it using PIL or OpenCV to load the image? I ask because OpenCV uses BGR and PIL uses RGB... and I was wondering if I need to do anything about this. But since it is ImageFolder, I assume it is PIL? i.e. I just want to be sure that albumentations is not using OpenCV under the hood somehow (I don't think so, but worth checking with you I thought!) – AJW Sep 12 '21 at 12:36
  • @AJW AFAIK, using `ImageFolder` invokes the default `image_loader` (which you can change). Only after the image is loaded is it passed to the `transformations`. So, if you haven't changed the default loader - you are using `PIL.Image` and RGB images. – Shai Sep 12 '21 at 12:43
  • thank you EVER so much for all your explanations. – AJW Sep 12 '21 at 13:53
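
Putting together the resize, the normalization suggested in the comments above, and ToTensorV2, a minimal sketch of the full pipeline could look like this (the mean/std values are the standard ImageNet statistics and are an assumption here; use whatever statistics your model expects). It reuses the Transforms wrapper and traindir from the question:

import numpy as np
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchvision import datasets

train_transforms = A.Compose([
    A.Resize(32, 32),
    # Normalize converts the uint8 image to float and scales it, which also
    # resolves the ByteTensor/FloatTensor mismatch from the question's update
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),  # assumed ImageNet stats
    ToTensorV2(),  # HxWxC numpy array -> CxHxW torch tensor
])

trainset = datasets.ImageFolder(traindir, transform=Transforms(transforms=train_transforms))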

By looking into the ImageFolder implementation in PyTorch [link] and some proposed work on Kaggle [link], I propose the following solution (which I have successfully tested on my side):

import numpy as np
from typing import Any, Callable, Optional, Tuple
from torchvision.datasets.folder import DatasetFolder, default_loader, IMG_EXTENSIONS

class CustomImageFolder(DatasetFolder):
    def __init__(
        self,
        root: str,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None,
        loader: Callable[[str], Any] = default_loader,
        is_valid_file: Optional[Callable[[str], bool]] = None,
    ):
        super().__init__(
            root,
            loader,
            IMG_EXTENSIONS if is_valid_file is None else None,
            transform=transform,
            target_transform=target_transform,
            is_valid_file=is_valid_file,
        )
        self.imgs = self.samples

    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        """
        Args:
            index (int): Index

        Returns:
            tuple: (sample, target) where target is class_index of the target class.
        """
        path, target = self.samples[index]
        sample = self.loader(path)
        if self.transform is not None:
            try:
                # torchvision-style transforms take the image as a positional argument
                sample = self.transform(sample)
            except Exception:
                # albumentations-style transforms take a named `image` argument
                sample = self.transform(image=np.array(sample))["image"]
        if self.target_transform is not None:
            target = self.target_transform(target)

        return sample, target

    def __len__(self) -> int:
        return len(self.samples)

Now you can run the code as follows:

trainset = CustomImageFolder(traindir,transform=Transforms(transforms=A.Resize(32 , 32)))
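
For completeness, a small usage sketch (batch size and worker count are arbitrary placeholders, and traindir is the same image directory as in the question) showing the custom dataset feeding a standard DataLoader:

import albumentations as A
from albumentations.pytorch import ToTensorV2
from torch.utils.data import DataLoader

# same Transforms wrapper as in the question
trainset = CustomImageFolder(
    traindir,
    transform=Transforms(transforms=A.Compose([A.Resize(32, 32), ToTensorV2()])),
)
trainloader = DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)

images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([64, 3, 32, 32]) for RGB images

Add A.Normalize before ToTensorV2, as discussed above, if the model expects normalized float inputs.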