Image augmentation on deep learning training data

Question

I have a question about mean and standard deviation in image augmentation.

Is it recommended to set these two parameters?

If so, how do I find the right values? Do I have to iterate over the whole dataset, and over each channel of every image, before training to compute them?

import albumentations as A
from albumentations.pytorch import ToTensorV2  # required for ToTensorV2() below

# IMAGE_HEIGHT and IMAGE_WIDTH are defined elsewhere in the training script.
train_transform = A.Compose(
    [
        A.Resize(height=IMAGE_HEIGHT, width=IMAGE_WIDTH),
        A.ColorJitter(brightness=0.3, hue=0.3, p=0.3),
        A.Rotate(limit=5, p=1.0),
        # A.HorizontalFlip(p=0.3),
        # A.VerticalFlip(p=0.2),
        A.Normalize(
            mean=[0.0, 0.0, 0.0],  # <----------- this parameter
            std=[1.0, 1.0, 1.0],  # <----------- this parameter
            max_pixel_value=255.0,
        ),
        ToTensorV2(),
    ],
)

Hi @Amnesie, [SO's how to ask page](https://stackoverflow.com/help/how-to-ask) has some valuable information that will help you present your question. — ndrwnaguib, Mar 14 '22 at 19:09

score 1 · Accepted Answer · answered Mar 14 '22 at 22:43

Yes, normalizing your images is strongly recommended in most cases, although some situations do not require it. The goal is to keep the input values within a known range. The output of a network, even a large one, is strongly influenced by the range of its input data. If the input range is left uncontrolled, predictions can vary drastically from one input to the next; the gradients then become erratic as well, which can make training inefficient. I invite you to read this answer and that one for more detail on the 'why' behind normalization and a deeper understanding of this behaviour.
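To make the point about input range and gradients concrete, here is a minimal sketch on a toy problem. Everything in it is hypothetical (a single linear unit, made-up weights, one made-up pixel, a squared-error loss); it only illustrates how much larger the gradient gets when the same pixel is fed in the [0, 255] range instead of [0, 1].

```python
import numpy as np

# Hypothetical toy setup: one linear unit y = w . x with a squared-error loss.
w = np.array([0.2, -0.1, 0.3])
target = 0.5


def grad_wrt_weights(x, w, target):
    """Gradient of 0.5 * (w . x - target)**2 with respect to w."""
    return (w @ x - target) * x


raw_pixel = np.array([200.0, 120.0, 30.0])  # one RGB pixel in [0, 255]
scaled_pixel = raw_pixel / 255.0            # the same pixel in [0, 1]

raw_norm = np.linalg.norm(grad_wrt_weights(raw_pixel, w, target))
scaled_norm = np.linalg.norm(grad_wrt_weights(scaled_pixel, w, target))

# The unscaled input produces a gradient several orders of magnitude larger.
print(raw_norm, scaled_norm)
```

A real network is of course more complicated than this, but the same scaling effect is one reason unnormalized inputs tend to need much more careful learning-rate tuning.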

It is quite common to normalize images with the ImageNet mean and standard deviation: mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225]. If your dataset is realistic enough, i.e. representative of what the model will see in production, you could also use its own mean and std instead of ImageNet's.
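If you go with your dataset's own statistics, a single pass over the training images is enough. The sketch below is one possible way to do it with NumPy, not a reference implementation: `channel_mean_std` is a hypothetical helper, the images are assumed to be H x W x C arrays with values in [0, 255], and the random arrays stand in for a real dataset. Pixels are divided by `max_pixel_value` first because `A.Normalize` multiplies `mean` and `std` by `max_pixel_value` internally, so it expects statistics on the [0, 1] scale.

```python
import numpy as np


def channel_mean_std(images, max_pixel_value=255.0):
    """Per-channel mean and std over an iterable of H x W x C images.

    Pixels are scaled to [0, 1] before accumulating, so the result is on the
    same scale as the ImageNet values quoted above.
    """
    pixel_count = 0
    channel_sum = None
    channel_sq_sum = None
    for image in images:
        # Flatten the spatial dimensions: one row per pixel, one column per channel.
        pixels = image.astype(np.float64).reshape(-1, image.shape[-1]) / max_pixel_value
        if channel_sum is None:
            channel_sum = np.zeros(pixels.shape[1])
            channel_sq_sum = np.zeros(pixels.shape[1])
        channel_sum += pixels.sum(axis=0)
        channel_sq_sum += (pixels ** 2).sum(axis=0)
        pixel_count += pixels.shape[0]
    mean = channel_sum / pixel_count
    # Var[x] = E[x^2] - E[x]^2; clip tiny negative values caused by rounding.
    variance = np.maximum(channel_sq_sum / pixel_count - mean ** 2, 0.0)
    return mean, np.sqrt(variance)


# Synthetic stand-in for a real training set (assumption: uint8 RGB images).
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8) for _ in range(4)]

mean, std = channel_mean_std(fake_images)
print(mean, std)
```

The resulting `mean` and `std` can then be passed to `A.Normalize(mean=..., std=..., max_pixel_value=255.0)`. Compute them on the training split only, so that no information from the validation or test images leaks into preprocessing.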

Finally, keep those values at hand: once your model is trained, you will still need to normalize every new image with the same values to get good accuracy at inference time.
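As a rough illustration of reusing the same values at inference, here is a NumPy-only sketch. The `normalize` function is a hypothetical helper that mirrors the formula documented for `A.Normalize`, `(img - mean * max_pixel_value) / (std * max_pixel_value)`; in a real pipeline you would more likely reuse the same `A.Normalize(...)` step in a validation/inference transform. The constant gray image is just a placeholder for a new input.

```python
import numpy as np

# Statistics used during training (here: the ImageNet values quoted above).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])
MAX_PIXEL_VALUE = 255.0


def normalize(image, mean=MEAN, std=STD, max_pixel_value=MAX_PIXEL_VALUE):
    """Apply (image - mean * max_pixel_value) / (std * max_pixel_value) per channel."""
    image = image.astype(np.float32)
    return (image - mean * max_pixel_value) / (std * max_pixel_value)


# Placeholder for a new image arriving at inference time (H x W x C, uint8).
new_image = np.full((4, 4, 3), 128, dtype=np.uint8)
model_input = normalize(new_image)

# With mean=0 and std=1, as in the question, the step only rescales to [0, 1].
rescaled_only = normalize(new_image, mean=np.zeros(3), std=np.ones(3))
print(model_input[0, 0], rescaled_only[0, 0])
```

Storing the mean and std next to the model weights (for example in the same config file) is a simple way to make sure training and inference never drift apart.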

Thank you very much. Your answer completely cleared up my confusion! I also read those two links; they discussed exactly what I was wondering about, and I got a lot out of them. One more question: if I want to calculate my own mean and std, do I have to map all pixel values to [0, 1] first? — Amnesie, Mar 14 '22 at 23:29
You're almost there: [this notebook](https://colab.research.google.com/github/kozodoi/website/blob/master/_notebooks/2021-03-08-image-mean-std.ipynb) walks through the whole procedure for getting the mean & std from your dataset. Cheers! — Maxime D., Mar 15 '22 at 08:27
