Questions tagged [data-augmentation]

Data augmentation

Data augmentation is a technique for artificially increasing the size and variety of the data used to train a model by adding modified copies of existing samples. It also helps reduce overfitting.

Some of the most common data augmentation techniques for images include:

  • Scaling
  • Cropping
  • Flipping
  • Rotation
  • Color jittering
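
As a rough illustration of two of the techniques listed above, the following sketch applies a horizontal flip and a random crop to a tiny image stored as a nested list. The helper names (`horizontal_flip`, `random_crop`) are hypothetical and only the Python standard library is assumed; real pipelines typically rely on an image library instead.

```python
import random

def horizontal_flip(image):
    """Mirror each row of a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def random_crop(image, crop_h, crop_w, rng=random):
    """Cut a crop_h x crop_w window at a random position inside the image."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed image size")
    top = rng.randint(0, height - crop_h)   # randint is inclusive on both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

# A 3x3 toy "image"; each number stands in for a pixel value.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
flipped = horizontal_flip(image)
cropped = random_crop(image, 2, 2, random.Random(0))  # seeded for repeatability
```

Both helpers return new lists, so the original image stays available alongside its augmented versions.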
465 questions
91
votes
6 answers

Data Augmentation in PyTorch

I am a little confused about how data augmentation is performed in PyTorch. As far as I know, when we perform data augmentation, we are KEEPING our original dataset and then adding other versions of it (flipping, cropping, etc.). But…
Fawaz
22
votes
1 answer

PyTorch transforms on TensorDataset

I'm using TensorDataset to create dataset from numpy arrays. # convert numpy arrays to pytorch tensors X_train = torch.stack([torch.from_numpy(np.array(i)) for i in X_train]) y_train = torch.stack([torch.from_numpy(np.array(i)) for i in y_train]) #…
kHarshit
16
votes
1 answer

How to apply data augmentation in TensorFlow 2.0 after tfds.load()

I'm following this guide. It shows how to download datasets from the new TensorFlow Datasets using tfds.load() method: import tensorflow_datasets as tfds SPLIT_WEIGHTS = (8, 1, 1) splits =…
15
votes
1 answer

What exactly does shear do in ImageDataGenerator of Keras?

I can't understand the effect of the shear parameter in Keras's ImageDataGenerator. I tried applying shear to an image with the apply_transform member function of ImageDataGenerator. The image seems to be rotated and stretched out…
Ray Xie
12
votes
2 answers

Saving model on Tensorflow 2.7.0 with data augmentation layer

I am getting an error when trying to save a model with data augmentation layers with Tensorflow version 2.7.0. Here is the code of data augmentation: input_shape_rgb = (img_height, img_width, 3) data_augmentation_rgb = tf.keras.Sequential( [ …
moumed
11
votes
4 answers

Tensorflow Data Augmentation gives a warning: Using a while_loop for converting

I use the data augmentation according to the official TensorFlow tutorial. First, I create a sequential model with augmenting layers: def _getAugmentationFunction(self): if not self.augmentation: return None pipeline = [] …
Karol Borkowski
11
votes
4 answers

How to use different data augmentation for Subsets in PyTorch

How to use different data augmentation (transforms) for different Subsets in PyTorch? For instance: train, test = torch.utils.data.random_split(dataset, [80000, 2000]) train and test will have the same transforms as dataset. How to use custom…
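
One common pattern for this situation is to wrap each subset in a small dataset class that applies its own transform on access. The sketch below is framework-agnostic and uses only plain Python: `TransformedSubset` is a hypothetical name, and the two lists merely stand in for the subsets returned by `random_split`. In PyTorch, such a wrapper would typically subclass `torch.utils.data.Dataset`.

```python
class TransformedSubset:
    """Wrap an indexable subset of (sample, label) pairs and apply a per-subset transform."""

    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __getitem__(self, index):
        sample, label = self.subset[index]
        if self.transform is not None:
            sample = self.transform(sample)  # applied lazily, each time the item is read
        return sample, label

    def __len__(self):
        return len(self.subset)

# Stand-ins for the two subsets (assumption: items are (sample, label) pairs).
train_split = [([1, 2, 3], 0), ([4, 5, 6], 1)]
test_split = [([7, 8, 9], 0)]

train = TransformedSubset(train_split, transform=lambda x: x[::-1])  # augment training samples only
test = TransformedSubset(test_split)                                 # evaluation samples stay untouched
```

Because the transform is applied inside `__getitem__`, the underlying data is shared and never copied, and each subset can carry a different pipeline.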
Fábio Perez
10
votes
1 answer

Keras iterator with augmented images and other features

Say you have a dataset that has images and some data in a .csv for each image. Your goal is to create a NN that has a convolution branch and another one (in my case an MLP). Now, there are plenty of guides (one here, another one) on how to create…
Lamberto Basti
8
votes
2 answers

How to Create Class Label for Mosaic Augmentation in Image Classification?

Update This is now officially supported by keras-cv. To create a class label in CutMix or MixUp type augmentation, we can use beta such as np.random.beta or scipy.stats.beta and do as follows for two labels: label = label_one*beta +…
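
The excerpt above is cut off, so the following is only a sketch of the standard MixUp label combination, not necessarily the author's exact code. It assumes one-hot labels stored as plain lists and draws the mixing weight with `random.betavariate` from the Python standard library, which plays the role of `np.random.beta` here; `mix_labels`, `lam`, and `alpha` are hypothetical names.

```python
import random

def mix_labels(label_one, label_two, lam):
    """Convex combination of two label vectors with weight lam in [0, 1]."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_one, label_two)]

rng = random.Random(0)               # seeded so repeated runs draw the same weight
alpha = 0.2                          # assumed Beta(alpha, alpha) shape parameter
lam = rng.betavariate(alpha, alpha)  # mixing weight between 0 and 1

label_one = [1.0, 0.0, 0.0]          # one-hot label of the first image
label_two = [0.0, 1.0, 0.0]          # one-hot label of the second image
mixed = mix_labels(label_one, label_two, lam)
```

The mixed label still sums to 1, and the same weight would normally be used to blend the two images as well.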
Innat
7
votes
1 answer

Tensorflow object detection api: how to use imgaug for augmentation?

I've been hand-rolling augmenters using imgaug, as I really like some of the options that are not available in the tf object detection api. For instance, I use motion blur because so much of my data has fast-moving, blurry objects. How can I best…
eric
7
votes
1 answer

How to rotate images at different angles randomly in tensorflow

I know that I can rotate images in tensorflow using tf.contrib.image.rotate. But suppose I want to apply the rotation randomly at an angle between -0.3 and 0.3 in radians as follows: images = tf.contrib.image.rotate(images,…
I. A
6
votes
3 answers

Randomly Generate Synthetic Noise in an Image Text Document

I'm working on denoising dirty document images. I want to create a dataset in which synthetic noise is added to simulate real-world, messy artifacts. Simulated dirt may include coffee stains, faded sun spots, dog-eared pages, lots of wrinkles and…
6
votes
1 answer

TensorFlow Object Detection API: specifying multiple data_augmentation_options

I'm wondering if there's any difference between specifying the data augmentations like this: data_augmentation_options { random_horizontal_flip { } } data_augmentation_options { ssd_random_crop { } } Or like this: data_augmentation_options…
jvlier
6
votes
1 answer

Python Google Translate API error : How to translate a large amount of data

My problem: I would like to use a data augmentation method for NLP that consists of back-translating a dataset. Basically, I have a large dataset (SNLI) consisting of 1,100,000 English sentences. What I need to do is: translate these sentences…
Astariul
5
votes
4 answers

What is the most efficient way to read and augment (copy samples and change some values) large dataset in .csv

Currently, I have managed to solve this, but it is slower than I need. It takes approximately 1 hour for 500k samples; the entire dataset is ~100M samples, which would require ~200 hours. Hardware/software specs: RAM 8 GB, Windows 11…
AKΛ