I want to perform hyperparameter optimization on my Keras model. The problem is that the dataset is quite big: normally during training I use fit_generator to load the data in batches from disk, but common packages like scikit-learn's GridSearchCV, Talos, etc. only support the fit method.
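For reference, my usual training setup looks roughly like this (original_dir, the image sizes and build_model() are placeholders from my project, not shown here):

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    original_dir,                          # one subdirectory per class
    target_size=(img_height, img_width),
    batch_size=32,                         # small batches streamed from disk
    class_mode='categorical')

model = build_model()                      # placeholder: builds and compiles the Keras model
model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // 32,
    epochs=10)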
I tried to load the whole dataset into memory by using this:
# train_datagen is a keras.preprocessing.image.ImageDataGenerator defined earlier;
# batch_size=train_nb (the total number of training images) yields the whole set as one batch
train_generator = train_datagen.flow_from_directory(
    original_dir,
    target_size=(img_height, img_width),
    batch_size=train_nb,
    class_mode='categorical')
X_train, y_train = train_generator.next()
But when performing the grid search, the OS kills the process because of excessive memory usage. I also tried undersampling my dataset to only 25%, but it is still too big to fit in memory.
Does anyone have experience with the same scenario? Can you please share your strategy for performing hyperparameter optimization on a large dataset?
Following the answer from @dennis-ec, I tried the SkOpt tutorial here: http://slashtutorial.com/ai/tensorflow/19_hyper-parameters/ and it was a very comprehensive tutorial.
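A rough sketch of how I adapted that approach to my generator-based setup looks like this. build_model(), train_generator, val_generator, steps_per_epoch and validation_steps are placeholders from my project, and I'm assuming scikit-optimize's gp_minimize API as used in the tutorial:

from skopt import gp_minimize
from skopt.space import Real, Integer
from skopt.utils import use_named_args

dimensions = [
    Real(1e-5, 1e-2, prior='log-uniform', name='learning_rate'),
    Integer(16, 256, name='dense_units'),
]

@use_named_args(dimensions=dimensions)
def objective(learning_rate, dense_units):
    # build_model is a placeholder that compiles a fresh Keras model
    # with the given hyperparameters
    model = build_model(learning_rate=learning_rate, dense_units=dense_units)
    history = model.fit_generator(
        train_generator,                   # data streamed from disk in batches
        steps_per_epoch=steps_per_epoch,
        epochs=3,
        validation_data=val_generator,
        validation_steps=validation_steps)
    # gp_minimize minimizes, so return the last validation loss
    return history.history['val_loss'][-1]

search_result = gp_minimize(
    func=objective,
    dimensions=dimensions,
    n_calls=20)                            # number of hyperparameter settings to try
print(search_result.x)                     # best hyperparameters found

This keeps the data on disk during the search, since each evaluation still trains with fit_generator instead of requiring the whole dataset in memory.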