I'm trying to build an image classifier on a dataset of 40,000 images and then let AutoKeras train the most appropriate model for me. Loading all the images and extracting their labels works, but when I run the normalization and split steps on Google Colab, the session runs out of RAM (even though I have a Pro+ account). Here is my code:
# Import TensorFlow and the other libraries used below
%tensorflow_version 2.x
import tensorflow as tf
import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import normalize, to_categorical
!pip install autokeras
import autokeras as ak
import glob
import os
import random
import cv2
import numpy as np

images = glob.glob(path + '/*.png')  # path points to the folder containing the PNG images
data = []
labels = []
for i in images:
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb')
    image = np.array(image, dtype='float32')
    image = cv2.resize(image, (180, 180))
    image /= 255.0
    data.append(image)
    # the label is the first whitespace-separated token of the file name (without '.png')
    label = os.path.basename(str(i.replace('.png', '')))
    label = label.split()[0]
    labels.append(label)
data = np.array(data)
labels = np.array(labels)
print(labels)
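For context, here is a rough back-of-the-envelope estimate of what this array costs in RAM (the image count of ~40,000 is an assumption from the dataset description; the rest follows from the code above):

# rough memory estimate for one float32 copy of the data
# (assuming ~40,000 images of 180x180x3 at 4 bytes per value)
n_images = 40_000
bytes_per_image = 180 * 180 * 3 * 4            # height * width * channels * float32
print(n_images * bytes_per_image / 1024**3)    # roughly 14.5 GB per copy

So a single copy is already around 14.5 GB, and the Python list, np.array(data), and the later train_test_split each mean additional copies coexist in memory for a while.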
Up to here everything works like a charm, but then I hit the part that triggers the overflow:
# normalize features and encode labels
X = data
y = np.zeros(labels.shape)
indices = np.unique(labels)
for i in range(labels.shape[0]):
    y[i] = np.where(labels[i] == indices)[0]
y = to_categorical(y)
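(As an aside, I believe the same encoding could be done in one vectorized call, although I don't think this is where the memory problem lies:)

# equivalent label encoding via np.unique; just an aside, not the memory issue
_, y_int = np.unique(labels, return_inverse=True)
y = to_categorical(y_int)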
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # this line seems to be where Colab runs out of RAM
clf= ak.ImageClassifier(overwrite=True, max_trials=20)
clf.fit(X_train, y_train, epochs=10)  # this line also seems to exhaust Colab's RAM
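One workaround I am considering (an untested sketch; image_paths and label_ids are placeholders I would build from the glob and label code above) is to decode and resize the images lazily with tf.data instead of holding everything in RAM, since I believe AutoKeras' fit can also take a tf.data.Dataset:

# untested sketch: decode/resize images on the fly instead of keeping all
# 40,000 float32 arrays in memory; image_paths and label_ids are placeholders
def load_and_preprocess(path, label):
    img = tf.io.read_file(path)
    img = tf.image.decode_png(img, channels=3)
    img = tf.image.resize(img, (180, 180)) / 255.0
    return img, label

train_ds = tf.data.Dataset.from_tensor_slices((image_paths, label_ids))
train_ds = train_ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE).batch(32)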
Does anyone know what the problem might be? A hint in any direction would be highly appreciated, as this is driving me crazy!