I have the following problem that I am working on:
- I have to create a CNN which takes a 3D image as input and outputs 4 classes (details below).
- All 4 labels must be either 0 or 1: True or False, depending on the input image.
Example of an output:
[0, 1, 0, 1]
This means that my prediction is that classes 2 and 4 are good for that image (the application is not relevant). That being said, I have a tensor of labels of shape [X, 4], where X is the number of samples (or images).
The problem I am facing right now is a huge class imbalance (e.g. for the 3rd class, almost 98% of the cases are 1s and only 2% are 0s). I have no idea how to solve this issue; I googled it for some good hours but found no answer at all. I used class weighting (from sklearn) before, but it seems that I cannot use it this time.
The problem I observed when using the class weighting is that it weights each row of the labels (i.e. 'what is the weighting of
[0, 1, 1, 0]
in the entire label matrix?'), which is obviously not desirable. I want each class to have one weight for 0s and one weight for 1s (which adds up to 8 weights). I've seen someone who tried to do this before, and I manually created a function which calculates the weights and outputs the probability of either 0 or 1 for each class (e.g. class1 weight0 and class1 weight1).
Following on, I must create a dictionary of the weights, e.g. for a single-label classification:
{0: 0.9210526315789473, 1: 1.09375}
I need this to be an argument in my model.fit() function. Obviously, I cannot create a dictionary which takes 4 different keys of 0 and 4 keys of 1. What should I do from here?
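For context, this is roughly how I obtained such a dictionary in the single-label case with sklearn (the label vector y below is made up, only to illustrate the format):
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical single-label vector, only to illustrate the dictionary format.
y = np.array([0, 1, 1, 0, 1, 1, 1, 0])

classes = np.unique(y)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y)
class_weight_dict = dict(zip(classes, weights))  # e.g. {0: 1.33..., 1: 0.8}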
My first idea was to change the numbers in the labels in the following way:
1st class: 0=False; 1=True
2nd class: 2=False; 3=True
3rd class: 4=False; 5=True
4th class: 6=False; 7=True
Basically, I just added multiples of 2 to each label, so now each row of my label matrix has elements between 0 and 7, as sketched below.
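A minimal sketch of that remapping (the labels matrix below is made up for illustration):
import numpy as np

# Hypothetical [X, 4] binary label matrix.
labels = np.array([[0, 1, 0, 1],
                   [1, 1, 1, 0]])

# Shift class k's labels by 2*k, so class k takes values {2k, 2k+1}.
remapped = labels + 2 * np.arange(labels.shape[1])
# rows of remapped now contain values in {0, ..., 7}, e.g. [0, 3, 4, 7]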
I was able to create the dictionary in the form of
{0: w0, 1: w1, 2: w2, 3: w3, ...}
which seemed like a good idea to me. Then I faced one more issue: when I fitted my model, the predictions were in the range
(0, 1)
because I was using a sigmoid activation function on the last neuron (i.e. Dense(4, activation='sigmoid')). I have never worked before with numbers which are not between 0 and 1, but it kind of made sense to me to change the activation function from sigmoid to linear. My dictionary of weights at this point looks like this:
{0: 0.8714285714285714, 1: 0.12857142857142856, 2: 0.5428571428571428, 3: 0.45714285714285713, 4: 0.02857142857142857, 5: 0.9714285714285714, 6: 0.8142857142857143, 7: 0.18571428571428572}
where again, for example, key 6 represents the weight of the 4th class being 0, key 1 represents the weight of the 1st class being 1, and so forth.
With all of this done, my model is still acting weird. The outputs are not quite what is expected (e.g. a value between 0 and 1 for the 1st class, a value between 2 and 3 for the 2nd class, and so forth). The accuracy is not stable: it varies a lot, and the validation accuracy just jumps between 0 and 1.
This is what an output looks like right now:
array([[ 0.2878278, 1.3507844, -1.563219 , 0.5500042]])
which, obviously, is totally wrong.
I will attach the code with the model and the function that I am using to compute the weights (I know it is nested and does not use any vectorisation, but it was designed for testing purposes only).
I really hope someone can help me diagnose this problem, by either predicting the right values for each class or computing the weights in a different manner.
CNN:
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.initializers import RandomNormal
from tensorflow.keras.regularizers import l2

callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

model = models.Sequential()
model.add(layers.Conv3D(16, (2,2,2), kernel_regularizer=l2(0.01), strides=(1,1,1), input_shape=images['06S'].shape))
model.add(layers.MaxPooling3D(pool_size=(2,2,2), strides=(1,1,1)))
model.add(BatchNormalization(epsilon=1e-01, momentum=0.65))
model.add(tf.keras.layers.LeakyReLU(alpha=0.8))
model.add(layers.Dropout(0.7))
model.add(layers.Conv3D(8, (2,2,2), kernel_regularizer=l2(0.01), strides=(1,1,1)))
model.add(layers.MaxPooling3D(pool_size=(2,2,2), strides=(1,1,1)))
model.add(BatchNormalization(epsilon=1e-01, momentum=0.65))
model.add(tf.keras.layers.LeakyReLU(alpha=0.8))
model.add(layers.Dropout(0.7))
model.add(layers.Conv3D(4, (2,2,2), kernel_regularizer=l2(0.01), strides=(1,1,1)))
model.add(layers.MaxPooling3D(pool_size=(2,2,2), strides=(1,1,1)))
model.add(BatchNormalization(epsilon=1e-01, momentum=0.65))
model.add(tf.keras.layers.LeakyReLU(alpha=0.8))
model.add(layers.Dropout(0.7))
model.add(layers.Conv3D(16, (3,3,3), kernel_regularizer=l2(0.01), strides=(1,1,1)))
model.add(layers.MaxPooling3D(pool_size=(3,3,3), strides=(1,1,1)))
model.add(BatchNormalization(epsilon=1e-01, momentum=0.65))
model.add(tf.keras.layers.LeakyReLU(alpha=0.8))
model.add(layers.Dropout(0.7))
model.add(layers.Flatten())  # flatten the 3D feature maps before the dense layers
model.add(layers.Dense(32, activation=None))
model.add(BatchNormalization(epsilon=1e-04, momentum=0.1))
model.add(tf.keras.layers.LeakyReLU(alpha=0.4))
model.add(layers.Dropout(0.6))
model.add(layers.Dense(16, activation=None))
model.add(BatchNormalization(epsilon=1e-04, momentum=0.1))
model.add(tf.keras.layers.LeakyReLU(alpha=0.4))
model.add(layers.Dropout(0.6))
model.add(layers.Dense(4, activation='linear'))
model.summary()
model.compile(optimizer='adam',
              loss='mse',
              metrics=['accuracy'])
Compute weights:
import numpy as np

def class_weighting(arr):
    # arr is expected to have shape (4, n_samples): one row per class
    # (PVI, FIBRO, ROTOR, ROOF). Returns the fraction of 0s and 1s per class.
    arr_np = np.array(arr)
    n_samples = arr_np.shape[1]
    for j in range(arr_np.shape[0]):
        ones = 0
        zeros = 0
        for i in range(n_samples):
            if arr_np[j][i] == 1:
                ones += 1
            else:
                zeros += 1
        if j == 0:
            PVI0, PVI1 = zeros / n_samples, ones / n_samples
        elif j == 1:
            FIBRO0, FIBRO1 = zeros / n_samples, ones / n_samples
        elif j == 2:
            ROTOR0, ROTOR1 = zeros / n_samples, ones / n_samples
        elif j == 3:
            ROOF0, ROOF1 = zeros / n_samples, ones / n_samples
    return PVI0, PVI1, FIBRO0, FIBRO1, ROTOR0, ROTOR1, ROOF0, ROOF1
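For reference, the same per-class fractions could be computed in a vectorized way; this is just a sketch, assuming arr has shape (4, n_samples) like the function above:
import numpy as np

def class_weighting_vectorized(arr):
    # Fraction of 0s and 1s in each row (class) of a (4, n_samples) array.
    arr_np = np.asarray(arr)
    ones = (arr_np == 1).mean(axis=1)   # [PVI1, FIBRO1, ROTOR1, ROOF1]
    zeros = (arr_np != 1).mean(axis=1)  # [PVI0, FIBRO0, ROTOR0, ROOF0]
    return zeros, ones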
Fitting:
PVI0, PVI1, FIBRO0, FIBRO1, ROTOR0, ROTOR1, ROOF0, ROOF1 = class_weighting(arr)

classWeight = {0: PVI0, 1: PVI1, 2: FIBRO0, 3: FIBRO1,
               4: ROTOR0, 5: ROTOR1, 6: ROOF0, 7: ROOF1}

history = model.fit(train_dataset, epochs=10, validation_data=val_dataset,
                    class_weight=classWeight)