I am trying to perform pixel classification for image segmentation using machine-learning classifiers such as SVM, RandomForest, etc.
I managed to get an acceptable result by using the grayscale and RGB values of the image and associating each pixel with its ground-truth label. To avoid posting the full code, here is how I build the feature and label arrays when using the full image:
import cv2
import numpy as np

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
feature_img = np.zeros((img.shape[0], img.shape[1], 4))  # container array: the first three channels hold the colour (BGR) values, the last holds the grayscale
feature_img[:, :, :3] = img
feature_img[:, :, 3] = img_gray
features = feature_img.reshape(feature_img.shape[0] * feature_img.shape[1], feature_img.shape[2])
gt_features = gt_img.reshape(gt_img.shape[0] * gt_img.shape[1], 1)
For an image of size 512*512, the above gives features of shape [262144, 4] and an accompanying gt_features of shape [262144, 1]. These are the X and y I pass to sklearn.svm.SVC (a rough sketch of that step is shown below). As mentioned above, this works well, but the segmentation result is very noisy. Since SVM is known to work well with higher-dimensional data, I intend to explore that by splitting the image into windows.
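For reference, the per-pixel training step looks roughly like this (the train/test split and the SVC parameters here are placeholders, not my exact settings):

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# sklearn expects a 1-D label array, hence .ravel()
X_train, X_test, y_train, y_test = train_test_split(
    features, gt_features.ravel(), test_size=0.2, random_state=42)

clf = SVC()                       # default RBF kernel; my real parameters differ
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # per-pixel accuracy on the held-out pixels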
Based on the above code, I want to split my image of size [512, 1024] into blocks of size 64*64 and use those blocks for training the SVM. Following the above format, I wrote the code below to split the image into blocks and then .reshape() it into the format required by the classifier, but it is not working as expected:
win_size = 64
feature_img = blockshaped(img_gray, win_size, win_size)
feature_label = blockshaped(gt_img, win_size, win_size)
# above returns arrays of shape [128, 64, 64]
features = feature_img.reshape(feature_img.shape[1] * feature_img.shape[2], feature_img.shape[0])
# features is of shape [4096, 128]
label_ = feature_label.reshape(feature_label.shape[0] * feature_label.shape[1] * feature_label.shape[2], 1)
# this, as expected, returns shape [524288, 1]
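To make the problem concrete, here is a quick check of the shapes at this point:

print(feature_img.shape)   # (128, 64, 64) -> 128 blocks of 64x64 pixels
print(features.shape)      # (4096, 128)
print(label_.shape)        # (524288, 1)
# the number of rows in features no longer matches the number of labels,
# so these cannot be passed to the classifier as X and y directly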
The blockshaped function is from the answer provided here: Slice 2d array into smaller 2d arrays
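For completeness, the blockshaped I am using is essentially the version from that answer, roughly:

def blockshaped(arr, nrows, ncols):
    # split a 2-D array of shape (h, w) into (h*w)/(nrows*ncols)
    # non-overlapping blocks, each of shape (nrows, ncols)
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))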
The reason I want to increase the dimensionality of my feature data is that SVM is known to work well with higher-dimensional data, and I also want to see whether a block- or patch-based approach improves the segmentation result.
How would I go about arranging my data, now broken down into windows, in a form that can be used to train a classifier?