
I am trying to perform pixel classification for image segmentation using machine-learning models such as SVM, RandomForest, etc.

I managed to get an acceptable result by using the grayscale and RGB values of the image and associating each pixel with its ground truth. To avoid posting the full code, here is how I build the feature and label arrays when using the full image:

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
feature_img = np.zeros((img.shape[0], img.shape[1], 4))     # container array: first three channels hold the RGB values, the last holds the grayscale
feature_img[:, :, :3] = img
feature_img[:, :, 3] = img_gray
features = feature_img.reshape(feature_img.shape[0] * feature_img.shape[1], feature_img.shape[2])

gt_features = gt_img.reshape(gt_img.shape[0] * gt_img.shape[1], 1)

For an image of size 512*512 the above gives features of shape [262144, 4] and an accompanying gt_features of shape [262144, 1].

This gives me the X and y for sklearn.svm.SVC, and as mentioned above this works well, but the resulting segmentation is very noisy. Since SVM works well with higher-dimensional data, I intend to explore that by splitting the image into windows.
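The per-pixel setup above can be sketched end to end like this. This is a minimal, self-contained sketch: the image and ground truth are random stand-ins (and a plain channel mean replaces cv2.cvtColor) since the real data isn't shown, and a small 32x32 image is used so SVC trains quickly:

```python
import numpy as np
from sklearn.svm import SVC

# hypothetical stand-ins for the real img / gt_img
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
img_gray = img.mean(axis=2)                   # stand-in for cv2.cvtColor
gt_img = (img_gray > 127).astype(np.uint8)    # hypothetical binary ground truth

# one 4-element feature vector (R, G, B, gray) per pixel
feature_img = np.zeros((img.shape[0], img.shape[1], 4))
feature_img[:, :, :3] = img
feature_img[:, :, 3] = img_gray
X = feature_img.reshape(-1, 4)    # [1024, 4] here; [262144, 4] for a 512x512 image
y = gt_img.reshape(-1)            # sklearn expects a 1-D label array

clf = SVC()
clf.fit(X, y)
pred = clf.predict(X).reshape(gt_img.shape)   # back to image shape for viewing
```

The key invariant is that row i of X and element i of y describe the same pixel, so reshaping the predictions back to the image shape recovers a segmentation map.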

Based on the above code, I want to split my image of size [512, 1024] into blocks of size [64, 64] and use those for training the SVM. Following the same format, I wrote the code below to split my image into blocks and then .reshape() the result into the form the classifier expects, but it's not working as expected:

win_size = 64
feature_img = blockshaped(img_gray, win_size, win_size)
feature_label = blockshaped(gt_img, win_size, win_size)

# above returns arrays of shape [128, 64, 64]

features = feature_img.reshape(feature_img.shape[1] * feature_img.shape[2], feature_img.shape[0])
# features is of shape [4096, 128]

label_ = feature_label.reshape(feature_label.shape[0] * feature_label.shape[1] * feature_label.shape[2], 1)
# this, as expected, returns [524288, 1]

The function blockshaped is from the answer provided here: Slice 2d array into smaller 2d arrays
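For completeness, a blockshaped along the lines of that linked answer (a sketch, since the exact version used isn't shown) can be written as:

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Split a 2-D array into non-overlapping (nrows, ncols) tiles,
    returned as an array of shape (n_tiles, nrows, ncols)."""
    h, w = arr.shape
    assert h % nrows == 0 and w % ncols == 0
    return (arr.reshape(h // nrows, nrows, w // ncols, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

img_gray = np.arange(512 * 1024).reshape(512, 1024)
blocks = blockshaped(img_gray, 64, 64)
print(blocks.shape)  # (128, 64, 64)
```

Each tile keeps its internal pixel layout, and tiles are ordered row by row across the image, so blocks[0] is the top-left 64x64 window.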

The reason I want to increase the dimensionality of my feature data is that SVM is known to work well with higher-dimensional data, and I also want to see whether a block- or patch-based approach helps the segmentation result.

How would I go about arranging my data, that I have broken down into windows, in a form that can be used to train a classifier?

StuckInPhDNoMore

1 Answer


I've been thinking about your question for 5 hours and read some books to find the answer! Your approach is wrong if you are doing segmentation: when we use machine-learning methods for segmentation, we never change the position of any pixel. This applies not only to SVM; in neural networks we avoid pooling when doing segmentation, and even in CNNs we use "same" padding to avoid moving pixels.
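As a tiny concrete illustration of the pixel-scrambling this refers to (a toy 2x4 array with position-encoded values, and blockshaped reimplemented per the linked answer), the question's reshape just re-chops the flattened buffer, so rows no longer correspond to anything meaningful, and the row count (4096) no longer matches the label array (524288):

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    # same block-splitting as in the question
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, w // ncols, ncols)
               .swapaxes(1, 2).reshape(-1, nrows, ncols))

# tiny 2x4 "image" whose values encode position (row*10 + col)
img = np.array([[0, 1, 2, 3],
                [10, 11, 12, 13]])
blocks = blockshaped(img, 2, 2)          # shape (2, 2, 2): two 2x2 windows

# the question's reshape, (h*w, n_blocks): each row is just an arbitrary
# slice of the flat buffer, not "pixel i across all windows"
wrong = blocks.reshape(blocks.shape[1] * blocks.shape[2], blocks.shape[0])
print(wrong[0])       # [0 1] -- the first two pixels of window 0

# keeping each window intact (one row per window) would instead be:
per_window = blocks.reshape(blocks.shape[0], -1)   # shape (2, 4)
print(per_window[0])  # [ 0  1 10 11] -- all pixels of the first window
```

Even the per-window arrangement then needs one label per window (not per pixel), which is why plain per-pixel classification keeps the feature-label correspondence intact.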

parsa
  • Thanks. That's what I thought as well, as that's how SVM and other classifiers are set up to work. I was just hoping that I might have missed something and it may be possible to do this. – StuckInPhDNoMore Feb 10 '20 at 13:32
  • I caught a glimpse of the file. It's just for reducing the computational time. When a picture is too big, it's common to cut it into several images or windows just for multiprocessing or checkpointing the computation. – parsa Feb 11 '20 at 13:57