
I developed this Python code to fit a Gaussian mixture model (GMM) to an image. The image segmentation works, and the overall GMM curve is drawn correctly on the image histogram. However, the last part of the code, which is supposed to show the distribution of each cluster on the histogram, does not produce the expected plot. Thanks for helping.

(Attached images: "nan", "pdf", "this is what I expect to get", "output")

import os
import matplotlib.pyplot as plt
import numpy as np
import cv2
from scipy import stats


img = cv2.imread("test.tif")

# Convert MxNx3 image into Kx3 where K=MxN
img2 = img.reshape((-1,3))  #-1 reshape means, in this case MxN

from sklearn.mixture import GaussianMixture as GMM

#covariance choices, full, tied, diag, spherical

k = 7
gmm_model = GMM(n_components=k, covariance_type='full').fit(img2)  #tied works better than full
gmm_labels = gmm_model.predict(img2)

#Put numbers back to original shape so we can reconstruct segmented image
original_shape = img.shape
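segmented = gmm_labels.reshape(original_shape[0], original_shape[1])
# imwrite needs the image as its second argument; labels 0..k-1 fit in uint8
cv2.imwrite("test_segmented.tif", segmented.astype(np.uint8))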

data = img2.ravel()
data = data[data != 0]
data = data[data != 1]  #Removes background pixels (intensities 0 and 1)
gmm = GMM(n_components = k)
gmm = gmm.fit(X=np.expand_dims(data,1))
gmm_x = np.linspace(0,255,256)
gmm_y = np.exp(gmm.score_samples(gmm_x.reshape(-1,1)))

gmm_model.means_

gmm_model.covariances_

gmm_model.weights_

# Plot histograms and gaussian curves
fig, ax = plt.subplots()
ax.hist(img2.ravel(),255,[2,256], density=True, stacked=True)
ax.plot(gmm_x, gmm_y, color="crimson", lw=2, label="GMM")

ax.set_ylabel("Frequency")
ax.set_xlabel("Pixel Intensity")

plt.legend()
plt.grid(False)
plt.xlim([0, 256])

plt.show()

for m in range(gmm_model.n_components):
    

    pdf = gmm_model.weights_[m] * stats.norm(gmm_model.means_[m, 0],
                                       np.sqrt(gmm_model.covariances_[m, 0])).pdf(gmm_x.reshape(-1,1))
    plt.fill(gmm_x, pdf, facecolor='gray',
             edgecolor='none')
plt.xlim(0, 256)
  • "something wrong" you say? please explain that. please review [ask] and [mre]. – Christoph Rackwitz Jun 21 '22 at 18:39
  • the last part of the code is supposed to show the different distributions of the GMM. However, it is not. @Christoph Rackwitz – Mohamed Hassan Jun 21 '22 at 21:05
  • "it is not" showing that... ok what does it do instead? don't expect people to run your code to find that out. – Christoph Rackwitz Jun 21 '22 at 22:05
  • Sure! Sorry for not sharing the output. I added the output I got in the post. I also added what I expect to get. – Mohamed Hassan Jun 22 '22 at 00:55
  • so it's drawing a triangle... show the data that produced that specific plot, i.e. what are the values in `gmm_x` and `gmm_y` -- the [ask] article specifically mentions debugging your own code. this is debugging. you don't need me to prompt you for every step. just investigate. and maybe find _something_ that shows you the basics of using the python debugger... or at least learn "[printlining](https://stackoverflow.com/questions/189562/what-is-the-proper-name-for-doing-debugging-by-adding-print-statements)" (printing values) – Christoph Rackwitz Jun 22 '22 at 07:09
  • The part that plots `gmm_x` and `gmm_y` is working fine. The problem is with the other drawing, which should be `gmm_x` and `pdf`. Thanks for such valuable information! I will add the values of `gmm_x`, `gmm_y` and `pdf` to the main post. – Mohamed Hassan Jun 22 '22 at 14:54
  • I would like to thank you. Now I can ask the right question, I think. The code was fine; I just needed to draw the figure for `gmm_x` and `gmm_y` before the new figures. However, I got an error because `pdf` contains some nan values, and these values are the reason for the triangle shape. I attached the image "nan" for that, and you can find the RuntimeWarning in the image "pdf". – Mohamed Hassan Jun 22 '22 at 16:35
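
The nan values described in the last comment can be reproduced without the image. The sketch below is only an illustration on synthetic data (the `pixels` array and the single-component model are assumptions, not the asker's data): with `covariance_type='full'` on 3-channel input, `covariances_` has shape `(n_components, 3, 3)`, so `covariances_[m, 0]` is a whole row of the matrix. Its off-diagonal entries are covariances between channels, which can be negative, and `np.sqrt` of a negative number is nan.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for img2: three channels, with channel 1 built to be
# negatively correlated with channel 0 (an assumption for this demo).
rng = np.random.default_rng(0)
a = rng.normal(120.0, 20.0, size=2000)
pixels = np.column_stack([
    a,
    255.0 - a + rng.normal(0.0, 2.0, size=2000),
    a + rng.normal(0.0, 2.0, size=2000),
])

model = GaussianMixture(n_components=1, covariance_type="full",
                        random_state=0).fit(pixels)

cov_shape = model.covariances_.shape      # (n_components, 3, 3)
row = model.covariances_[0, 0]            # one row: three values, not one variance

# The square root of the negative off-diagonal entry is nan.
with np.errstate(invalid="ignore"):
    row_sigma = np.sqrt(row)

# Indexing the diagonal element gives the variance of channel 0 alone.
sigma = np.sqrt(model.covariances_[0, 0, 0])
x = np.linspace(0, 255, 256)
pdf = model.weights_[0] * stats.norm(model.means_[0, 0], sigma).pdf(x)

print(cov_shape, row_sigma, pdf.shape)
```

Passing the whole row to `stats.norm` also makes `pdf` two-dimensional once it is evaluated on `gmm_x.reshape(-1,1)`, which may be part of why the filled shape looks wrong.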

1 Answer

I edited the code. The missing part was to draw the histogram before plotting each cluster's weighted normal curve. I hope this will be helpful!

import os
import matplotlib.pyplot as plt
import numpy as np
import cv2
from scipy import stats


img = cv2.imread("test.tif")

# Convert MxNx3 image into Kx3 where K=MxN
img2 = img.reshape((-1,3))  #-1 reshape means, in this case MxN

from sklearn.mixture import GaussianMixture as GMM

#covariance choices, full, tied, diag, spherical

k = 7
gmm_model = GMM(n_components=k, covariance_type='full').fit(img2)  #tied works better than full
gmm_labels = gmm_model.predict(img2)

#Put numbers back to original shape so we can reconstruct segmented image
original_shape = img.shape
segmented = gmm_labels.reshape(original_shape[0], original_shape[1])
# Cast the integer labels (0..k-1) to uint8 so OpenCV can write them
cv2.imwrite("test_s.tif", segmented.astype(np.uint8))

data = img2.ravel()
#data = data[data != 0]
#data = data[data != 1]  #Removes background pixels (intensities 0 and 1)

gmm = GMM(n_components = k)
gmm = gmm.fit(X=np.expand_dims(data,1))
gmm_x = np.linspace(0,255,256)
gmm_y = np.exp(gmm.score_samples(gmm_x.reshape(-1,1)))


gmm_model.means_

gmm_model.covariances_

gmm_model.weights_


# Plot histograms and gaussian curves
fig, ax = plt.subplots()
ax.hist(img2.ravel(),255,[2,256], density=True, stacked=True)
ax.plot(gmm_x, gmm_y, color="crimson", lw=2, label="GMM")

ax.set_ylabel("Frequency")
ax.set_xlabel("Pixel Intensity")

plt.legend()
plt.grid(False)
plt.xlim([0, 256])

plt.show()

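for m in range(gmm_model.n_components):
    # With covariance_type='full' on 3-channel data, covariances_ has shape
    # (k, 3, 3). [m, 0, 0] is the variance of channel 0; [m, 0] is a whole
    # row whose off-diagonal entries can be negative and give nan in np.sqrt.
    sigma = np.sqrt(gmm_model.covariances_[m, 0, 0])
    pdf = gmm_model.weights_[m] * stats.norm(gmm_model.means_[m, 0], sigma).pdf(gmm_x)

    # Draw the histogram and the overall GMM curve first, then this cluster
    fig, ax = plt.subplots()
    ax.hist(img2.ravel(),255,[2,256], density=True, stacked=True)
    ax.plot(gmm_x, gmm_y, color="crimson", lw=2, label="GMM")
    ax.fill(gmm_x, pdf, facecolor='gray',
            edgecolor='none')
    ax.set_xlim(0, 256)
    ax.set_ylim(0, .06)

plt.show()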
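
One thing that may be worth checking: the crimson curve comes from `gmm` (fitted on the flattened 1-D intensities), while the gray areas come from `gmm_model` (fitted on the 3-channel pixels), so the two are not pieces of the same mixture. If the goal is to show the components that add up to the crimson curve, a possible approach is to take them from the 1-D model. The sketch below uses synthetic intensities and a hypothetical helper `component_curves`; neither is from the original post.

```python
import numpy as np
from scipy import stats
from sklearn.mixture import GaussianMixture


def component_curves(gmm, x):
    """Weighted density of each component of a 1-D, full-covariance GMM.

    Returns an array of shape (n_components, len(x)).
    """
    means = gmm.means_[:, 0]                      # means_ has shape (k, 1)
    sigmas = np.sqrt(gmm.covariances_[:, 0, 0])   # covariances_ has shape (k, 1, 1)
    return np.array([
        w * stats.norm(mu, s).pdf(x)
        for w, mu, s in zip(gmm.weights_, means, sigmas)
    ])


# Synthetic 1-D intensities standing in for img2.ravel() (an assumption).
rng = np.random.default_rng(1)
intensities = np.concatenate([
    rng.normal(60.0, 8.0, size=3000),
    rng.normal(150.0, 15.0, size=3000),
    rng.normal(220.0, 5.0, size=1500),
])

gmm = GaussianMixture(n_components=3, random_state=0).fit(intensities.reshape(-1, 1))
gmm_x = np.linspace(0, 255, 256)

curves = component_curves(gmm, gmm_x)
gmm_y = np.exp(gmm.score_samples(gmm_x.reshape(-1, 1)))

# The weighted components sum to the full mixture density.
print(np.allclose(curves.sum(axis=0), gmm_y))
```

Each row of `curves` can then be drawn on top of the histogram, for example with `ax.fill_between(gmm_x, curves[m], color='gray', alpha=0.5)`.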