4

spectrum

The input is a spectrum with colorful (sorry) vertical lines on a black background. Given the approximate x coordinate of that band (as marked by X), I want to find the width of that band.

I am unfamiliar with image processing. Please direct me to the correct method of image processing and a Python image processing package that can do the same.


I am thinking PIL, OpenCV gave me an impression of being overkill for this particular application.

What if I want to make this an expert system that can classify them in the future?

Janne Karila
  • 24,266
  • 6
  • 53
  • 94
Jesvin Jose
  • 22,498
  • 32
  • 109
  • 202

3 Answers3

3

I'll give a complete minimal working example (as suggested by sega_sai). I don't have access to your original image, but you'll see it doesn't really matter! The peak distributions found by the code below are:

Mean values at: 26.2840960523 80.8255092125

import Image
from scipy import *
from scipy.optimize import leastsq

# Load the picture with PIL, process if needed
pic         = asarray(Image.open("band2.png"))

# Average the pixel values along vertical axis
pic_avg     = pic.mean(axis=2)
projection  = pic_avg.sum(axis=0)

# Set the min value to zero for a nice fit
projection /= projection.mean()
projection -= projection.min()

# Fit function, two gaussians, adjust as needed
def fitfunc(p,x):
    return p[0]*exp(-(x-p[1])**2/(2.0*p[2]**2)) + \
        p[3]*exp(-(x-p[4])**2/(2.0*p[5]**2))
errfunc = lambda p, x, y: fitfunc(p,x)-y

# Use scipy to fit, p0 is inital guess
p0 = array([0,20,1,0,75,10])
X  = xrange(len(projection))
p1, success = leastsq(errfunc, p0, args=(X,projection))
Y = fitfunc(p1,X)

# Output the result
print "Mean values at: ", p1[1], p1[4]

# Plot the result
from pylab import *
subplot(211)
imshow(pic)
subplot(223)
plot(projection)
subplot(224)
plot(X,Y,'r',lw=5)
show()

enter image description here

Hooked
  • 84,485
  • 43
  • 192
  • 261
  • This is nice, but needs to be adapted to deal with multiple lines. – fraxel Mar 19 '12 at 16:34
  • @fraxel agreed, but that shouldn't be to hard. You can simply modify the fit function to take in a parameterized number of Gaussians. For many of these spectroscopic problems, you know in advance the number of peaks you are looking for. I tried to give a simple, easy to follow recipe, without to much generalization. Both of our methods suffer from a thresholding problem. You'll need to have some knowledge of the system at hand to get a good solution. In your example, what happens if there are two distinguishable signals, but they overlap? – Hooked Mar 19 '12 at 17:18
  • I tested it out in my data: a blue spectrum with 8 lines that stand out, and many (I counted ~15) others that are "background". The `projection` works perfectly but the `leastsq` did not (duh?). Please recommend another way (transform method?) to identify peaks and their widths. Is there any image normalization I should be aware of (normalizing intensity, normalizing projection peak etc)? – Jesvin Jose Mar 20 '12 at 06:17
  • And BTW, I made up **that** image in paint, since you refered to it as _"original image"_ . Sorry, but I am a total newb with image processing :-) – Jesvin Jose Mar 20 '12 at 06:24
  • @aitchnyu Here's what I would do, first check out our beta SE site: http://area51.stackexchange.com/proposals/1691/signal-processing. If you really have that many peaks and they are _not_ known in advance, you need a more robust approach. See how `fitfunc` is a function of two G's? You can make it a function of multiple G's and keep trying to fit more G's until the convergence levels off. For that many free parameters, you'll need a better guess at the initial \mu's. If you like, post an additional question with the _real data_ so we can work off that. We can only solve what's given! – Hooked Mar 20 '12 at 13:44
  • @aitchnyu If it's not clear, the above code will only fit _two_ peaks _by construction_. The advantage to this method over a thresholding method like fraxel's is that it can handle peaks that overlap. You'll need to decide which is right for your data set. – Hooked Mar 20 '12 at 13:46
  • So I have to adapt `fitfunc` for "n Gaussians" if I know there are n peaks. Yes, I will ask a more specific question. – Jesvin Jose Mar 20 '12 at 13:49
  • @aitchnyu If you do, please edit your question to include a link to the new question, that way readers can follow. – Hooked Mar 20 '12 at 13:49
  • could you help me with a new question I have? http://stackoverflow.com/questions/11009379/higher-sampling-for-images-projection/11009805#11009805 – Jesvin Jose Jun 13 '12 at 10:21
2

Below is a simple thresholding method to find the lines and their width, it should work quite reliably for any number of lines. The yellow and black image below was processed using this script, the red/black plot illustrates the found lines using parameters of threshold = 0.3, min_line_width = 5)

enter image description here

The script averages the rows of an image, and then determines the basic start and end positions of each line based on a threshold (which you can set between 0 and 1), and a minimum line width (in pixels). By using thresholding and minimum line width you can easily filter your input images to get the lines out of them. The first function find_lines returns all the lines in an image as a list of tuples containing the start, end, center, and width of each line. The second function find_closest_band_width is called with the specified x_position, and returns the width of the closest line to this position (assuming you want distance to centre for each line). As the lines are saturated (255 cut-off per channel), their cross-sections are not far from a uniform distribution, so I don't believe trying to fit any kind of distribution is really going to help too much, just unnecessarily complicates.

import Image, ImageStat

def find_lines(image_file, threshold, min_line_width):
    im = Image.open(image_file)
    width, height = im.size
    hist = []
    lines = []
    start = end = 0
    for x in xrange(width):
        column = im.crop((x, 0, x + 1, height))
        stat = ImageStat.Stat(column)
        ## normalises by 2 * 255 as in your example the colour is yellow
        ## if your images start using white lines change this to 3 * 255
        hist.append(sum(stat.sum) / (height * 2 * 255)) 

    for index, value in enumerate(hist):
        if value > threshold and end >= start:
            start = index
        if value < threshold and end < start:
            if index - start < min_line_width:
                start = 0
            else:
                end = index
                center = start + (end - start) / 2.0
                width = end - start
                lines.append((start, end, center, width))
    return lines

def find_closest_band_width(x_position, lines):
    distances = [((value[2] - x_position) ** 2) for value in lines]
    index = distances.index(min(distances))
    return lines[index][3]

## set your threshold, and min_line_width for finding lines
lines = find_lines("8IxWA_sample.png", 0.7, 4)
## sets x_position to 59th pixel
print 'width of nearest line:', find_closest_band_width(59, lines)
fraxel
  • 34,470
  • 11
  • 98
  • 102
1

I don't think that you need anything fancy for you particular task.

I would just use PIL + scipy. That should be enough.

Because you essentially need to take your image, make a 1D-projection of it and then fit a Gaussian or something like that to it. The information about the approximate location of the band should be used a first guess for the fitter.

sega_sai
  • 8,328
  • 1
  • 29
  • 38