How do I 'fit a line' to a cluster of pixels?

Question

I would like to generate a polynomial 'fit' to the cluster of colored pixels in the image here

(The point being that I would like to measure how much that cluster approximates an horizontal line). I thought of using grabit or something similar and then treating this as a cloud of points in a graph. But is there a quicker function to do so directly on the image file? thanks!

Do you have the underlying data or just a plot of it? Do the colours have some significance (like weighting) that needs to be considered? — Mark Setchell, Oct 14 '18 at 12:34
To add to what @MarkSetchell said, if you have the data that generated the plot, you can get the coordinates of the bright region then just fit a polynomial curve using least squares or fit a spline. — lightalchemist, Oct 14 '18 at 12:43
Thanks, I do not have the data. That's why I was thinking of grabbing it somehow. — magnolia1, Oct 14 '18 at 12:53
Regarding the color, if possible yes I'd use the intensity ('fire' scale here) to weigh the fit, but I'd be happy even with a binary image (not the case here) — magnolia1, Oct 14 '18 at 12:55
Given the irregularity of the outline, IMO more than a cubic would be overkill. — , Oct 14 '18 at 18:30
@YvesDaoust i'm a bit lost, you're saying a second order poly? Still, is there a way to fit it to the image that you know of? thanks — magnolia1, Oct 15 '18 at 07:30
@matteoeoeo: cubic is third degree. Use ordinary least-squares. In reality, it won't be much better than a simple linear model. — , Oct 15 '18 at 07:35
I suspect that this is a XY problem. What do you want to do, actually ? — , Oct 15 '18 at 07:35
I added a least squares with l2 regularization and a weighted least squares implementation up to arbitrary degree. — lightalchemist, Oct 15 '18 at 08:52

lightalchemist · Accepted Answer · 2018-10-15T14:10:33.647

Here is a Python implementation. Basically we find all (xi, yi) coordinates of the colored regions, then set up a regularized least squares system where the we want to find the vector of weights, (w0, ..., wd) such that yi = w0 + w1 xi + w2 xi^2 + ... + wd xi^d "as close as possible" in the least squares sense.

import numpy as np
import matplotlib.pyplot as plt

def rgb2gray(rgb):
    return np.dot(rgb[...,:3], [0.299, 0.587, 0.114])

def feature(x, order=3):
    """Generate polynomial feature of the form
    [1, x, x^2, ..., x^order] where x is the column of x-coordinates
    and 1 is the column of ones for the intercept.
    """
    x = x.reshape(-1, 1)
    return np.power(x, np.arange(order+1).reshape(1, -1)) 

I_orig = plt.imread("2Md7v.jpg")
# Convert to grayscale
I = rgb2gray(I_orig)

# Mask out region
mask = I > 20

# Get coordinates of pixels corresponding to marked region
X = np.argwhere(mask)

# Use the value as weights later
weights = I[mask] / float(I.max())
# Convert to diagonal matrix
W = np.diag(weights)

# Column indices
x = X[:, 1].reshape(-1, 1)
# Row indices to predict. Note origin is at top left corner
y = X[:, 0]

We want to find vector w that minimizes || Aw - y ||^2 so that we can use it to predict y = w . x

Here are 2 versions. One is a vanilla least squares with l2 regularization and the other is weighted least squares with l2 regularization.

# Ridge regression, i.e., least squares with l2 regularization. 
# Should probably use a more numerically stable implementation, 
# e.g., that in Scikit-Learn
# alpha is regularization parameter. Larger alpha => less flexible curve
alpha = 0.01

# Construct data matrix, A
order = 3
A = feature(x, order)
# w = inv (A^T A + alpha * I) A^T y
w_unweighted = np.linalg.pinv( A.T.dot(A) + alpha * np.eye(A.shape[1])).dot(A.T).dot(y)
# w = inv (A^T W A + alpha * I) A^T W y
w_weighted = np.linalg.pinv( A.T.dot(W).dot(A) + alpha * \
                             np.eye(A.shape[1])).dot(A.T).dot(W).dot(y)

The result

# Generate test points
n_samples = 50
x_test = np.linspace(0, I_orig.shape[1], n_samples)
X_test = feature(x_test, order)

# Predict y coordinates at test points
y_test_unweighted = X_test.dot(w_unweighted)
y_test_weighted = X_test.dot(w_weighted)

# Display
fig, ax = plt.subplots(1, 1, figsize=(10, 5))
ax.imshow(I_orig)
ax.plot(x_test, y_test_unweighted, color="green", marker='o', label="Unweighted")
ax.plot(x_test, y_test_weighted, color="blue", marker='x', label="Weighted")
fig.legend()
fig.savefig("curve.png")

For simple straight line fit, set the argument order of feature to 1. You can then use the gradient of the line to get a sense of how close it is to a horizontal line (e.g., by checking the angle of its slope).

It is also possible to set this to any degree of polynomial you want. I find that degree 3 looks pretty good. In this case, the 6 times the absolute value of the coefficient corresponding to x^3 (w_unweighted[3] or w_weighted[3]) is one measure of the curvature of the line.

See A measure for the curvature of a quadratic polynomial in Matlab for additional details.

thanks so much @lightalchemist can I ask also: can I then measure the fit line itself (length and length 'as the crow flies')? Would I need to re-extract it from the image? — magnolia1, Oct 15 '18 at 17:11
@matteoeoeo The weights of the polynomial are in `w_weighted` (or `w_unweighted`, depending on which you use). Given these weights you can compute the length of the curve using numerical methods. In practice I guess you want to 1) segment out the regions 2) fit the polynomial curve to each region then 3) measure the length. — lightalchemist, Oct 16 '18 at 00:40

How do I 'fit a line' to a cluster of pixels?

1 Answers1