4

I have a large quantity of pixel colors (96 thousands different colors):

enter image description here

And I want to get some kind of a mathematically-defined probability region like in this question:

enter image description here

The main obstacle I see right now – all methods on Google are mainly about visualisations and about two-dimensional spaces, yet there is no algorithm for finding coefficients of an equation like:

a1x2 + b1y2 + c1y2 + a2xy + b2xz + c2yz + a3x + b3y + c3z = 0

And this paper is too difficult for me to implement it in python. :(

Anyway, what I just want is to determine if some pixel is more-or-less lies within the diapason I have.

I tried making it using scikit clustering, but I failed due to having only one set of data, probably. And creating an array 2563 elements representing each pixel color seems a wrong way.

I wonder if there is an easy way to determine boundaries of this point cluster? Or, maybe I'm just overthinking it and there is something like OpenCV cv2.inRange() function?

Cœur
  • 37,241
  • 25
  • 195
  • 267
Xobotun
  • 1,121
  • 1
  • 18
  • 29
  • 1
    Look into 3 variable [principle component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis). The eigenvectors will give you the main axes of your ellipse. – samgak Mar 01 '17 at 22:51
  • The common thing that *looks* like what you're trying to do is called "principal component analysis". From 3 dimensional data it produces 3 orthogonal vectors (the principal components) that summarize the variance of the data in all directions. They are the axes of an ellipse that looks like the one above. – Matt Timmermans Mar 01 '17 at 22:52
  • @samgak, MattTimmermans, [PCA](http://sebastianraschka.com/Articles/2014_pca_step_by_step.html#generating-some-3-dimensional-sample-data) worked almost flawlessly. I only failed to draw these eigenvectors on the plot, but they seem to be correct: http://imgur.com/NmzgdAP – Xobotun Mar 02 '17 at 10:09
  • 1
    Also answers under this similar [question](https://stackoverflow.com/q/7272252/10141885) may help. – halt9k Sep 02 '22 at 19:01

2 Answers2

2

this can be solved by optimization and fitting of the ellipsoid polynomial. However I would start with geometrical approach which is much faster:

  1. find avg point position

    that will be the center of your ellipsoid

    p0 = sum (p[i]) / n      // average
    i = { 0,1,2,3,...,n-1 }  // of all points
    

    If your point density is not homogenuous then it is safer to use bounding box center instead. So find xmin,ymin,zmin,xmax,ymax,zmax and the middle between them is your center.

  2. find most distant point to center

    that will give you main semi axis

    pa = p[j];
    |p[j]-p0| >= |p[i]-p0|   // max
    i = { 0,1,2,3,...,n-1 }  // of all points
    
  3. find second semi-axises

    so vector pa-p0 is normal to plane in which the other semi-axises should be. So find most distant point to p0 from that plane:

    pb = p[j];  
    |p[j]-p0| >= |p[i]-p0|   // max
    dot(pa-p0,p[j]-p0) == 0  // but inly if inside plane
    i = { 0,1,2,3,...,n-1 }  // from all points
    

    beware that the result of dot product may not be precisely zero so it is better to test against something like this:

    |dot(pa-p0,p[j]-p0)| <= 1e-3
    

    You can use any threshold you want (should be based on the ellipsoid size).

  4. find last semi-axis

    So we know that last semi-axis should be perpendicular to both

    (pa-p0) AND (pb-p0)
    

    So find point such that:

    pc = p[j];  
    |p[j]-p0| >= |p[i]-p0|   // max
    dot(pa-p0,p[j]-p0) == 0  // but inly if inside plane
    dot(pb-p0,p[j]-p0) == 0  // and perpendicular also to b semi-axis
    i = { 0,1,2,3,...,n-1 }  // from all points
    
  5. Ellipsoid

    Now you have all the parameters you need to form your ellipsoid. vectors

    (pa-p0),(pb-p0),(pc-p0)
    

    are the basis vectors of your ellipsoid (you can make them perpendicular by using cross product). Their size gives you the radiuses. And p0 is the center. You can also use this parametric equation:

    a=pa-p0;
    b=pb-p0;
    c=pc-p0;
    p(u,v) = p0 + a*cos(u)*cos(v)
                + b*cos(u)*sin(v)
                + c*sin(u);
    u = < -0.5*PI , +0.5*PI >
    v = < 0.0 , 2.0*PI >
    

This whole process is just O(n) and the results can be used as start point for both optimization and fitting to speed them up without the loss of accuracy. If you want to further improve accuracy See:

The sub links shows you examples of fitting ...

You can also take a look at this:

which is basically similar to your task but only in 2D still may bring you some ideas.

Community
  • 1
  • 1
Spektre
  • 49,595
  • 11
  • 110
  • 380
  • Oh, I have never thought it is possible to circumcircle an ellipsoid that easilly (and with O(n))! But it seems I have some outlier data and pure geometrical approach will not return the wanted result. Maybe approximation search will help me, but I an not sure if I will be able to implement it. :) But your fifth paragraph is truly remarkable! I will use these eigenvectors I got from PCA and then build an ellipsoid around p0. It seems not too difficult to check if another pixel lies within the ellipsoid, even though there is a rotation. Thank you very much! :) I'm really bad at maths... – Xobotun Mar 02 '17 at 10:32
  • @Xobotun you can bend the search equations a bit to suite your need more closer. You did not specify the exact properties of your ellipsoid so I show just geometric approach. For example if you want that your ellipsoid covers 75% of the points you can transform all the points to coordinate system of found ellipsoid and rescale to cube. then shrink the bbox until it covers only 75% of points, rescale back and recompute a,b,c size again ... You can do anything (add equations change the dot product threshold angle etc ... ) – Spektre Mar 02 '17 at 10:52
  • @Xobotun as you are mentioning colors then it implies you are doing some king of color quantization based on probabilities there are also other ways around that for example see: [Effective gif/image color quantization?](http://stackoverflow.com/a/30265253/2521214) But as I do not have background info about your problem I may be assuming in wrong direction ... – Spektre Mar 02 '17 at 10:56
  • I thought it would be better for StackOverflow to ask my question in more abstract terms than concrete ones. The reason I came up with this question is to extract pixels that look like my skin: http://i.imgur.com/oSUGLJH.png Our scientific adviser (simply teacher) in university gave some of us task to speed up Viola-Jones face detection algorithm by previously filtrating the image given. When I told him the idea of filtering regions that look like skin he said "Hmm. It might work.". And now I am trying to develop this idea and measure the time it takes to detect faces. – Xobotun Mar 02 '17 at 20:03
  • @Xobotun not an expert in the field but my guts are telling me HSV color space would be better then RGB for this. Also in HSV you can partially ignore V-value for this making your problem 2D instead of 3D see [HSV Histogram](http://stackoverflow.com/a/29286584/2521214) for some ideas. – Spektre Mar 02 '17 at 20:18
  • I thought about selecting ~100 farmost points from each of 6 directions and using their mean coordiante to specify semiaxes. I didn't say anything about properties of ellipsoid because I don't know whether my dataset represents skin colorspace fully (of course not) and if not then to which degree. So I'm just going to play with the coverage parameter as you have suggested. Even if I fail, I can always use equation like `x ± 10 = y ± 10 = z ± 10` because my dataset is very oblong along `x = y = z` line: http://i.imgur.com/Y10Akx6.png – Xobotun Mar 02 '17 at 20:22
  • HSV? Well, I knew that it exists, but I have never thought about using it. I'll be soon. :) Edit: It is no more an ellipsoid, but rather a [two planars](http://imgur.com/lSPBAzf)! Moving to another colorspace makes miracles: http://i.imgur.com/NCRt153.png I think I should postpone calculating an ellipsoid and read more about colors and generalization (and Machine Learning) algorithms. :) – Xobotun Mar 02 '17 at 20:39
0

Here is unstrict solution with fast and simple random search approach*. Best side - no heavy linear algebra library required**. Seems it worked fine for mesh collision detection.

Is assumes that ellipsoid center matches cloud center and then uses some sort of mirrored average to search for main axis.

Full working code is slightly bigger and placed on git, idea of main axis search is here:

np.random.shuffle(pts)

pts_len = len(pts)
pt_average =  np.sum(pts, axis = 0) / pts_len

vec_major = pt_average * 0
minor_max, major_max = 0, 0

# may be improved with overlapped pass, 
for pt_cur in pts:
    vec_cur = pt_cur - pt_average
    proj_len, rej_len = proj_length(vec_cur, vec_major)

    if proj_len < 0:
        vec_cur = -vec_cur
    vec_major += (vec_cur - vec_major) / pts_len

    major_max = max(major_max, abs(proj_len))
    minor_max = max(minor_max, rej_len)

It can be improved/optimized even more at some points. Examples what it will produce: 0

And full experiment code with plots

*i.e. adjusting code lines randomly until they work
**was actually reason to figure out this solution

halt9k
  • 527
  • 4
  • 13