
I’m working in OpenCV to take two timed images of a golf ball in flight and determine the 3-axis spin velocities of the ball from the differences between the two images as exhibited by their dimple patterns (which are irregular enough for this to work). I’ve already filtered the photos to more easily identify the dimple features (as seen here, though some portions will be ignored):

[Golf ball 1] [Golf ball 1 - dimples]
[Golf ball 2] [Golf ball 2 - dimples]

I know the radius and distance of the ball from the camera, so there should be a unique 3D coordinate for every pixel in my 2D photo, as each pixel should be able to be orthogonally projected until it intersects where the surface of the ball would be in space.

My problem now is how to project the 2D points of my calibrated camera image out to where they would have been on the golf ball, if “project” is even the correct term (I’m pretty new to machine vision ;) ). As above, I believe each point should be able to be projected to a corresponding real-world coordinate on the hemisphere that faces the camera. It would be even better if I could project the entire image in one operation rather than pixel-by-pixel. Does anyone have code, an algorithm, or even a gentle nudge in the right direction to make that happen?

Ultimately, once I have the 3D coordinates of the dimples or their centers, I want to rotate them in 3D space by various small Euler angles into various candidate positions, then un-project those points back onto a 2D image (as each would look to the camera, ignoring anything that rotates around the back of the ball), so hopefully this process is reversible. Then I can compare the second golf ball image to each of those candidate images to see which is closest and use the corresponding rotation angles to determine the spin. I’m basically trying to replicate the great work of these researchers in this article: https://www.researchgate.net/publication/313541573_Estimation_of_a_large_relative_rotation_between_two_images_of_a_fast_spinning_marker-less_golf_ball. The authors no longer have the code.

I’d appreciate any direction on this. I’ve started looking at the OpenCV sphericalWarper, and have also tried simpler registration techniques, but I don’t think those will work.

I’m aware of (Transforming 2D image coordinates to 3D world coordinates with z = 0) and (Computing x,y coordinate (3D) from image point). Also aware of questions involving stereo cameras and 3D to 2D projection. Thank you!

jpilgrim
  • Is it clear to you how to calculate the x and y coordinates in 3D of each pixel, so that you only need the z coordinate? Depending on your required accuracy you could use a parallel projection as a quick and dirty estimation, where you map the image circle center to the 3D sphere center and scale your image so that the circle radius fits the sphere radius. Then, for each pixel, you have to calculate the z coordinate. – Micka May 12 '23 at 06:00
  • The accurate way would be to construct the 3D rays from the camera center through each pixel and calculate the 3D point where each intersects the sphere. – Micka May 12 '23 at 06:01
  • Thanks Micka - I'm not even at the point of calculating the x,y in 3D yet. I'm a total noob. Heck - it took me 2 days just to get the Gabor filters working for the dimple identification (other techniques like canny and loose Hough circles were not reliable)! I'm just slowly learning OpenCV by pushing through this golf project. If there's a good book on this, however, I'm happy to start reading. :) – jpilgrim May 12 '23 at 14:18
  • For example, if your ball image is 100x100 pixels and the ball is centered with a radius of 50 pixels, and your 3D sphere is centered at (0,0,0) with a radius of N mm, then you could map pixel (50,0) (top center) to 3D (0,N,0) (top center). Image pixel (50,50) (front center) would map to (0,0,N), and so on. I think it will be easy to understand if you draw it as a parallel projection. – Micka May 12 '23 at 14:29
  • Oh - ok. Got it. That's easy and the reverse is easy too. I just figured there was some quick way in OpenCV that I was missing. Hopefully I can multi-thread those calculations to parallelize the work, although this is eventually going to a RPi, so probably won't have many cores to work with anyway. Many thanks. – jpilgrim May 12 '23 at 15:36
  • the two left images you show appear to have *significant* motion between them, more than can be tracked between adjacent frames. – Christoph Rackwitz May 12 '23 at 16:46
  • It does seem non-intuitive that such a large rotation (45 degrees in this test case) could still be tracked correctly. But, that's exactly the result of the research paper that I note above. Check it out - it's pretty cool. :) – jpilgrim May 12 '23 at 18:53
  • interesting. RG said "To read the full-text of this research, you can request a copy directly from the authors." (lmao) but DOI links to https://ieeexplore.ieee.org/document/7844057 which has full text. – Christoph Rackwitz May 12 '23 at 20:09
  • just implement the warp yourself. use python. use np.mgrid or np.meshgrid, define the result grid, transform it step by step (subtract center, calculate Z for assumed sphere surface, then apply inverse rotation, then drop Z again), which gives you something suitable for `cv.remap()` (remap pulls image data into destination grid, from source, using the map for lookup). or don't remap at all but apply this (except not inverted) to individual center points, once you have them. – Christoph Rackwitz May 12 '23 at 20:12
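Micka's parallel-projection suggestion from the comments can be sketched in a few lines. This is illustrative code only (none of these names come from the thread), assuming the ball's circle center and pixel radius are already known from segmentation:

```cpp
#include <cmath>
#include <optional>
#include <array>

// Parallel-projection back-mapping of one pixel onto the sphere.
// (px, py): pixel; (cx, cy): ball-circle center in pixels;
// rPx: circle radius in pixels; R: real ball radius (e.g. mm).
// Returns the 3D point on the camera-facing hemisphere, with the
// sphere centered at the origin and +z toward the camera, or
// nothing if the pixel falls outside the ball.
std::optional<std::array<double, 3>> backprojectParallel(
        double px, double py, double cx, double cy, double rPx, double R) {
    double scale = R / rPx;
    double x = (px - cx) * scale;
    double y = (cy - py) * scale;        // image y grows downward; flip it
    double rho2 = x * x + y * y;
    if (rho2 > R * R) return std::nullopt;   // outside the ball
    return std::array<double, 3>{x, y, std::sqrt(R * R - rho2)};
}
```

With a 100x100 image, a centered circle of radius 50, and real radius N, pixel (50,0) maps to (0,N,0) and pixel (50,50) maps to (0,0,N), matching Micka's example. The more accurate alternative he mentions is to intersect the actual camera ray through each pixel with the sphere rather than assuming parallel rays.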

2 Answers


You do not need to back-project pixels into 3D to find the ball's location. Golf balls have standard sizes and are approximately spherical, right? So an accurate segmentation of the ball in the image can be translated into the distance of the ball from the camera: you simply solve (optimize) for the ball's 3D center location that minimizes the distance in the image between the observed edge of the ball and the projection of a sphere at that center. Once you have the ball's location, you can directly translate the observed (pixel) motion of the dimples into motion on the surface of the ball, and hence velocity.
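In the small-angle limit, the size-to-distance relationship this answer relies on reduces to the pinhole relation Z ≈ f·R/r, where f is the focal length in pixels, R the real ball radius, and r the apparent pixel radius. A one-line sketch (my naming, illustrative only; the edge-fitting optimization described above refines this estimate):

```cpp
// Rough pinhole-model distance to the ball center:
// a sphere of real radius R imaged with apparent pixel radius rPx
// by a camera with focal length fPx (in pixels) sits at roughly
// Z = fPx * R / rPx.  Small-angle approximation only.
double ballDistance(double fPx, double R, double rPx) {
    return fPx * R / rPx;
}
```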

Francesco Callari

The direction from @Micka was very helpful. Per that advice, my current algorithm projects the 2D image onto a 3D hemisphere by constructing rays from the camera to a hemisphere representing the golf ball (using the known radius of the ball). Then I use the following transforms to rotate that hemisphere into dozens of candidate "poses" of various x, y, and z rotation combinations, setting flags to ignore anything that rotates around from behind the ball, where the camera has no data:

    // X-axis rotation
    if (rotatingOnX_) {
        // Save Y so both Y and Z are computed from the pre-rotation values
        double tmpImageYFromCenter = imageYFromCenter;
        imageYFromCenter = (imageYFromCenter * cosX_) - (imageZ * sinX_);
        // Keep Z as a double: truncating to int between the axis rotations
        // would throw away sub-pixel precision
        imageZ = (tmpImageYFromCenter * sinX_) + (imageZ * cosX_);
    }

    // Y-axis rotation
    if (rotatingOnY_) {
        double tmpImageXFromCenter = imageXFromCenter;
        imageXFromCenter = (imageXFromCenter * cosY_) + (imageZ * sinY_);
        imageZ = (imageZ * cosY_) - (tmpImageXFromCenter * sinY_);
    }

    // Z-axis rotation
    if (rotatingOnZ_) {
        double tmpImageXFromCenter = imageXFromCenter;
        imageXFromCenter = (imageXFromCenter * cosZ_) - (imageYFromCenter * sinZ_);
        imageYFromCenter = (tmpImageXFromCenter * sinZ_) + (imageYFromCenter * cosZ_);
    }

Next, I “unproject” the rotated 3D points back to the 2D image the camera would see if it were looking at the shifted hemisphere.

The main problem here is the same one that some of the OpenCV warpers try to solve. The rotated 3D image has gaps where there’s no data (near the edges of the original, unrotated sphere image), which leaves gaps in the final 2D unprojected image post-rotation. I’m thinking about doing some interpolation to fill some of those gaps in, but they don’t seem to be causing much trouble yet.
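One way to avoid those gaps entirely, per the cv::remap suggestion in the comments, is to compute the warp backwards: visit every destination pixel, back-project it onto the sphere, apply the inverse rotation, and pull from the source image, so every destination pixel receives a value. A self-contained sketch under that assumption (grayscale, nearest-neighbor lookup, flat row-major buffer; not the code actually used above):

```cpp
#include <cmath>
#include <vector>

// Gap-free rotation of a ball image via inverse mapping.
// rot: 3x3 rotation matrix, row-major; the inverse of a rotation
// is its transpose, which is what gets applied below.
std::vector<unsigned char> rotateBallImage(
        const std::vector<unsigned char>& img, int w, int h,
        double cx, double cy, double rPx, double R, const double rot[9]) {
    std::vector<unsigned char> out(img.size(), 0);
    double scale = R / rPx;
    for (int py = 0; py < h; ++py) {
        for (int px = 0; px < w; ++px) {
            double x = (px - cx) * scale;
            double y = (cy - py) * scale;
            double rho2 = x * x + y * y;
            if (rho2 > R * R) continue;          // outside the ball
            double z = std::sqrt(R * R - rho2);
            // Transposed (inverse) rotation gives the source point
            double sx = rot[0] * x + rot[3] * y + rot[6] * z;
            double sy = rot[1] * x + rot[4] * y + rot[7] * z;
            double sz = rot[2] * x + rot[5] * y + rot[8] * z;
            if (sz < 0.0) continue;              // source is on the far side
            int spx = (int)std::lround(cx + sx / scale);
            int spy = (int)std::lround(cy - sy / scale);
            if (spx >= 0 && spx < w && spy >= 0 && spy < h)
                out[py * w + px] = img[spy * w + spx];
        }
    }
    return out;
}
```

Because every destination pixel is looked up (rather than source pixels being scattered forward), no holes appear, at the cost of leaving pixels black where the source point lies on the far hemisphere.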

So, here’s the results. First, I start with a picture of a golf ball like this:

Photo of un-rotated ball

Then, identify and isolate the ball (see image below, initial ball on top-left) and perform a multi-angle Gabor filter on each of the two images to pick up the edges and threshold the results (right column):

First and second ball images

After computing several hundred candidate images of the first ball rotated in 3D (and un-projecting each back to 2D to compare with the second image), the best-matched candidate’s angles (in degrees) were:

[2023-05-21 11:24:28.936943] (0x0000c5a4) [debug] Best Rotation Candidate was #13470 - Rot: (19, 5, 28) <=-- those are the X, Y and Z angles.
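The similarity metric behind "best-matched" isn't shown here; as one plausible stand-in, the candidates could be scored against the second thresholded image by intersection-over-union of the "on" pixels (illustrative only, not the metric actually used):

```cpp
#include <vector>
#include <cstddef>

// Fraction of "on" pixels two binary dimple masks agree on (IoU).
double iouScore(const std::vector<unsigned char>& a,
                const std::vector<unsigned char>& b) {
    std::size_t inter = 0, uni = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        bool pa = a[i] != 0, pb = b[i] != 0;
        if (pa && pb) ++inter;
        if (pa || pb) ++uni;
    }
    return uni ? (double)inter / (double)uni : 1.0;
}

// Index of the candidate image scoring highest against the target.
std::size_t bestCandidate(const std::vector<std::vector<unsigned char>>& cands,
                          const std::vector<unsigned char>& target) {
    std::size_t best = 0;
    double bestScore = -1.0;
    for (std::size_t i = 0; i < cands.size(); ++i) {
        double s = iouScore(cands[i], target);
        if (s > bestScore) { bestScore = s; best = i; }
    }
    return best;
}
```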

The best-fitting rotation candidate image (with some unwanted data ignored) was:

Best rotation candidate

Finally, I used the 3 calculated Euler angles to rotate the original ball image to see how it compares to the second golf ball image. The results are far from exact, but I think it’s not bad for a first try; see below:

Initial ball rotated by estimated angles

Note that some of the area “behind” the hemisphere is rotated around and just shows up as black in the simulated image. A second camera that could “see” some of the backside of the sphere might be helpful here.

Another question was how this approach would perform for relatively large angular displacements without any markings on the ball (one commenter was pretty pessimistic). It turns out the algorithm handles this pretty well! For example, the images below are the first picture, the second picture, and the first picture rotated by the estimated angles:

Large rotation results

[2023-05-21 14:13:58.153071] (0x0000c8a4) [debug] Best Rotation Candidate was #178 - Rot: (-2, 2, 54)

If I measure just the Z-axis rotation angle of the images shown above and ignore any X and Y rotations, the measured angle (via an on-screen protractor, see below) is about 52 degrees compared to the calculated 54 degrees.

Manually measuring ball rotation Z-angle

Still work to do, but I’m optimistic that this will ultimately converge to a decent solution with some more work. Thanks for the help, all!

jpilgrim