
I'm struggling to get 2D camera-space coordinates from my ArUco markers' corners and am only getting implausibly large values. I'm working with OpenCV and Python.

What I have done so far:

  1. Extracting the 2D corner coordinates of my ArUco markers in their respective marker space with aruco.detectMarkers
  2. Getting the rotation vector and translation vector of the current marker with aruco.estimatePoseSingleMarkers
  3. Using the Python solution from this question to get the 3D corner coordinates
  4. Using cv2.projectPoints on the 3D corner coordinates to obtain the 2D projection (for projectPoints I pass the rotation and translation vector of the current marker)

What I get as a result are values like this:

[[-1333579392 -2147483648]
 [-2147483648 -2147483648]
 [-2147483648 -2147483648]
 [-1272630656 -2147483648]]

...which seem far too large to me. I was hoping to draw those points on the webcam livestream, but they do not show up (most probably because they are wrong). In some frames I suddenly also get values that look more reasonable:

[[15526  6153]
 [11828  4397]
 [10722  3485]
 [13921  4835]]

...but still no plotted points on my screen. It is also odd that these values differ so much from the other ones; I really don't know why.

Am I missing some scaling after step 3, or did I misunderstand something about the coordinate transformation? I couldn't really find anything on this problem. Any pointers in the right direction will be greatly appreciated! Excuse me if there are any obvious errors; this is my first project with both Python and OpenCV.

Here is a runnable example that reproduces the problem:

import numpy as np
import cv2

# Rotate a marker's corners by rvec and translate them by tvec, given the marker size.
# In marker space the 4 corners lie at (x, y) = (+- markersize/2, +- markersize/2).
# Returns the corners transformed into camera space and the rotation matrix.
def rotate_marker_corners(rvec, markersize, tvec = None):

    mhalf = markersize / 2.0
    # convert the rotation vector to a rotation matrix; both map marker space -> camera space
    mrv, jacobian = cv2.Rodrigues(rvec)

    # in marker space the corners all lie in the xy-plane, so z is zero at first
    X = mhalf * mrv[:,0] # the rotated x-axis, scaled to mhalf
    Y = mhalf * mrv[:,1] # the rotated y-axis, scaled to mhalf
    minusX = X * (-1)
    minusY = Y * (-1)

    # calculate 4 corners of the marker in camworld. corners are enumerated clockwise
    markercorners = []
    markercorners.append(minusX + Y) #was upper left in markerworld
    markercorners.append(X + Y) #was upper right in markerworld
    markercorners.append(X + minusY) #was lower right in markerworld
    markercorners.append(minusX + minusY) #was lower left in markerworld
    # if tvec given, move all by tvec
    if tvec is not None:
        C = tvec #center of marker in camworld
        for i, mc in enumerate(markercorners):
            markercorners[i] = C + mc #add tvec to each corner

    markercorners = np.array(markercorners,dtype=np.float32) # type needed when used as input to cv2
    return markercorners, mrv

def loadCoefficients():
    cv_file = cv2.FileStorage("calibrationCoefficients.yaml", cv2.FILE_STORAGE_READ)
    camera_matrix = cv_file.getNode("camera_matrix").mat()
    dist_matrix = cv_file.getNode("dist_coeff").mat()

    cv_file.release()
    return [camera_matrix, dist_matrix]

# load the ArUCo dictionary and grab the ArUCo parameters
arucoDict = cv2.aruco.Dictionary_get(cv2.aruco.DICT_6X6_50)
arucoParams = cv2.aruco.DetectorParameters_create()

# load the camera calibration once
mtx, dist = loadCoefficients()

# initialize video stream
cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)

# loop over the frames from the video stream
while True:
    # grab the next frame from the video stream
    res, frame = cap.read()
    # detect ArUco markers in the input frame
    (corners, ids, rejected) = cv2.aruco.detectMarkers(frame, arucoDict, parameters=arucoParams) #returns relative position of camera in marker's world
    # verify *at least* one ArUco marker was detected
    if len(corners) > 0:
        # flatten the ArUco IDs list
        ids = ids.flatten()

        for i in range(0, len(ids)):
            # draw detected markers on screen
            cv2.aruco.drawDetectedMarkers(frame, corners, ids)
            #extract rotation vector and translation vector for each marker, 0.012 being my marker length in meters
            rvec, tvec, _ =  cv2.aruco.estimatePoseSingleMarkers(corners, 0.012, mtx, dist)
            # draw axes onto marker for checking pose estimation
            cv2.drawFrameAxes(frame, mtx, dist, rvec[i], tvec[i], 0.012)
            # compute 3D coordinates of marker's corners in camera space
            cornerCoordinates, _ = rotate_marker_corners(rvec[i], 0.012, tvec[i])
            # project the 3D corners to 2D image coordinates
            reducedDimensionsto2D, _ = cv2.projectPoints(cornerCoordinates, rvec[i], tvec[i], mtx, dist)
            reducedDimensionsto2D = np.int32(reducedDimensionsto2D).reshape(-1, 2) # cast to int and reshape to (4, 2) for readability
            print(reducedDimensionsto2D) # debugging
            # Draw square of projected points on the livestream for debugging
            cv2.polylines(frame, [np.int32(reducedDimensionsto2D)], True, (255, 0, 0), 3)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break
    
cv2.destroyAllWindows()

And these are my camera calibration values, to be saved as calibrationCoefficients.yaml:

%YAML:1.0
---
    camera_matrix: !!opencv-matrix
        rows: 3
        cols: 3
        dt: d
        data: [ 1.5026752406242376e+03, 0., 9.2256504451330136e+02, 0.,
            1.5113357377940451e+03, 4.4730181258850007e+02, 0., 0., 1. ]
    dist_coeff: !!opencv-matrix
        rows: 1
        cols: 5
        dt: d
        data: [ 1.3350912805940560e-01, -4.5402165375909015e-01,
            -2.3676773666038882e-02, -7.3028961943791158e-03,
            8.2049803330142934e-01 ]
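
In case it helps with reproducing this, the file can also be written programmatically; a small sketch with cv2.FileStorage, using the same values as above, so that loadCoefficients() finds it:

import numpy as np
import cv2

# write the calibration above to calibrationCoefficients.yaml
camera_matrix = np.array([[1502.6752406242376, 0., 922.56504451330136],
                          [0., 1511.3357377940451, 447.30181258850007],
                          [0., 0., 1.]])
dist_coeff = np.array([[0.1335091280594056, -0.45402165375909015,
                        -0.023676773666038882, -0.0073028961943791158,
                        0.82049803330142934]])

cv_file = cv2.FileStorage("calibrationCoefficients.yaml", cv2.FILE_STORAGE_WRITE)
cv_file.write("camera_matrix", camera_matrix)
cv_file.write("dist_coeff", dist_coeff)
cv_file.release()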

These are my markers to test with (I cut them out along the border around each of them): Test marker sheet

  • don't use `np.add`. that makes the code *unreadable*. just use `+`. – Christoph Rackwitz Aug 09 '22 at 15:20
  • I corrected that now @ChristopRackwitz. To be honest basically copied that part from the linked answer... oops, my bad. – laury Aug 09 '22 at 15:34
  • code in random stack overflow answers is often junk, even if they've got upvotes and lots of rep, and in questions even more frequently. lots of novices write answers. take everything with a mountain of salt.... and if you copy someone's code and then you have issues with it, be prepared to be *held accountable* for that code because now it's yours, you adopted it. that also means you are responsible for debugging and understanding it. just saying... if you have questions, best to ask the author of the code before you ask anyone else. – Christoph Rackwitz Aug 09 '22 at 15:35
  • oh, and markers require a "quiet zone" (white border) around them, or else they're undetectable. that picture you link there won't do. – Christoph Rackwitz Aug 09 '22 at 15:37
  • as for the question... `detectMarkers` already gives you screen-space coordinates. going the detour through pose estimation, creating 3d points, transforming with rvec and tvec, and `projectPoints`, _should_ give you more or less the same result... those large integer numbers shouldn't happen and I don't see where you get them from. – Christoph Rackwitz Aug 09 '22 at 15:48
  • Yes sure, I recognize that @ChristophRackwitz. Will be looking into that part of code again, but I'm struggling to wrap my head around the maths stuff. That's why I asked for help. I uploaded my test marker sheet instead of the marker image now, I of course had added a border around it :-) – laury Aug 09 '22 at 15:50
  • ...and now that you point that out - I might just be overcomplicating things by trying the way my professor suggested, you're right... -.- Might just stick with my own idea, that apparently made more sense and will save me from this issue. – laury Aug 09 '22 at 15:56
  • we can get this working. I just don't see what's the issue with that code, besides it doing some weird stuff that may not even be needed. projectPoints takes an rvec and tvec, so it transforms your "object-frame" points to camera-frame before projecting. that code you got there tries to do it explicitly. -- perhaps scratch that entire code and just state the goal. – Christoph Rackwitz Aug 09 '22 at 16:21
  • Ok, so the ultimate goal of my project would be to _detect colors around_ the ArUco marker (I have the marker in the center of a Rubik's cube's face and want to dynamically track all the colors on that face). I was already able to draw a polygon around my cube's side by basically "extending" the tracked marker's border. That worked well with `projectPoints`, so maybe I could use those coordinates from that polygon contour @ChristophRackwitz? – laury Aug 09 '22 at 16:43
  • ic. so... define some points that would sit in the middle of the eight surrounding squares, relative to one marker (literally `(x, y, 0)` points in whatever is your base unit, mm or something), project them, and sample the picture at those points. that's doable. just define the points in the marker's coordinate system, and use projectPoints, giving it the rvec and tvec from pose estimation. doesn't require any math. – Christoph Rackwitz Aug 09 '22 at 17:30
  • Actually just made that work, because I had the same idea after writing my comment! I projected the points as dots on my livestream to check, and it works like a charm. Super happy right now, I will now use the extracted coordinates for color detection. Almost always good to stick to the simple ideas, should have done that from the beginning :-) Thanks anyway @ChristophRackwitz! Maybe I just needed that thought input to stick to my initial idea. – laury Aug 09 '22 at 17:35
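
Update: for anyone finding this later, here is roughly what I ended up doing, following the last comments. The helper name and the sticker spacing (0.019 m) are placeholders, not my exact values:

import numpy as np
import cv2

# Define 3D points in the marker's own coordinate system (z = 0) at the centers
# of the eight squares surrounding the center marker of a cube face, then
# project them with the rvec/tvec from estimatePoseSingleMarkers.
def project_sticker_centers(rvec, tvec, camera_matrix, dist_coeffs, sticker_pitch=0.019):
    # offsets of the eight surrounding squares, in marker coordinates (meters)
    offsets = [(-1, -1), (0, -1), (1, -1),
               (-1,  0),          (1,  0),
               (-1,  1), (0,  1), (1,  1)]
    object_points = np.array([[x * sticker_pitch, y * sticker_pitch, 0.0]
                              for (x, y) in offsets], dtype=np.float32)
    image_points, _ = cv2.projectPoints(object_points, rvec, tvec, camera_matrix, dist_coeffs)
    return image_points.reshape(-1, 2)

# inside the main loop, after estimatePoseSingleMarkers:
# pts = project_sticker_centers(rvec[i], tvec[i], mtx, dist)
# for (u, v) in np.round(pts).astype(int):
#     cv2.circle(frame, (int(u), int(v)), 4, (0, 255, 0), -1)  # draw the sample point
#     b, g, r = frame[v, u]                                    # sample the color there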
