
We are working on an AR application in which we need to overlay a 3D model of an object on a video stream of the same object. A Unity scene contains the 3D model, and a camera films the real object. The camera pose is initially unknown.

What we have tried

We did not find a good solution to estimate the camera pose directly in Unity. We therefore used OpenCV, which provides an extensive library of computer vision functions. In particular, we locate ArUco tags and then pass the matching 3D-2D point correspondences to solvePnP.
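For context, a minimal sketch of this step. The marker layout, marker size, intrinsics and the synthetic test image are hypothetical and only serve to make the sketch self-contained; the ArUco calls shown are the pre-4.7 OpenCV API.

    import cv2
    import numpy as np

    aruco_dict = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_50)

    # Synthetic test frame: one 210x210 px marker pasted onto a white image.
    frame = np.full((480, 640), 255, dtype=np.uint8)
    frame[135:345, 215:425] = cv2.aruco.drawMarker(aruco_dict, 0, 210)

    corners, ids, _ = cv2.aruco.detectMarkers(frame, aruco_dict)

    # Known 3D corner coordinates of the tag (here a hypothetical 5 cm marker
    # lying in the z=0 plane, corner order matching detectMarkers: TL, TR, BR, BL).
    s = 0.05
    object_points = np.array([[0, 0, 0], [s, 0, 0], [s, s, 0], [0, s, 0]], dtype=np.float32)
    image_points = corners[0].reshape(-1, 2).astype(np.float32)

    # Made-up intrinsics; in the real setup these come from calibration.
    camera_matrix = np.array([[600, 0, 320], [0, 600, 240], [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros(5)

    ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)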

solvePnP returns a camera pose that is consistent with reality to within a few centimeters. We also verified the reprojection error, which is low.
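The reprojection check itself is straightforward; continuing the sketch above (reusing object_points, image_points, rvec, tvec and the hypothetical intrinsics):

    # Project the 3D corners with the estimated pose and compare them to the
    # detected 2D corners.
    projected, _ = cv2.projectPoints(object_points, rvec, tvec, camera_matrix, dist_coeffs)
    mean_err = np.linalg.norm(projected.reshape(-1, 2) - image_points, axis=1).mean()
    print("mean reprojection error (px):", mean_err)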

Reprojection of ArUco tags

Each tag corner used for the pose estimation is reprojected and shown as a red point on the image. As you can see, the difference is minimal.

These results look decent and should be sufficient for our use case. We therefore consider the camera pose validated against reality on the OpenCV side.

The problem

However, when we place the camera at the estimated pose in the Unity scene, the 3D objects do not line up well with the video.

Unity view of the markers and the video stream

In this Unity screenshot, you can see that the virtual green tags (Unity objects) do not match the real ones in the video feed.

Possible root causes

We identified several possible root causes that could explain the mismatch between Unity and OpenCV:

  • Differences in camera intrinsic parameters: we tried different sets of parameters, none with complete success. We first calibrated the camera with OpenCV and tried to carry the parameters over to Unity (see the conversion sketch after this list). We also looked at the manufacturer data, but it didn't give better results. Lastly, we manually measured the field of view (FoV) and combined it with the known camera sensor size. Results didn't differ much between these tests.
  • Differences in camera model between Unity and OpenCV: OpenCV works with a pinhole camera model, but we were not able to find a conclusive answer on which model Unity simulates.
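For reference, a minimal sketch of one way OpenCV intrinsics can be mapped to Unity camera parameters. The focal-length-to-FoV relation is standard; the layout of the off-center projection matrix and the signs of its principal-point terms are assumptions that depend on the image-origin convention (OpenCV: top-left, Unity/OpenGL: bottom-left) and may need flipping.

    import numpy as np

    def opencv_intrinsics_to_unity(fx, fy, cx, cy, w, h, near=0.1, far=100.0):
        # Vertical field of view in degrees, usable as Camera.fieldOfView.
        fov_y = 2.0 * np.degrees(np.arctan(h / (2.0 * fy)))

        # Off-center projection matrix that could be assigned to
        # Camera.projectionMatrix; the [0,2] and [1,2] signs depend on the
        # image-origin convention and may need flipping.
        proj = np.array([
            [2.0 * fx / w, 0.0,          1.0 - 2.0 * cx / w,           0.0],
            [0.0,          2.0 * fy / h, 2.0 * cy / h - 1.0,           0.0],
            [0.0,          0.0,          -(far + near) / (far - near), -2.0 * far * near / (far - near)],
            [0.0,          0.0,          -1.0,                         0.0],
        ])
        return fov_y, proj

    # Hypothetical numbers, for illustration only.
    fov_y, proj = opencv_intrinsics_to_unity(600.0, 600.0, 640.0, 360.0, 1280, 720)
    print(fov_y)
    print(proj)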

Notes

Our camera has a large field of view (115°).

Both the image passed to OpenCV and the one displayed in Unity are already well undistorted.
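For reference, the undistortion step we mean looks roughly like this (hypothetical calibration values and a blank stand-in frame); the new camera matrix returned by getOptimalNewCameraMatrix is the one that describes the undistorted image.

    import cv2
    import numpy as np

    # Hypothetical calibration values, for illustration only.
    K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
    dist = np.array([-0.30, 0.08, 0.0, 0.0, 0.0])
    frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a camera frame
    h, w = frame.shape[:2]

    # new_K is the matrix that matches the undistorted image.
    new_K, roi = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), 0)
    undistorted = cv2.undistort(frame, K, dist, None, new_K)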

We went through most of the SO questions tagged OpenCV and Unity. Most were concerned with the different coordinate systems and rotation conventions. This doesn't seem to be the problem in our case, as the camera is shown at its expected location in the 3D Unity scene (a sketch of that conversion is shown below for completeness).
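For completeness, a sketch of the OpenCV-to-Unity pose conversion in question; the diag(1, -1, 1) axis flip is an assumption that depends on how the marker coordinates were defined.

    import cv2
    import numpy as np

    def solvepnp_pose_to_unity(rvec, tvec):
        rvec = np.asarray(rvec, dtype=np.float64).reshape(3, 1)
        tvec = np.asarray(tvec, dtype=np.float64).reshape(3, 1)

        # solvePnP returns the world-to-camera transform in OpenCV's
        # right-handed, y-down convention.
        R, _ = cv2.Rodrigues(rvec)

        cam_pos_cv = (-R.T @ tvec).ravel()  # camera position in OpenCV world coords
        cam_rot_cv = R.T                    # camera-to-world rotation

        # Flip the y-axis to move to a Unity-style left-handed, y-up frame
        # (assumption: the marker coordinates were defined with y pointing down).
        F = np.diag([1.0, -1.0, 1.0])
        cam_pos_unity = F @ cam_pos_cv
        cam_rot_unity = F @ cam_rot_cv @ F
        return cam_pos_unity, cam_rot_unity

    # Hypothetical pose, for illustration only; in Unity the rotation matrix
    # would still have to be converted to a Quaternion.
    pos, rot = solvepnp_pose_to_unity(np.array([0.1, -0.2, 0.05]), np.array([0.0, 0.1, 1.5]))
    print(pos)
    print(rot)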

Questions

  1. Is there any fundamental difference in the camera model used by Unity and OpenCV?
  2. Do you see any other possible causes that could explain the difference in projection between Unity and OpenCV?
  3. Do you know of any reliable way to estimate the camera pose without OpenCV?
    camera intrinsics were my first guess. Did you perform a distortion correction? This is very important for wide-fov cameras. After computing the intrinsics and distortion parameters, you'll have to undistort the image – Micka Dec 17 '19 at 14:12
  • Yes, we tried two different distortion corrections. The first one used data from an OpenCV calibration round of 30 images; results were marginally worse. The images in the question are undistorted with a UV map created to straighten lines from a chessboard filmed so that it fills the whole image. The resulting undistorted image is visually pleasing and has the expected characteristics (parallel lines, etc.). I doubt that the error is coming from the undistortion but I'll give it another look! – vwvw Dec 17 '19 at 15:33
  • Did you compute the intrinsics (K matrix) and run solvePnP after distortion correction (assuming that the undistorted image is your actual camera image)? – Micka Dec 17 '19 at 17:05
  • We performed both. First we used the distorted image (which resulted in higher values in the distortion coefficients). The current state of the project is to work with the undistorted image and therefore low distortion coefficients. Is there any possible blunder if we use the undistorted image? – vwvw Dec 18 '19 at 09:27
  • In my experience, this is due to the focal length difference between the OpenCV calculation and the value used in Unity. You can try actively changing the values of fx and fy in the camera intrinsics when calculating the reprojection matrix, and reprojecting the Unity 3D object to observe the direction of change. – yapws87 Dec 31 '19 at 15:01
  • I'm having the same problem now, did you find a solution? – xXNukem KS Jan 11 '21 at 14:40

0 Answers