I'm estimating the translation and rotation of a single camera using the following code.
E, mask = cv2.findEssentialMat(k1, k2,
focal = SCALE_FACTOR * 2868
pp = (1920/2 * SCALE_FACTOR, 1080/2 * SCALE_FACTOR),
method = cv2.RANSAC,
prob = 0.999,
threshold = 1.0)
points, R, t, mask = cv2.recoverPose(E, k1, k2)
where k1
and k2
are my matching set of key points, which are Nx2 matrices where the first column is the x-coordinates and the second column is y-coordinates.
I collect all the translations over several frames and generate a path that the camera traveled like this.
def generate_path(rotations, translations):
path = []
current_point = np.array([0, 0, 0])
for R, t in zip(rotations, translations):
path.append(current_point)
# don't care about rotation of a single point
current_point = current_point + t.reshape((3,)
return np.array(path)
So, I have a few issues with this.
- The OpenCV camera coordinate system suggests that if I want to view the 2D "top down" view of the camera's path, I should plot the translations along the X-Z plane.
plt.plot(path[:,0], path[:,2])
This is completely wrong.
However, if I write this instead
plt.plot(path[:,0], path[:,1])
I get the following (after doing some averaging)
This path is basically perfect.
So, perhaps I am misunderstanding the coordinate system convention used by cv2.recoverPose
? Why should the "birds eye view" of the camera path be along the XY plane and not the XZ plane?
- Another, perhaps unrelated issue is that the reported Z-translation appears to decrease linearly, which doesn't really make sense.
I'm pretty sure there's a bug in my code since these issues appear systematic - but I wanted to make sure my understanding of the coordinate system was correct so I can restrict the search space for debugging.