Suppose `img0` contains the target you want to locate, and `img1`, `img2`, `img3` have the target at `(x1, y1)`, `(x2, y2)`, `(x3, y3)` respectively. If you know the homographies between all the image pairs (say, `H1`, `H2`, and `H3` are the homographies from `img1`, `img2`, and `img3` to `img0`), then you can simply multiply those pixel locations by the homographies to obtain the estimated coordinates in `img0`. You can't triangulate exactly, but you can warp the point from each of the three images through its homography, then average (or otherwise combine) the results, and that should give you a good estimate, as long as your homographies are accurate enough.
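As a quick sketch of the combining step (the numbers here are the per-homography estimates reported later in this answer), averaging the warped points might look like this; a median is a slightly more outlier-robust alternative if one homography is suspect:

```python
import numpy as np

# per-homography estimates of the target in img0 (values from the results below)
estimates = np.array([[527.15670903, 222.57196904],
                      [527.21339222, 221.86819147],
                      [527.63122722, 222.30614892]])

avg = estimates.mean(axis=0)        # simple average of the three estimates
med = np.median(estimates, axis=0)  # more robust if one homography is poor

print(avg)  # -> [527.33377616 222.24876981]
```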
Homography matrices are 3x3 and your points are 2-vectors. To apply the transformation (i.e., to multiply the matrix and the vector), you need homogeneous coordinates, where a point takes the form `(x, y, 1)`. Then, to get the location in `img0` of a pixel from `img1`, the multiplication looks like:
[s*x0]        [x1]   [h00 h01 h02]   [x1]
[s*y0] = H1 * [y1] = [h10 h11 h12] * [y1]
[s   ]        [ 1]   [h20 h21 h22]   [ 1]
Your output will be a homogeneous vector carrying a scale factor `s`; divide by `s` to obtain your final coordinates `(x0, y0)`. Simply do this for the three target locations you do have, using the corresponding homographies, and you'll end up with three estimated positions that you can then average.
Full example.
This is using some images that are provided with ground truth data, available here.
Here are my results. The location of the cross in the middle of the chicken's eye is estimated in the first image from its known locations in the other three images and the ground-truth homographies provided with the dataset. I print the estimate from each homography, their average, and the rounded pixel values, and it turns out that the rounded estimate is exact (because the homographies are very accurate).
Estimations:
2 -> 1: [ 527.15670903 222.57196904]
3 -> 1: [ 527.21339222 221.86819147]
4 -> 1: [ 527.63122722 222.30614892]
Avg loc: [ 527.33377616 222.24876981]
Est loc: [527 222]
True loc: [527 222]
Here's the full coded example; just download the images from that dataset, drop this script into the same folder, and run it.
import numpy as np
import cv2
# read images, taken from http://kahlan.eps.surrey.ac.uk/featurespace/web/
img1 = cv2.imread("img1.png", 1)
img2 = cv2.imread("img2.png", 1)
img3 = cv2.imread("img3.png", 1)
img4 = cv2.imread("img4.png", 1)
# true locations of the chicken's crossed eye (labeled myself)
loc1 = np.array([527, 222, 1])
loc2 = np.array([449, 241, 1])
loc3 = np.array([476, 275, 1])
loc4 = np.array([385, 236, 1])
# define ground truth homographies, also from http://kahlan.eps.surrey.ac.uk/featurespace/web/
H12 = np.array([
[8.7976964e-01, 3.1245438e-01, -3.9430589e+01],
[-1.8389418e-01, 9.3847198e-01, 1.5315784e+02],
[1.9641425e-04, -1.6015275e-05, 1.0000000e+00]])
H13 = np.array([
[7.6285898e-01, -2.9922929e-01, 2.2567123e+02],
[3.3443473e-01, 1.0143901e+00, -7.6999973e+01],
[3.4663091e-04, -1.4364524e-05, 1.0000000e+00]])
H14 = np.array([
[6.6378505e-01, 6.8003334e-01, -3.1230335e+01],
[-1.4495500e-01, 9.7128304e-01, 1.4877420e+02],
[4.2518504e-04, -1.3930359e-05, 1.0000000e+00]])
# need the homographies going the other direction
H21 = np.linalg.inv(H12)
H31 = np.linalg.inv(H13)
H41 = np.linalg.inv(H14)
# normalize so the bottom-right entry is 1
H21 = H21/H21[-1,-1]
H31 = H31/H31[-1,-1]
H41 = H41/H41[-1,-1]
# warp the locations loc2, loc3, loc4 to the coordinates of img1
est21 = np.matmul(H21, loc2)
est31 = np.matmul(H31, loc3)
est41 = np.matmul(H41, loc4)
# divide by the scale factor and drop it
est21 = est21[:-1]/est21[-1]
est31 = est31[:-1]/est31[-1]
est41 = est41[:-1]/est41[-1]
# average the three estimates
avgest = (est21 + est31 + est41)/3
estloc = np.around(avgest).astype(int)
# output
print("Estimations:"
"\n2 -> 1: ", est21,
"\n3 -> 1: ", est31,
"\n4 -> 1: ", est41,
"\nAvg loc: ", avgest,
"\nEst loc: ", estloc,
"\nTrue loc: ", loc1[:-1])
# show images
cv2.circle(img1, (estloc[0], estloc[1]), 2, (0,0,255), -1) # filled
cv2.circle(img1, (estloc[0], estloc[1]), 20, (255,255,255)) # outline
cv2.imshow('img1-est', img1)
cv2.waitKey(0)
cv2.circle(img2, (loc2[0], loc2[1]), 2, (0,0,255), -1) # filled
cv2.circle(img2, (loc2[0], loc2[1]), 20, (255,255,255)) # outline
cv2.imshow('img2-loc', img2)
cv2.waitKey(0)
cv2.circle(img3, (loc3[0], loc3[1]), 2, (0,0,255), -1) # filled
cv2.circle(img3, (loc3[0], loc3[1]), 20, (255,255,255)) # outline
cv2.imshow('img3-loc', img3)
cv2.waitKey(0)
cv2.circle(img4, (loc4[0], loc4[1]), 2, (0,0,255), -1) # filled
cv2.circle(img4, (loc4[0], loc4[1]), 20, (255,255,255)) # outline
cv2.imshow('img4-loc', img4)
cv2.waitKey(0)
cv2.destroyAllWindows()