Suppose `img0` contains the target you want to locate, and `img1`, `img2`, `img3` have the target at `(x1, y1)`, `(x2, y2)`, `(x3, y3)` respectively. If you know the homographies between all the image pairs (say, `H1`, `H2`, and `H3` are the homographies from `img1`, `img2`, and `img3` to `img0`), then you can simply multiply those pixel locations by the homographies to obtain the estimated coordinates in `img0`. You can't triangulate exactly, but you can warp the point from each of the three images through its homography, then average (or otherwise combine) the results, and that should give you a good estimate, as long as your homographies are accurate enough.
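As a quick sketch of the combining step (the numbers here are the per-homography estimates reported later in this answer), averaging the warped points might look like this; a median is a slightly more outlier-robust alternative if one homography is suspect:

```python
import numpy as np

# per-homography estimates of the target in img0 (values from the results below)
estimates = np.array([[527.15670903, 222.57196904],
                      [527.21339222, 221.86819147],
                      [527.63122722, 222.30614892]])

avg = estimates.mean(axis=0)        # simple average of the three estimates
med = np.median(estimates, axis=0)  # more robust if one homography is poor

print(avg)  # -> [527.33377616 222.24876981]
```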
Homography matrices are 3x3 and your points are 2-vectors. To apply the transformation (i.e., to multiply the matrix and the vector), you need homogeneous coordinates, where a point takes the form `(x, y, 1)`. Then, to get the location in `img0` of a pixel from `img1`, the multiplication looks like:
[s*x0]        [x1]   [h00 h01 h02]   [x1]
[s*y0] = H1 * [y1] = [h10 h11 h12] * [y1]
[s   ]        [ 1]   [h20 h21 h22]   [ 1]
Your output will be a homogeneous vector carrying a scale factor `s`; divide by `s` to obtain your final coordinates `(x0, y0)`. Simply do this for the three target locations you do have, using the corresponding homographies, and you'll end up with three estimated positions that you can then average.
Full example.
This is using some images that are provided with ground truth data, available here.
Here are my results. The location of the cross in the middle of the chicken's eye is estimated in the first image from its known locations in the other three images and the ground-truth homographies provided with the dataset. I print the estimate from each homography, their average, and the rounded pixel values, and it turns out that the rounded estimate is exact (because the homographies are very accurate).
Estimations:
2 -> 1: [ 527.15670903 222.57196904]
3 -> 1: [ 527.21339222 221.86819147]
4 -> 1: [ 527.63122722 222.30614892]
Avg loc: [ 527.33377616 222.24876981]
Est loc: [527 222]
True loc: [527 222]
Here's the full coded example; just download the images from that dataset, drop this script into the same folder, and run it.
import numpy as np
import cv2
# read images, taken from http://kahlan.eps.surrey.ac.uk/featurespace/web/
img1 = cv2.imread("img1.png", 1)
img2 = cv2.imread("img2.png", 1)
img3 = cv2.imread("img3.png", 1)
img4 = cv2.imread("img4.png", 1)
# true locations of the chicken's crossed eye (labeled myself)
loc1 = np.array([527, 222, 1])
loc2 = np.array([449, 241, 1])
loc3 = np.array([476, 275, 1])
loc4 = np.array([385, 236, 1])
# define ground truth homographies, also from http://kahlan.eps.surrey.ac.uk/featurespace/web/
H12 = np.array([
[8.7976964e-01, 3.1245438e-01, -3.9430589e+01],
[-1.8389418e-01, 9.3847198e-01, 1.5315784e+02],
[1.9641425e-04, -1.6015275e-05, 1.0000000e+00]])
H13 = np.array([
[7.6285898e-01, -2.9922929e-01, 2.2567123e+02],
[3.3443473e-01, 1.0143901e+00, -7.6999973e+01],
[3.4663091e-04, -1.4364524e-05, 1.0000000e+00]])
H14 = np.array([
[6.6378505e-01, 6.8003334e-01, -3.1230335e+01],
[-1.4495500e-01, 9.7128304e-01, 1.4877420e+02],
[4.2518504e-04, -1.3930359e-05, 1.0000000e+00]])
# need the homographies going the other direction
H21 = np.linalg.inv(H12)
H31 = np.linalg.inv(H13)
H41 = np.linalg.inv(H14)
# normalize so the bottom-right entry is 1
H21 = H21/H21[-1,-1]
H31 = H31/H31[-1,-1]
H41 = H41/H41[-1,-1]
# warp the locations loc2, loc3, loc4 to the coordinates of img1
est21 = np.matmul(H21, loc2)
est31 = np.matmul(H31, loc3)
est41 = np.matmul(H41, loc4)
# divide by the scale factor and drop it
est21 = est21[:-1]/est21[-1]
est31 = est31[:-1]/est31[-1]
est41 = est41[:-1]/est41[-1]
# average the three estimates
avgest = (est21 + est31 + est41)/3
estloc = np.around(avgest).astype(int)
# output
print("Estimations:"
"\n2 -> 1: ", est21,
"\n3 -> 1: ", est31,
"\n4 -> 1: ", est41,
"\nAvg loc: ", avgest,
"\nEst loc: ", estloc,
"\nTrue loc: ", loc1[:-1])
# show images
cv2.circle(img1, (estloc[0], estloc[1]), 2, (0,0,255), -1) # filled
cv2.circle(img1, (estloc[0], estloc[1]), 20, (255,255,255)) # outline
cv2.imshow('img1-est', img1)
cv2.waitKey(0)
cv2.circle(img2, (loc2[0], loc2[1]), 2, (0,0,255), -1) # filled
cv2.circle(img2, (loc2[0], loc2[1]), 20, (255,255,255)) # outline
cv2.imshow('img2-loc', img2)
cv2.waitKey(0)
cv2.circle(img3, (loc3[0], loc3[1]), 2, (0,0,255), -1) # filled
cv2.circle(img3, (loc3[0], loc3[1]), 20, (255,255,255)) # outline
cv2.imshow('img3-loc', img3)
cv2.waitKey(0)
cv2.circle(img4, (loc4[0], loc4[1]), 2, (0,0,255), -1) # filled
cv2.circle(img4, (loc4[0], loc4[1]), 20, (255,255,255)) # outline
cv2.imshow('img4-loc', img4)
cv2.waitKey(0)
cv2.destroyAllWindows()