
I am trying to find the transformation matrix H so that I can multiply (x, y) pixel coordinates and get the corresponding (x, y) real-world coordinates. Here is my code:

import cv2
import numpy as np
from numpy.linalg import inv
if __name__ == '__main__' :
    D = [159.1, 34.2]
    I = [497.3, 37.5]
    G = [639.3, 479.7]
    A = [0, 478.2]
    # Read source image.
    im_src = cv2.imread('/home/vivek/june_14.png')
    # Four corners of the book in source image
    pts_src = np.array([D, I, G, A])

    # Read destination image.
    im_dst = cv2.imread('/home/vivek/june_14.png')

    # Four corners of the book in destination image.
    print "img1 shape:", im_dst.shape
    scale = 1
    O = [0.0, 0.0]
    X = [134.0*scale, 0]
    Y = [0.0, 184.0*scale]
    P = [134.0*scale, 184.0*scale]
    # lx = 75.5 * scale
    # ly = 154.0 * scale
    pts_dst = np.array([O, X, P, Y])

    # Calculate Homography
    h, status = cv2.findHomography(pts_src, pts_dst)

    print "homography:", h
    print "inv of H:", inv(h)
    print "position of the blob on the ground xy plane:", np.dot(np.dot(h, np.array([[323.0], [120.0], [1.0]])), scale)

    # Warp source image to destination based on homography
    im_out = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))

    # Display images
    cv2.imshow("Source Image", im_src)
    cv2.imshow("Destination Image", im_dst)
    cv2.imshow("Warped Source Image", im_out)
    cv2.imwrite("im_out.jpg", im_out)
    cv2.waitKey(0)

The global x, y values I am getting are way off. Am I doing something wrong somewhere?

Vivek Annem
  • Sorry, but what are the variables `D,I,G,A,O,X,P,Y` supposed to represent? Anyway, where you compute the "real world" `(x,y)` coordinates, you will get *homogeneous* points, which are equivalent up to scale; in other words, they may be scaled and would still be considered the same point. But you need `x,y` points, not scaled ones, so you need to divide by the scale. The three-vector is all scaled by the same amount, so you can use the last entry as the scaling factor. You should do `pts = scale*np.dot(h,np.array([[323.0],[120.0],[1.0]]))` and then `pts = pts/pts[-1]`. – alkasm Jun 16 '17 at 00:34
  • O, X, P, Y are real world points (O: origin; X: 134 inches to the right; P: 134 inches to the right and 184 inches down; Y: 184 inches down) and D, I, G, A are the respective pixel coordinates on the image plane. – Vivek Annem Jun 16 '17 at 00:44
  • I am sorry. I didn't quite get the scaling part. – Vivek Annem Jun 16 '17 at 00:45
  • I think I understand what you meant. Let me check and see if I am getting the right values. – Vivek Annem Jun 16 '17 at 00:53
  • Try out the code real quick and see if it solves your problem; if it does, I'll add a more elaborate answer. Basically, it is not true that `[x', y', 1] = H*[x, y, 1]`; it is `[sx', sy', s] = H*[x, y, 1]`. But you need the homogeneous points to end with a `1` to be equivalent to `x,y` coordinates, so you need to divide the homogeneous coordinates by `s`. – alkasm Jun 16 '17 at 00:54
  • The problem is solved. Thanks a lot, @AlexanderReynolds – Vivek Annem Jun 16 '17 at 01:15
  • I'll add it as an answer. Glad it worked! – alkasm Jun 16 '17 at 03:52

2 Answers


The long answer

Homographies are 3x3 matrices and points are just pairs, 2x1, so there's no way to map these together. Instead, homogeneous coordinates are used, giving 3x1 vectors to multiply. However, homogeneous points can be scaled while representing the same point; that is, in homogeneous coordinates, (kx, ky, k) is the same point as (x, y, 1). From the Wikipedia page on homogeneous coordinates:

Given a point (x, y) on the Euclidean plane, for any non-zero real number Z, the triple (xZ, yZ, Z) is called a set of homogeneous coordinates for the point. By this definition, multiplying the three homogeneous coordinates by a common, non-zero factor gives a new set of homogeneous coordinates for the same point. In particular, (x, y, 1) is such a system of homogeneous coordinates for the point (x, y). For example, the Cartesian point (1, 2) can be represented in homogeneous coordinates as (1, 2, 1) or (2, 4, 2). The original Cartesian coordinates are recovered by dividing the first two positions by the third. Thus unlike Cartesian coordinates, a single point can be represented by infinitely many homogeneous coordinates.

Obviously, in Cartesian coordinates, this scaling does not hold; (x, y) is not the same point as (xZ, yZ) unless Z = 1. So we need a way to map these homogeneous coordinates, which can be represented in infinitely many ways, down to Cartesian coordinates, which can be represented in only one way. Luckily this is very easy: just scale the homogeneous coordinates so that the last number in the triple is 1.

Homographies multiply homogeneous coordinates and return homogeneous coordinates. So in order to map them back to the Cartesian world, you just need to divide by the last coordinate to normalize them and then rip the first two numbers out.
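As a quick NumPy illustration of that normalization (the `to_cartesian` helper below is just for this sketch, not part of any library), the two triples from the quoted example map to the same Cartesian point:

    import numpy as np

    def to_cartesian(p):
        # Divide a homogeneous 3-vector by its last entry and return the (x, y) pair.
        p = np.asarray(p, dtype=float)
        return p[:2] / p[2]

    print(to_cartesian([1.0, 2.0, 1.0]))  # [1. 2.]
    print(to_cartesian([2.0, 4.0, 2.0]))  # [1. 2.], the same Cartesian point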

The short answer

When you multiply homogeneous coordinates by a homography, you need to scale them:

    [sx']       [x]
    [sy'] = H * [y]
    [ s ]       [1]

So to get back to Cartesian coordinates, divide the new homogeneous coordinates by s: (sx', sy', s)/s = (x', y', 1), and then (x', y') is the point you want.
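Applied to the setup in the question, a minimal sketch might look like this (the point lists are copied from the question with scale = 1; the division by s is the step that was missing):

    import cv2
    import numpy as np

    pts_src = np.array([[159.1, 34.2], [497.3, 37.5], [639.3, 479.7], [0.0, 478.2]])
    pts_dst = np.array([[0.0, 0.0], [134.0, 0.0], [134.0, 184.0], [0.0, 184.0]])
    h, status = cv2.findHomography(pts_src, pts_dst)

    # Map one pixel coordinate onto the ground plane.
    pixel = np.array([323.0, 120.0, 1.0])  # homogeneous image point
    world = h.dot(pixel)                   # gives (sx', sy', s)
    world = world / world[2]               # divide by s to get (x', y', 1)
    print(world[:2])                       # (x', y') in inches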

The shorter answer

Use the built-in OpenCV function convertPointsFromHomogeneous() to convert your points from homogeneous 3-vectors to Cartesian 2-vectors.
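A rough usage sketch of that route, assuming the homogeneous results from the homography are stacked into an N x 3 array (the numbers below are placeholders only):

    import cv2
    import numpy as np

    # Each row is one homogeneous point (sx', sy', s); placeholder values.
    points_h = np.array([[670.0, 250.0, 5.0]], dtype=np.float32)   # shape (N, 3)
    points_xy = cv2.convertPointsFromHomogeneous(points_h)         # shape (N, 1, 2)
    print(points_xy.reshape(-1, 2))                                # Cartesian (x', y') rows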

alkasm
    # Convert image (pixel) coordinates to world coordinates.
    # Assumes numpy is imported as np, and that inversehomographymatrix (the
    # inverse of H) and the p.*_buffer_width offsets are defined elsewhere.
    def toworld(x, y):
        imagepoint = [x, y, 1]  # homogeneous pixel coordinate
        worldpoint = np.array(np.dot(inversehomographymatrix, imagepoint))
        scalar = worldpoint[2]  # homogeneous scale factor s
        xworld = int((worldpoint[0]/scalar)*10 + p.x_buffer_width*10)
        yworld = int((worldpoint[1]/scalar)*10 + p.y_buffer_width*10)  # in 10demm
        return xworld, yworld
JEPE
  • Just tried this answer and it helps a lot; a corrected version is `def toworld(x,y): imagepoint = [x, y, 1]; worldpoint = np.array(np.dot(inversehomographymatrix,imagepoint)); scalar = worldpoint[2]; xworld = worldpoint[0]/scalar; yworld = worldpoint[1]/scalar; return xworld, yworld` – CocoaBob Jul 23 '18 at 18:32