I'm trying to write an app that allows users to georeference an image of a map by clicking waypoints on the map, thereby determining pixel coordinates x and y, and then clicking on the same waypoint on a 'real' map, giving a longitude lng and latitude lat.
I'd like to take it as given that both the image and 'real' map are oriented north-south, so that there is no rotation between them, and also that there is a single scale for north-south and east-west. That is:
lng(x) = scale * x + a
lat(y) = -scale * y + b
(The reason for the minus sign is that the y pixel coordinate increases from the top of the image to the bottom, whereas latitude lat increases from south to north.)
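To make that model concrete, here is a minimal sketch of the mapping I have in mind; the values of scale, a and b below are made up purely for illustration, not fitted to anything:
# Illustrative (made-up) parameters for lng(x) = scale*x + a, lat(y) = -scale*y + b
scale = 3.5e-05
a = -119.69
b = 37.77

def pixel_to_lnglat(x, y):
    # Pure scaling plus translation: no rotation, no shear,
    # and a single scale shared by both axes.
    return scale * x + a, -scale * y + b

print(pixel_to_lnglat(610, 1673))  # the lng/lat this example transform assigns to pixel (610, 1673)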
I've adapted the answer to "how to perform coordinates affine transformation using python? part 2" as follows:
import numpy as np
coords = [
{"pixel": {"x": 610, "y": 1673}, "lnglat": {
"lng": -119.66622566136141, "lat": 37.71690293889708}},
{"pixel": {"x": 3616, "y": 948}, "lnglat": {
"lng": -119.55987333997541, "lat": 37.739791632115}},
{"pixel": {"x": 156, "y": 1582}, "lnglat": {
"lng": -119.68242540789811, "lat": 37.719168689576634}},
{"pixel": {"x": 1432, "y": 1079}, "lnglat": {
"lng": -119.63773163590452, "lat": 37.733899511112554}},
{"pixel": {"x": 1467, "y": 982}, "lnglat": {
"lng": -119.6365899951677, "lat": 37.73740878429034}},
{"pixel": {"x": 2045, "y": 464}, "lnglat": {
"lng": -119.61643210247348, "lat": 37.75263501532096}},
{"pixel": {"x": 2530, "y": 225}, "lnglat": {
"lng": -119.59904847563081, "lat": 37.759640099263024}},
{"pixel": {"x": 3611, "y": 217}, "lnglat": {
"lng": -119.57440674003465, "lat": 37.769372182124215}},
{"pixel": {"x": 4218, "y": 289}, "lnglat": {
"lng": -119.53927620600871, "lat": 37.7590418448261}},
{"pixel": {"x": 4972, "y": 819}, "lnglat": {
"lng": -119.51283799895947, "lat": 37.7451015130886}},
{"pixel": {"x": 4869, "y": 1178}, "lnglat": {
"lng": -119.5150031101931, "lat": 37.73452849532761}},
{"pixel": {"x": 4858, "y": 1268}, "lnglat": {
"lng": -119.51537412573026, "lat": 37.731943969799104}},
{"pixel": {"x": 4637, "y": 1307}, "lnglat": {
"lng": -119.52293169964986, "lat": 37.730726899819345}},
{"pixel": {"x": 4284, "y": 1599}, "lnglat": {
"lng": -119.53520554208092, "lat": 37.72240153076238}},
{"pixel": {"x": 4150, "y": 1676}, "lnglat": {
"lng": -119.53996905111126, "lat": 37.71984653680312}},
{"pixel": {"x": 3432, "y": 1989}, "lnglat": {
"lng": -119.56520552108367, "lat": 37.70994983543632}},
{"pixel": {"x": 2965, "y": 1408}, "lnglat": {
"lng": -119.58234774459186, "lat": 37.72663636959598}},
{"pixel": {"x": 2560, "y": 1921}, "lnglat": {
"lng": -119.59584076119313, "lat": 37.712008849961066}},
{"pixel": {"x": 1840, "y": 1593}, "lnglat": {
"lng": -119.6231396666414, "lat": 37.72018991118786}},
{"pixel": {"x": 1140, "y": 1590}, "lnglat": {
"lng": -119.64782744839357, "lat": 37.71938854312988}},
]
pixel_coordinates = np.array([[coord['pixel']['x'], coord['pixel']['y']] for coord in coords])
lnglat_coordinates = np.array([[coord['lnglat']['lng'], coord['lnglat']['lat']] for coord in coords])
# Pad the data with ones, so that our transformation can do translations too
n = pixel_coordinates.shape[0]
def pad(x):
    return np.hstack([x, np.ones((x.shape[0], 1))])
def unpad(x):
    return x[:, :-1]
X = pad(pixel_coordinates)
Y = pad(lnglat_coordinates)
# Solve the least squares problem X * A = Y
# to find our transformation matrix A
A, res, rank, s = np.linalg.lstsq(X, Y, rcond=None)
def transform(x):
    return unpad(np.dot(pad(x), A))
print("Target:")
print(lnglat_coordinates)
print("Result:")
print(transform(pixel_coordinates))
print("Max error:", np.abs(lnglat_coordinates - transform(pixel_coordinates)).max())
print(A)
The resulting matrix A looks like this:
[[ 3.55857577e-05 8.98377941e-07 2.81630741e-18]
[ 3.43101520e-06 -2.97714115e-05 -8.56519716e-18]
[-1.19693144e+02 3.77657997e+01 1.00000000e+00]]
I would expect it to have the form
[[scale 0 0]
[0 -scale 0]
[a b 1]]
What I notice is that the two independently derived values of scale (A[0,0] and -A[1,1]) actually differ by ~16%, and that the off-diagonal entries, which are supposed to be zero, aren't.
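For reference, the two scale estimates can be read straight out of the fitted matrix A above:
# The two scale estimates implied by the fitted matrix A
scale_x = A[0, 0]    # d(lng)/d(x)
scale_y = -A[1, 1]   # d(lat)/d(y); sign flipped because pixel y grows downwards
print(scale_x, scale_y, abs(scale_x - scale_y) / max(scale_x, scale_y))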
(In practice, I also notice that the overlay computed with this algorithm is significantly off, as shown below with partial opacity. Note how the roads appear to be shifted to the northwest).
Is there any way to impose these constraints on the least-squares estimation of the affine transformation - that is, to ensure that the transformation consists of scaling and translation only (no skewing or rotation)?
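To make concrete what I mean by imposing these constraints: rather than fitting a full 3x3 affine matrix, I would like to solve for just the three unknowns scale, a and b. Below is a rough, untested sketch of the kind of formulation I have in mind (each waypoint contributes one equation for lng and one for lat), but I don't know whether this is the right or most idiomatic way to set it up with numpy:
# Sketch of a constrained fit with only three unknowns: scale, a, b.
# Each waypoint contributes two equations:
#   lng_i =  scale * x_i + a
#   lat_i = -scale * y_i + b
M = np.zeros((2 * n, 3))
v = np.zeros(2 * n)
M[0::2, 0] = pixel_coordinates[:, 0]   # x_i multiplies scale
M[0::2, 1] = 1.0                       # coefficient of a
M[1::2, 0] = -pixel_coordinates[:, 1]  # -y_i multiplies scale
M[1::2, 2] = 1.0                       # coefficient of b
v[0::2] = lnglat_coordinates[:, 0]     # lng_i
v[1::2] = lnglat_coordinates[:, 1]     # lat_i
(scale, a, b), *_ = np.linalg.lstsq(M, v, rcond=None)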