Python, OpenCV -- Aligning and overlaying multiple images, one after another

Question

My project is to align aerial photos to make a mosaic map out of them. My plan is to start with two photos, align the second with the first, and create an "initial mosaic" out of the two aligned images. Once that is done, I align the third photo with the initial mosaic, then align the fourth photo with the result of that, and so on, thereby progressively constructing the map.

I have two techniques for doing this, but the more accurate one, which makes use of calcOpticalFlowPyrLK(), only works for the two-image phase because the two input images must be the same size. Because of that I tried a new solution, but it is less accurate and the error introduced at every step piles up, eventually producing a nonsensical result.

My question is two-fold, but if you know the answer to one, you don't have to answer both, unless you want to. First, is there a way to use something similar to calcOpticalFlowPyrLK() but with two images of different sizes (this includes any potential workarounds)? And second, is there a way to modify the detector/descriptor solution to make it more accurate?

Here's the accurate version that works only for two images:

import cv2
import numpy as np

# load images
base = cv2.imread("images/1.jpg")
curr = cv2.imread("images/2.jpg")

# convert to grayscale
base_gray = cv2.cvtColor(base, cv2.COLOR_BGR2GRAY)

# find the coordinates of good features to track in base
base_features = cv2.goodFeaturesToTrack(base_gray, 3000, .01, 10)

# find corresponding features in current photo
curr_features = np.array([])
curr_features, pyr_stati, _ = cv2.calcOpticalFlowPyrLK(base, curr, base_features, curr_features, flags=1)

# only add features for which a match was found to the pruned arrays
base_features_pruned = []
curr_features_pruned = []
for index, status in enumerate(pyr_stati):
    if status == 1:
        base_features_pruned.append(base_features[index])
        curr_features_pruned.append(curr_features[index])

# convert lists to numpy arrays so they can be passed to opencv function
bf_final = np.asarray(base_features_pruned)
cf_final = np.asarray(curr_features_pruned)

# find perspective transformation using the arrays of corresponding points
transformation, hom_stati = cv2.findHomography(cf_final, bf_final, method=cv2.RANSAC, ransacReprojThreshold=1)

# transform the images and overlay them to see if they align properly
# not what I do in the actual program, just for use in the example code
# so that you can see how they align, if you decide to run it
height, width = curr.shape[:2]
mod_photo = cv2.warpPerspective(curr, transformation, (width, height))
new_image = cv2.addWeighted(mod_photo, .5, base, .5, 1)

Here's the inaccurate one that works for multiple images (until the error becomes too great):

import cv2
import numpy as np

# load images
base = cv2.imread("images/1.jpg")
curr = cv2.imread("images/2.jpg")

# convert to grayscale
base_gray = cv2.cvtColor(base, cv2.COLOR_BGR2GRAY)

# DIFFERENCES START
curr_gray = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)

# create detector, get keypoints and descriptors
detector = cv2.ORB_create()
base_keys, base_desc = detector.detectAndCompute(base_gray, None)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, None)

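matcher = cv2.DescriptorMatcher_create("BruteForce-Hamming")
# match base descriptors (query) against current descriptors (train)
matches = matcher.match(base_desc, curr_desc)

# find the smallest and largest descriptor distances among the matches
max_dist = 0.0
min_dist = 100.0

for match in matches:
    dist = match.distance
    min_dist = dist if dist < min_dist else min_dist
    max_dist = dist if dist > max_dist else max_dist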

good_matches = [match for match in matches if match.distance <= 3 * min_dist ]

base_matches = []
curr_matches = []
for match in good_matches:
    base_matches.append(base_keys[match.queryIdx].pt)
    curr_matches.append(curr_keys[match.trainIdx].pt)

bf_final = np.asarray(base_matches)
cf_final = np.asarray(curr_matches)

# SAME AS BEFORE

# find perspective transformation using the arrays of corresponding points
transformation, hom_stati = cv2.findHomography(cf_final, bf_final, method=cv2.RANSAC, ransacReprojThreshold=1)

# transform the images and overlay them to see if they align properly
# not what I do in the actual program, just for use in the example code
# so that you can see how they align, if you decide to run it
height, width = curr.shape[:2]
mod_photo = cv2.warpPerspective(curr, transformation, (width, height))
new_image = cv2.addWeighted(mod_photo, .5, base, .5, 1)

Finally, here are some images that I'm using:

Homographies compose. So if you have the homography `h12` from img1 to img2 and the homography `h23` from img2 to img3, then `h23.dot(h12)` is the homography from img1 to img3. — alkasm, Jul 18 '17 at 10:31

alkasm · Accepted Answer · 2018-11-22T10:29:48.830

Homographies compose, so if you have the homographies between img1 and img2 and between img2 and img3 then the composition of those two homographies gives the homography between img1 and img3.

Your sizes are off, of course, because you're trying to match img3 to the stitched image containing img1 and img2. But you don't need to do that. Don't stitch them until you have all the homographies between each successive pair of images. Then you can proceed in one of two ways: work from the back or work from the front. I'll use, for example, h31 to refer to the homography which warps img3 into the coordinates of img1.

From the front (pseudocode):

warp img2 into coordinates of img1 with h21
warp img3 into coordinates of img1 with h31 = h21 @ h32
warp img4 into coordinates of img1 with h41 = h31 @ h43
...
stitch/blend images together

Here @ is the matrix multiplication operator, which achieves our homography composition. OpenCV maps points as H @ p, so the homography applied first goes on the right: h21 @ h32 first takes img3 into img2 with h32 and then img2 into img1 with h21. It is safest to divide each composed homography by its final (bottom-right) entry to ensure that they're all scaled the same.
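As a minimal sketch of the "from the front" composition, assuming OpenCV's column-vector convention (points are mapped as `H @ p`) and using made-up 3x3 matrices in place of real `cv2.findHomography` results, the chaining and normalization could look like this. The helper names `normalize`, `chain_to_first`, and `apply_h` are hypothetical, not part of OpenCV or NumPy.

```python
import numpy as np

def normalize(h):
    """Scale a 3x3 homography so that its bottom-right entry is 1."""
    return h / h[2, 2]

def chain_to_first(pairwise):
    """Compose pairwise homographies into homographies onto image 1.

    pairwise is assumed to be [h21, h32, h43, ...], where each entry maps
    points of one image into the coordinates of the previous image.
    Returns [h11 (identity), h21, h31, h41, ...].
    """
    to_first = [np.eye(3)]
    for h in pairwise:
        # h is applied first (image n -> image n-1), so it goes on the right
        to_first.append(normalize(to_first[-1] @ h))
    return to_first

def apply_h(h, point):
    """Map a single (x, y) point through a homography."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])

# illustrative matrices only: a scale-plus-shift and a pure shift
h21 = np.array([[2.0, 0.0, 10.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]])
h32 = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])

h11, h21_chained, h31 = chain_to_first([h21, h32])

p3 = (1.0, 1.0)                                  # a point in img3
step_by_step = apply_h(h21, apply_h(h32, p3))    # img3 -> img2 -> img1
composed = apply_h(h31, p3)                      # img3 -> img1 directly
```

Both routes should land on the same point in img1. Swapping the multiplication order gives a different point here, because a scale and a shift do not commute.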

From the back (pseudocode):

...
warp prev stitched img into coordinates of img3 with h43
stitch warped stitched img with img3
warp prev stitched img into coordinates of img2 with h32
stitch warped stitched img with img2
warp prev stitched img into coordinates of img1 with h21
stitch warped stitched img with img1

The idea is that you either start from the front and warp everything into the first image's coordinate frame, or start from the back, warp into the previous image and stitch, then warp that stitched image into the image before it, and repeat. I think the first method is probably easier. In either case you have to worry about the propagation of errors in your homography estimation, as they will build up over multiple composed homographies.
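When everything is warped into the first image's frame, the output canvas generally has to be larger than any single photo, and warped pixels can fall at negative coordinates. One possible way to handle that, sketched here with NumPy only and with hypothetical helper names (`warped_corners`, `canvas_for`), is to push each image's corners through its homography, take the bounding box, and prepend a translation that shifts the box to the origin.

```python
import numpy as np

def warped_corners(h, width, height):
    """Return the four image corners after mapping them through h."""
    corners = np.array(
        [[0, 0, 1], [width, 0, 1], [width, height, 1], [0, height, 1]],
        dtype=float,
    ).T                                   # shape (3, 4), one column per corner
    mapped = h @ corners
    return (mapped[:2] / mapped[2]).T     # shape (4, 2), dehomogenized

def canvas_for(homographies, sizes):
    """Compute a translation and canvas size that fit all warped images.

    homographies[i] maps image i into the reference frame;
    sizes[i] is (width, height) of image i.
    """
    pts = np.vstack(
        [warped_corners(h, w, hgt) for h, (w, hgt) in zip(homographies, sizes)]
    )
    x_min, y_min = np.floor(pts.min(axis=0))
    x_max, y_max = np.ceil(pts.max(axis=0))
    # translation that moves the top-left of the bounding box to (0, 0)
    offset = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]], dtype=float)
    size = (int(x_max - x_min), int(y_max - y_min))   # (width, height)
    return offset, size

# illustrative input: img1 is the reference, img2 is shifted left and down
h11 = np.eye(3)
h21 = np.array([[1.0, 0.0, -50.0], [0.0, 1.0, 30.0], [0.0, 0.0, 1.0]])
offset, size = canvas_for([h11, h21], [(100, 80), (100, 80)])
```

Each image could then be warped with `cv2.warpPerspective(img, offset @ h, size)`, so that all of them share one canvas and nothing is clipped at the top or left edge.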

This is the naïve approach to blend multiple images together with just the homographies. The more sophisticated method is to use bundle adjustment, which takes into account feature points across all images. Then for good blending the steps are gain compensation to remove camera gain adjustments and vignetting, and then multi-band blending to prevent blurring. See the seminal paper from Brown and Lowe here and a brilliant example and free demo software here.
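For the naive stitching step itself, a simple option is to average pixels wherever warped images overlap and keep single-image pixels unchanged. The sketch below is only that naive averaging, not gain compensation or multi-band blending. It assumes the images have already been warped onto one common canvas and that a boolean coverage mask is available for each one (for instance, obtained by warping an all-ones image with the same homography). `average_blend` is a hypothetical helper name.

```python
import numpy as np

def average_blend(warped, masks):
    """Average overlapping pixels of images that share one canvas.

    warped: list of HxWx3 arrays, already warped onto the same canvas
    masks:  list of HxW boolean arrays, True where the image has valid pixels
    """
    acc = np.zeros(warped[0].shape, dtype=np.float64)
    count = np.zeros(warped[0].shape[:2], dtype=np.float64)
    for img, mask in zip(warped, masks):
        acc[mask] += img[mask]            # sum colors where this image is valid
        count += mask                     # how many images cover each pixel
    covered = count > 0
    acc[covered] /= count[covered][:, None]
    return acc.astype(np.uint8)           # uncovered pixels stay black

# tiny illustrative canvas: two 2x2 "images" with partial coverage
a = np.full((2, 2, 3), 100, dtype=np.uint8)
b = np.full((2, 2, 3), 200, dtype=np.uint8)
mask_a = np.array([[True, True], [False, False]])
mask_b = np.array([[False, True], [False, True]])

mosaic = average_blend([a, b], [mask_a, mask_b])
```

Unlike a fixed 50/50 `cv2.addWeighted`, this does not darken regions that only one image covers.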

How do I do the dot product of homography Mats in C++? The function Mat.dot(Mat2) returns a double as per the SDK. — Manmohan Bishnoi, Nov 22 '18 at 09:03
@ManmohanBishnoi the `*` operator is overridden for matrix multiplication in OpenCV, so you can simply do `mat1 * mat2`. The `.dot()` method in OpenCV is purely for inner products of two vectors, which produces a scalar value. The `.dot()` method I'm using here is from Numpy, which does a matrix multiplication when the inputs are not 1-D arrays. It is badly named and I will edit my code accordingly! — alkasm, Nov 22 '18 at 10:26
Is your approach of stitching by homography multiplication, suitable for incremental panorama? — Manmohan Bishnoi, Nov 22 '18 at 11:00
@ManmohanBishnoi yes, that's what this answer is specifically referring to. However, bundle adjustment is a better method overall. — alkasm, Nov 23 '18 at 03:49
