
I'm trying to take two images using the camera, and align them using the iOS Vision framework:

let visionHandler = VNSequenceRequestHandler()

func align(firstImage: CIImage, secondImage: CIImage) {
  let request = VNTranslationalImageRegistrationRequest(
      targetedCIImage: firstImage) { request, error in
    if error != nil {
      fatalError()
    }
    let observation = request.results!.first
        as! VNImageTranslationAlignmentObservation
    let alignedSecondImage = secondImage.transformed(
        by: observation.alignmentTransform)
    let compositedImage = firstImage.applyingFilter(
        "CIAdditionCompositing",
        parameters: ["inputBackgroundImage": alignedSecondImage])
    // Save compositedImage to the photo library.
  }

  try! visionHandler.perform([request], on: secondImage)
}

But this produces grossly mis-aligned images:

[Three screenshots of misaligned composites: a close-up subject, an indoor scene, and an outdoor scene.]

You can see that I've tried three different types of scenes — a close-up subject, an indoor scene, and an outdoor scene. I tried more outdoor scenes, and the result is the same in almost every one of them.

I was expecting a slight misalignment at worst, but not such a complete misalignment. What is going wrong?

I'm not passing the orientation of the images into the Vision framework, but that shouldn't be a problem for alignment. Orientation matters for tasks like face detection, where a rotated face isn't detected as a face. In any case, the output images have the correct orientation, so orientation isn't the problem.
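For what it's worth, the sequence handler can be given an explicit orientation when performing the request. A minimal sketch, assuming the captures' orientation is known (`.up` here is a placeholder, not a value from the question):

```swift
// Sketch: pass the images' CGImagePropertyOrientation explicitly.
// .up is a placeholder; substitute the actual orientation of your captures.
try visionHandler.perform([request], on: secondImage, orientation: .up)
```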

My compositing code works correctly; it's only the Vision framework that's a problem. If I remove the calls to the Vision framework and put the phone on a tripod, the composite is perfectly aligned. So the problem is the Vision framework.

This is on iPhone X.

How do I get the Vision framework to work correctly? Can I tell it to use gyroscope, accelerometer and compass data to improve the alignment?

Kartick Vaddadi
  • I'd be curious to see how the program aligns a subset/portion of a picture to the whole picture. Are the alignments stochastic (does the output vary despite the input being the same)? Alignment programs often make approximations/simplifications to reduce computation time. Stochastic programming is a way to compensate. – Ghoti Mar 12 '18 at 22:26
  • @Kartick How did you end up doing this? – user16930239 Sep 25 '21 at 23:17
  • @Jabbar I gave up. – Kartick Vaddadi Sep 26 '21 at 02:13

2 Answers


You should set secondImage as the targeted image, and perform the handler on firstImage.

I used your compositing code as-is.
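Applied to the code in the question, that swap would look roughly like this (an untested sketch that reuses the question's names):

```swift
// secondImage is the targeted image (the one to be aligned);
// firstImage, passed to the handler, is the reference.
let request = VNTranslationalImageRegistrationRequest(
    targetedCIImage: secondImage) { request, error in
  guard error == nil,
        let observation = request.results?.first
            as? VNImageTranslationAlignmentObservation else { return }
  let alignedSecondImage = secondImage.transformed(
      by: observation.alignmentTransform)
  let compositedImage = firstImage.applyingFilter(
      "CIAdditionCompositing",
      parameters: ["inputBackgroundImage": alignedSecondImage])
  // Save compositedImage to the photo library.
}

try? visionHandler.perform([request], on: firstImage)
```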

user38155

Check out this example from MLBoy:

let request = VNTranslationalImageRegistrationRequest(targetedCIImage: image2, options: [:])

let handler = VNImageRequestHandler(ciImage: image1, options: [:])
do {
    try handler.perform([request])
} catch {
    print(error)
}

guard let observation = request.results?.first as? VNImageTranslationAlignmentObservation else { return }
let alignmentTransform = observation.alignmentTransform

image2 = image2.transformed(by: alignmentTransform)
let compositedImage = image1.applyingFilter("CIAdditionCompositing", parameters: ["inputBackgroundImage": image2])


user16930239