I am trying to blend two images, given a mask, using the following script:

import cv2
import numpy as np

def pyramid_blend(A, B, m, num_levels):
    GA = A.copy()
    GB = B.copy()
    GM = m.copy()

    gpA = [GA]
    gpB = [GB]
    gpM = [GM]

    for i in xrange(num_levels):
        GA = cv2.pyrDown(GA)
        GB = cv2.pyrDown(GB)
        GM = cv2.pyrDown(GM)

        gpA.append(np.float32(GA))
        gpB.append(np.float32(GB))
        gpM.append(np.float32(GM))

    lpA = [gpA[num_levels - 1]]
    lpB = [gpB[num_levels - 1]]
    gpMr = [gpM[num_levels - 1]]

    for i in xrange(num_levels - 1, 0, -1):
        size = (gpA[i - 1].shape[1], gpA[i - 1].shape[0])

        LA = np.subtract(gpA[i - 1], cv2.pyrUp(gpA[i], dstsize=size))
        LB = np.subtract(gpB[i - 1], cv2.pyrUp(gpB[i], dstsize=size))

        lpA.append(LA)
        lpB.append(LB)

        gpMr.append(gpM[i - 1])

    LS = []
    for la, lb, gm in zip(lpA, lpB, gpMr):
        ls = la * gm + lb * (1.0 - gm)
        LS.append(ls)

    ls_ = LS[0]
    for i in xrange(1, num_levels):
        size = (LS[i].shape[1], LS[i].shape[0])
        ls_ = cv2.add(cv2.pyrUp(ls_, dstsize=size), np.float32(LS[i]))

    return ls_

if __name__ == '__main__':

    A = cv2.imread('./black.jpg')
    B = cv2.imread('./white.jpg')
    m = cv2.imread('./mask.jpg')

    lpb = pyramid_blend(A, B, m, 6)

What I did:

  • Find the Gaussian pyramids of the images.
  • From the Gaussian pyramids, find their Laplacian pyramids.
  • Blend the two images at each level of the Laplacian pyramids using the mask.
  • From the blended pyramid, reconstruct the final image.

The images used:

https://i.stack.imgur.com/nbY7B.jpg

https://i.stack.imgur.com/i2rj7.jpg

https://i.stack.imgur.com/v6QGM.jpg

The result I get:

https://i.stack.imgur.com/AgcOh.jpg

For some reason that I don't understand, the colors of the resulting image are completely off.

api55
Alex Goft
    If you open an image with OpenCV, the colours are loaded in BGR order, whereas the rest of the world expects RGB order, see here... https://stackoverflow.com/q/52494592/2836621 – Mark Setchell Sep 25 '18 at 10:54
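
For reference, a minimal sketch of the channel-order point in the comment above, using only NumPy slicing on a synthetic array. It assumes a 3-channel array laid out as height x width x channel, which is what `cv2.imread` returns for a color image; the `bgr_to_rgb` helper name is made up for illustration.

```python
import numpy as np

def bgr_to_rgb(img):
    """Reverse the channel axis of an H x W x 3 array (hypothetical helper)."""
    return img[..., ::-1].copy()

# A 1x1 "image" holding a pure-blue pixel in OpenCV's BGR order.
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
blue_rgb = bgr_to_rgb(blue_bgr)

print(blue_rgb[0, 0].tolist())  # [0, 0, 255]
```

The swap only matters when the array is handed to a library that expects RGB; `cv2.imshow` and `cv2.imwrite` themselves expect BGR.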

2 Answers

I can detect two problems:

  1. You assume your mask holds 0 or 1.0, but it actually holds 0 or 255. After loading the mask, you can do the following:

    m[m==255]=1.0
    
  2. You are probably displaying float32 images with imshow, which expects float data in the range [0, 1]. Convert the result to np.uint8 before displaying it:

    lpb = np.uint8(lpb)
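
A small sketch of the first point, using a synthetic mask rather than the files from the question. Dividing by 255 is an alternative to setting the 255 entries to 1 that also handles in-between values (a mask saved as JPEG may not be exactly 0 or 255 near edges). The `normalize_mask` name and the pixel values are illustrative only.

```python
import numpy as np

def normalize_mask(mask_u8):
    """Scale a uint8 mask from [0, 255] to float32 weights in [0, 1]."""
    return mask_u8.astype(np.float32) / 255.0

# Synthetic 1x3 mask: fully off, an in-between value, fully on.
mask = np.array([[0, 128, 255]], dtype=np.uint8)
w = normalize_mask(mask)

a = np.full((1, 3), 200.0, dtype=np.float32)  # stand-in for image A
b = np.full((1, 3), 100.0, dtype=np.float32)  # stand-in for image B

# Same blend formula as in the question: weight A by the mask,
# and B by the mask's complement.
blended = a * w + b * (1.0 - w)
print(blended)
```

With the unscaled mask, the same formula gives 200 * 255 + 100 * (1 - 255) = 25600 for the "on" pixel, far outside the displayable range.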
    

That said, there is probably another error that I do not see right now: the result still has some small regions with odd colors, and I would expect the blend to be smoother in the middle. Here is my result:

[image]


UPDATE

It looks like the artifacts appear when you use too many levels. When you have only 3 (instead of 6), weird coloring appears (with the fixes stated above as well). Perhaps the values need to be saturated when subtracting?
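
One way to sanity-check the level count is to look at how small the coarsest level becomes. The sketch below assumes the default `cv2.pyrDown` output size of `((cols + 1) // 2, (rows + 1) // 2)` and a made-up 512x384 input; `pyramid_sizes` is a hypothetical helper, not an OpenCV function, and it does not by itself explain the artifacts.

```python
def pyramid_sizes(width, height, num_levels):
    """List the (width, height) of each Gaussian pyramid level, finest first."""
    sizes = [(width, height)]
    for _ in range(num_levels):
        # Each downsampling step halves both dimensions, rounding up.
        width, height = (width + 1) // 2, (height + 1) // 2
        sizes.append((width, height))
    return sizes

sizes = pyramid_sizes(512, 384, 6)
print(sizes[-1])  # (8, 6)
```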

api55

In case anyone finds this thread in the future and wonders what was wrong with the above code: the image artifacts appear because the returned floating-point image has some values above 255 and some below 0, from floating-point rounding error and from overshoot when the blended levels are recombined. When cast to uint8, the values below 0 wrap around to large values, which is why some pixels get mangled.
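
To illustrate the wrap-around described above, a minimal sketch on a hand-made float array (the values are invented, not taken from the blend). It compares a modular conversion, which is what wrap-around means, with clipping to [0, 255] before the cast.

```python
import numpy as np

# Invented float results: slightly below 0, in range, slightly above 255.
vals = np.array([-3.0, 120.0, 258.0], dtype=np.float32)

# Modular (wrap-around) conversion, written out explicitly so the result
# does not depend on how a platform casts out-of-range floats.
wrapped = np.mod(vals.astype(np.int32), 256).astype(np.uint8)

# Clip to the valid range first, then cast.
clipped = np.clip(vals, 0, 255).astype(np.uint8)

print(wrapped.tolist())  # [253, 120, 2]
print(clipped.tolist())  # [0, 120, 255]
```

A value of -3 becomes a near-white 253 when wrapped, but stays black when clipped.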

See my commented and lightly modified version of the original code below. I only made three changes, including the mask change that api55 mentioned.

import cv2
import numpy as np

def pyramid_blend(A, B, m, num_levels):
    # 1. As in api55's answer, the mask must range from 0 to 1 because each
    #    pixel value is multiplied by it. The mask is binary, so setting every
    #    255 to 1 is enough.
    m[m == 255] = 1

    GA = A.copy()
    GB = B.copy()
    GM = m.copy()

    gpA = [GA]
    gpB = [GB]
    gpM = [GM]

    for i in range(num_levels):
        GA = cv2.pyrDown(GA)
        GB = cv2.pyrDown(GB)
        GM = cv2.pyrDown(GM)

        gpA.append(np.float32(GA))
        gpB.append(np.float32(GB))
        gpM.append(np.float32(GM))

    lpA = [gpA[num_levels - 1]]
    lpB = [gpB[num_levels - 1]]
    gpMr = [gpM[num_levels - 1]]

    for i in range(num_levels - 1, 0, -1):
        size = (gpA[i - 1].shape[1], gpA[i - 1].shape[0])

        LA = np.subtract(gpA[i - 1], cv2.pyrUp(gpA[i], dstsize=size))
        LB = np.subtract(gpB[i - 1], cv2.pyrUp(gpB[i], dstsize=size))

        lpA.append(LA)
        lpB.append(LB)

        gpMr.append(gpM[i - 1])

    LS = []
    for la, lb, gm in zip(lpA, lpB, gpMr):
        ls = la * gm + lb * (1.0 - gm)
        LS.append(ls)

    ls_ = LS[0]
    for i in range(1, num_levels):
        size = (LS[i].shape[1], LS[i].shape[0])
        ls_ = cv2.add(cv2.pyrUp(ls_, dstsize=size), np.float32(LS[i]))
        # 2. Some pixels in ls_ end up above 255 and some below 0. Casting
        #    those to uint8 makes the values below 0 wrap around to large
        #    values, so clamp to [0, 255] first.
        ls_ = np.clip(ls_, 0, 255)

    # 3. Cast from floating point back to uint8 before saving or displaying.
    return ls_.astype(np.uint8)


if __name__ == '__main__':

    A = cv2.imread('./black.jpg')
    B = cv2.imread('./white.jpg')
    m = cv2.imread('./mask.jpg')

    lpb = pyramid_blend(A, B, m, 6)

    cv2.imshow('foo', lpb)
    cv2.waitKey()
    cv2.destroyAllWindows()

This results in the following image:

[image: blended image]

daveboat