I am trying to create an image by averaging multiple images. My approach is to loop over the pixel values of two photos, add them together, and divide by two. Simple math. However, while this works, it is extremely slow: about 23 seconds to average two 10 MP photos on a maxed-out 2016 15" MacBook Pro, compared with far less time using Apple's CIFilter API for similar algorithms. The code I'm currently using, based on another StackOverflow question here, is:

    static func averageImages(primary: CGImage, secondary: CGImage) -> CGImage? {
        guard (primary.width == secondary.width && primary.height == secondary.height) else {
            return nil
        }

        let colorSpace = CGColorSpaceCreateDeviceRGB()
        let width = primary.width
        let height = primary.height
        let bytesPerPixel = 4
        let bitsPerComponent = 8
        let bytesPerRow = bytesPerPixel * width
        let bitmapInfo = RGBA32.bitmapInfo

        guard let context = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
            print("unable to create context")
            return nil
        }
        guard let context2 = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
            print("unable to create context 2")
            return nil
        }

        context.draw(primary, in: CGRect(x: 0, y: 0, width: width, height: height))
        context2.draw(secondary, in: CGRect(x: 0, y: 0, width: width, height: height))

        guard let buffer = context.data else {
            print("Unable to get context data")
            return nil
        }
        guard let buffer2 = context2.data else {
            print("Unable to get context 2 data")
            return nil
        }

        let pixelBuffer = buffer.bindMemory(to: RGBA32.self, capacity: width * height)
        let pixelBuffer2 = buffer2.bindMemory(to: RGBA32.self, capacity: width * height)

        for row in 0 ..< Int(height) {
            if row % 10 == 0 {
                print("Row: \(row)")
            }
            for column in 0 ..< Int(width) {
                let offset = row * width + column
                let picture1 = pixelBuffer[offset]
                let picture2 = pixelBuffer2[offset]
                let minR = min(255,(UInt32(picture1.redComponent)+UInt32(picture2.redComponent))/2)
                let minG = min(255,(UInt32(picture1.greenComponent)+UInt32(picture2.greenComponent))/2)
                let minB = min(255,(UInt32(picture1.blueComponent)+UInt32(picture2.blueComponent))/2)
                let minA = min(255,(UInt32(picture1.alphaComponent)+UInt32(picture2.alphaComponent))/2)
                pixelBuffer[offset] = RGBA32(red: UInt8(minR), green: UInt8(minG), blue: UInt8(minB), alpha: UInt8(minA))
            }
        }

        let outputImage = context.makeImage()
        return outputImage
    }

    struct RGBA32: Equatable {
        //private var color: UInt32
        var color: UInt32

        var redComponent: UInt8 {
            return UInt8((color >> 24) & 255)
        }
        var greenComponent: UInt8 {
            return UInt8((color >> 16) & 255)
        }
        var blueComponent: UInt8 {
            return UInt8((color >> 8) & 255)
        }
        var alphaComponent: UInt8 {
            return UInt8((color >> 0) & 255)
        }

        init(red: UInt8, green: UInt8, blue: UInt8, alpha: UInt8) {
            let red = UInt32(red)
            let green = UInt32(green)
            let blue = UInt32(blue)
            let alpha = UInt32(alpha)
            color = (red << 24) | (green << 16) | (blue << 8) | (alpha << 0)
        }
        init(color: UInt32) {
            self.color = color
        }

        static let red = RGBA32(red: 255, green: 0, blue: 0, alpha: 255)
        static let green = RGBA32(red: 0, green: 255, blue: 0, alpha: 255)
        static let blue = RGBA32(red: 0, green: 0, blue: 255, alpha: 255)
        static let white = RGBA32(red: 255, green: 255, blue: 255, alpha: 255)
        static let black = RGBA32(red: 0, green: 0, blue: 0, alpha: 255)
        static let magenta = RGBA32(red: 255, green: 0, blue: 255, alpha: 255)
        static let yellow = RGBA32(red: 255, green: 255, blue: 0, alpha: 255)
        static let cyan = RGBA32(red: 0, green: 255, blue: 255, alpha: 255)

        static let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

        static func ==(lhs: RGBA32, rhs: RGBA32) -> Bool {
            return lhs.color == rhs.color
        }
    }

I'm not very experienced when it comes to working with raw pixel values, and there is probably a lot of room for optimisation. The declaration of RGBA32 may not be required, but again I'm not sure how I'd go about simplifying the code. I've tried simply replacing that struct with a UInt32; however, when I divide by 2 the separation between the four channels gets messed up and I end up with the wrong result (on a positive note, this brings the computing time down to about 6 seconds).
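The channel mix-up described above happens because dividing a whole packed UInt32 by 2 shifts the lowest bit of each channel into the top bit of the channel below it. One common bit-masking workaround, sketched here as an assumption and not as something taken from the original code, averages all four 8-bit channels in a single expression. The name `averagePacked` is hypothetical.

```swift
/// Averages four packed 8-bit channels at once, rounding down per channel.
/// Hypothetical helper; it does not appear in the original code.
func averagePacked(_ a: UInt32, _ b: UInt32) -> UInt32 {
    // a + b == 2 * (a & b) + (a ^ b), so the floor average is
    // (a & b) + ((a ^ b) >> 1). Masking with 0xFEFEFEFE clears the
    // lowest bit of every channel before the shift, so no bit can
    // leak into the neighbouring channel.
    return (a & b) + (((a ^ b) & 0xFEFE_FEFE) >> 1)
}

// Comparison with naive whole-word division.
let p: UInt32 = 0x0100_0000        // top byte = 1, all other bytes 0
let q: UInt32 = 0x0000_0000
let naive = (p / 2) + (q / 2)      // 0x00800000: the bit leaked into the next byte
let packed = averagePacked(p, q)   // 0x00000000: floor((1 + 0) / 2) = 0
```

Because each channel result is at most 255, the per-channel sums never carry into the next channel, which is why no clamping is needed in this form.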
I've tried dropping the alpha channel (just hardcoding it to 255) and also dropping the safety checks that no values exceed 255. This has reduced the computing time to 19 seconds. However, it is far from the 6 seconds I was hoping to get close to and it would also be nice to average the alpha channel too.
Note: I am aware of CIFilters; however, darkening an image first and then using the CIAdditionCompositing filter does not work, as the API provided by Apple actually uses a more complex algorithm than straightforward addition. For more details on this, see here for my previous code on the subject and a similar question here with testing proving that Apple's API is not a straightforward addition of pixel values.
**Edit:** Thanks to all the feedback, I have now been able to make vast improvements. By far the biggest difference came from switching from a debug build to a release build, which dropped the time by a lot. Then I was able to write faster code for modifying the RGBA values, eliminating the need for a separate struct. That changed the time from 23 seconds to about 10 (plus the debug-to-release improvements). The code now looks like this, also rewritten a bit to be more readable:

    static func averageImages(primary: CGImage, secondary: CGImage) -> CGImage? {
        guard (primary.width == secondary.width && primary.height == secondary.height) else {
            return nil
        }

        let colorSpace = CGColorSpaceCreateDeviceRGB()
        let width = primary.width
        let height = primary.height
        let bytesPerPixel = 4
        let bitsPerComponent = 8
        let bytesPerRow = bytesPerPixel * width
        let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue | CGBitmapInfo.byteOrder32Little.rawValue

        guard let primaryContext = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo),
              let secondaryContext = CGContext(data: nil, width: width, height: height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo) else {
            print("unable to create context")
            return nil
        }

        primaryContext.draw(primary, in: CGRect(x: 0, y: 0, width: width, height: height))
        secondaryContext.draw(secondary, in: CGRect(x: 0, y: 0, width: width, height: height))

        guard let primaryBuffer = primaryContext.data, let secondaryBuffer = secondaryContext.data else {
            print("Unable to get context data")
            return nil
        }

        let primaryPixelBuffer = primaryBuffer.bindMemory(to: UInt32.self, capacity: width * height)
        let secondaryPixelBuffer = secondaryBuffer.bindMemory(to: UInt32.self, capacity: width * height)

        for row in 0 ..< Int(height) {
            if row % 10 == 0 {
                print("Row: \(row)")
            }
            for column in 0 ..< Int(width) {
                let offset = row * width + column
                let primaryPixel = primaryPixelBuffer[offset]
                let secondaryPixel = secondaryPixelBuffer[offset]
                // Sum each channel first, then halve. Halving each value separately
                // drops the low bit (255 and 255 would average to 254).
                let red = ((((primaryPixel >> 24) & 255) + ((secondaryPixel >> 24) & 255)) / 2) << 24
                let green = ((((primaryPixel >> 16) & 255) + ((secondaryPixel >> 16) & 255)) / 2) << 16
                let blue = ((((primaryPixel >> 8) & 255) + ((secondaryPixel >> 8) & 255)) / 2) << 8
                let alpha = ((primaryPixel & 255) + (secondaryPixel & 255)) / 2
                primaryPixelBuffer[offset] = red | green | blue | alpha
            }
        }
        print("Done looping")

        let outputImage = primaryContext.makeImage()
        return outputImage
    }

As for multithreading, I am going to run this function several times and will therefore implement multithreading across the calls to the function rather than within the function itself. I expect an even greater performance boost from this, but it has to be balanced against the increased memory use of having more images in memory at the same time.
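One possible way to balance parallelism against memory, sketched under the assumption of Swift 5 language mode, is to run the calls concurrently while capping how many are in flight at once. The helper `runLimited` and its parameters are hypothetical, and the job body here is only a stand-in for a call to `averageImages`.

```swift
import Dispatch

/// Runs `job(index)` for every index in 0..<count on multiple threads,
/// never letting more than `maxInFlight` jobs run at the same time.
/// Hypothetical helper; the name and parameters are illustrative only.
func runLimited(count: Int, maxInFlight: Int, job: (Int) -> Void) {
    let slots = DispatchSemaphore(value: maxInFlight)
    DispatchQueue.concurrentPerform(iterations: count) { index in
        slots.wait()              // block until a slot is free
        defer { slots.signal() }  // release the slot when the job finishes
        job(index)
    }
}

// Stand-in workload: in real use each job would average one pair of images.
var results = [UInt32](repeating: 0, count: 8)
results.withUnsafeMutableBufferPointer { out in
    let cells = out  // local copy of the pointer; each index is written by exactly one job
    runLimited(count: cells.count, maxInFlight: 2) { i in
        cells[i] = UInt32(i) * 2
    }
}
```

With `maxInFlight` set to 2, at most two jobs hold their working buffers at any moment, so the cap can be tuned to the memory available.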
Thanks to everyone who contributed to this. Since all the feedback came through comments, I can't mark any of it as the right answer. I also don't want to post my updated code as an answer, as I wasn't the one who really came up with it. Any suggestions on how to proceed?