4

I want the average pixel value for the entire image in the feed from AVCaptureVideoDataOutput, and I'm currently grabbing each frame and looping over its pixels to sum them.

I was wondering if there's a more efficient way to do this with the GPU/OpenGL, given that this is a parallelisable image-processing task (perhaps a heavy Gaussian blur, then read the central pixel value?).

One specific requirement is a high-precision result that takes advantage of the large amount of averaging going on. Note the CGFloat results below.

Current Swift 2 code:

Edit: added an implementation with CIAreaAverage, as suggested below by Simon. The two paths are separated by the useGPU bool.

func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {

    var redmean: CGFloat = 0.0
    var greenmean: CGFloat = 0.0
    var bluemean: CGFloat = 0.0

    if (useGPU) {
        // GPU path: CIAreaAverage, then read the result back via a CGImage
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
        let filter = CIFilter(name: "CIAreaAverage")
        filter!.setValue(cameraImage, forKey: kCIInputImageKey)
        let outputImage = filter!.valueForKey(kCIOutputImageKey) as! CIImage!

        let ctx = CIContext(options: nil)
        let cgImage = ctx.createCGImage(outputImage, fromRect: outputImage.extent)

        let rawData: NSData = CGDataProviderCopyData(CGImageGetDataProvider(cgImage))!
        let pixels = UnsafePointer<UInt8>(rawData.bytes)
        let bytes = UnsafeBufferPointer<UInt8>(start: pixels, count: rawData.length)
        var BGRA_index = 0
        for pixel in UnsafeBufferPointer(start: bytes.baseAddress, count: bytes.count) {
            switch BGRA_index {
            case 0:
                bluemean = CGFloat(pixel)
            case 1:
                greenmean = CGFloat(pixel)
            case 2:
                redmean = CGFloat(pixel)
            case 3:
                break
            default:
                break
            }
            BGRA_index++
        }
    } else {
        // CPU path: copy the pixel data out and sum every BGRA byte by hand
        let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        CVPixelBufferLockBaseAddress(imageBuffer!, 0)

        let baseAddress = CVPixelBufferGetBaseAddressOfPlane(imageBuffer!, 0)
        let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer!)
        let width = CVPixelBufferGetWidth(imageBuffer!)
        let height = CVPixelBufferGetHeight(imageBuffer!)
        let colorSpace = CGColorSpaceCreateDeviceRGB()

        let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.PremultipliedFirst.rawValue).rawValue | CGBitmapInfo.ByteOrder32Little.rawValue

        let context = CGBitmapContextCreate(baseAddress, width, height, 8, bytesPerRow, colorSpace, bitmapInfo)
        let imageRef = CGBitmapContextCreateImage(context)
        CVPixelBufferUnlockBaseAddress(imageBuffer!, 0)

        let data: NSData = CGDataProviderCopyData(CGImageGetDataProvider(imageRef))!
        let pixels = UnsafePointer<UInt8>(data.bytes)
        let bytes = UnsafeBufferPointer<UInt8>(start: pixels, count: data.length)
        var redsum: CGFloat = 0
        var greensum: CGFloat = 0
        var bluesum: CGFloat = 0
        var BGRA_index = 0
        for pixel in UnsafeBufferPointer(start: bytes.baseAddress, count: bytes.count) {
            switch BGRA_index {
            case 0:
                bluesum += CGFloat(pixel)
            case 1:
                greensum += CGFloat(pixel)
            case 2:
                redsum += CGFloat(pixel)
            case 3:
                //alphasum += UInt64(pixel)
                break
            default:
                break
            }

            BGRA_index += 1
            if BGRA_index == 4 { BGRA_index = 0 }
        }
        redmean = redsum / CGFloat(bytes.count)
        greenmean = greensum / CGFloat(bytes.count)
        bluemean = bluesum / CGFloat(bytes.count)
    }

    print("R:\(redmean) G:\(greenmean) B:\(bluemean)")
}
Ian
  • If you're handed a surface from that API (I'm not familiar with it), you should be able to feed it through OpenGL's explicit mipmap generation. It'll then proceed to average 1/4-resolution mipmaps in succession down to the final LOD: 1x1. That last LOD is your average (roughly as sketched below). I don't know how it's implemented on iOS or OS X though, so performance might be the same or worse. – Andon M. Coleman Sep 23 '15 at 07:49
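
To illustrate the mipmap idea in the comment above, here is a very rough Swift 2 / OpenGL ES 3.0 sketch (not from the question: `texture`, `textureWidth` and `textureHeight` are assumed to describe an RGBA texture already uploaded from the pixel buffer, with a GL context current). Note that each mip level is rounded back to 8 bits, so the precision of the final average is limited:

    import OpenGLES

    glBindTexture(GLenum(GL_TEXTURE_2D), texture)
    glGenerateMipmap(GLenum(GL_TEXTURE_2D))

    // The smallest mip level is a single pixel holding the averaged colour.
    let lastLevel = GLint(floor(log2(Double(max(textureWidth, textureHeight)))))

    // Attach that 1x1 level to a framebuffer so it can be read back (ES 3.0).
    var fbo: GLuint = 0
    glGenFramebuffers(1, &fbo)
    glBindFramebuffer(GLenum(GL_FRAMEBUFFER), fbo)
    glFramebufferTexture2D(GLenum(GL_FRAMEBUFFER), GLenum(GL_COLOR_ATTACHMENT0),
                           GLenum(GL_TEXTURE_2D), texture, lastLevel)

    var average = [GLubyte](count: 4, repeatedValue: 0)
    glReadPixels(0, 0, 1, 1, GLenum(GL_RGBA), GLenum(GL_UNSIGNED_BYTE), &average)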

3 Answers

6

The issue, and the reason for the poor performance of your CIAreaAverage filter, is the missing definition of the input extent. Without it, the filter's output has the same size as the input image, so you loop over a full-sized image instead of a 1-by-1 pixel one, and the execution takes about the same amount of time as your initial version.

As described in the documentation of CIAreaAverage, you can specify an inputExtent parameter. How this can be done in Swift is shown in this answer to a similar question:

    let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
    let extent = cameraImage.extent
    let inputExtent = CIVector(x: extent.origin.x, y: extent.origin.y, z: extent.size.width, w: extent.size.height)
    let filter = CIFilter(name: "CIAreaAverage", withInputParameters: [kCIInputImageKey: cameraImage, kCIInputExtentKey: inputExtent])!
    let outputImage = filter.outputImage!

If you want to squeeze out even more performance, make sure you reuse your CIContext instead of recreating it for each captured frame.
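
For illustration, a rough Swift 2 sketch of both points, continuing from the snippet above (the `ciContext` property name is my own): create the context once, then render the averaged pixel straight into a 4-byte buffer instead of going through a CGImage:

    // Created once, e.g. as a property of the capture class, and reused per frame:
    // let ciContext = CIContext(options: nil)

    var bitmap = [UInt8](count: 4, repeatedValue: 0)
    ciContext.render(outputImage,
                     toBitmap: &bitmap,
                     rowBytes: 4,
                     bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                     format: kCIFormatRGBA8,
                     colorSpace: CGColorSpaceCreateDeviceRGB())
    let redmean   = CGFloat(bitmap[0])
    let greenmean = CGFloat(bitmap[1])
    let bluemean  = CGFloat(bitmap[2])

If you need more than 8-bit precision in the readback, kCIFormatRGBAh or kCIFormatRGBAf with a matching buffer and rowBytes may be worth experimenting with, although the filter's internal precision is not guaranteed.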

pd95
2

There's a Core Image filter that does this very job, CIAreaAverage, which returns a single-pixel image that contains the average color for the region of interest (your region of interest will be the entire image).

FYI, I have a blog post that discusses applying Core Image filters to a live camera feed here. In a nutshell, the filter requires a CIImage, which you can create inside captureOutput from the sampleBuffer:

    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)

...and it's that cameraImage you'll need to pass to CIAreaAverage.
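
A minimal Swift 2 sketch of that step (variable names are illustrative; see pd95's answer above for the inputExtent parameter, which keeps the output down to a single pixel):

    let averageFilter = CIFilter(name: "CIAreaAverage")!
    averageFilter.setValue(cameraImage, forKey: kCIInputImageKey)
    let averageImage = averageFilter.outputImage!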

Cheers,

Simon

Flex Monkey
  • Good suggestion! I've just implemented it, and I'm just about to add the code to the question. I'd love to get float precision out, and currently this method gives me an int. Any ideas? – Ian Sep 23 '15 at 09:27
  • Interestingly, this appears to run slower than my manual loop. I am running this on an iPhone 5, so it may be related to the relative GPU performance of the 5. – Ian Sep 23 '15 at 09:34
  • Ugh - sorry about that. I know some of the Core Image filters are Metal backed and Metal on a 5s can be a little ropey. Let me know if your previous question still holds or if you've given up on that approach. – Flex Monkey Sep 23 '15 at 09:48
  • 1
    I'd like to keep trying this approach. I'll be getting a 6S soon, and it's OK for me to require newer devices for this app. If you have any idea for float, I'd appreciate your thoughts. Cheers – Ian Sep 23 '15 at 09:51
  • Generally speaking, when you pass `kCGBitmapFloatComponents` as an option to `CGBitmapContextCreate`, you get floating point data. You need to orchestrate the other parameters like bit depth etc. too, obviously. – MirekE Sep 23 '15 at 15:19
1

If you had your data as floating point values, you could use

func vDSP_meanv

If that's not an option, try arranging the data so that the optimizer can use SIMD instructions. I don't have a good recipe for that; it has been a trial-and-error exercise for me, but certain rearrangements of the code stand a better chance than others. For example, I would try removing the switch from the loop. SIMD will vectorize your calculations, and in addition you can use multithreading via GCD by processing each row of the image data on a separate core...
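
For instance, a rough Swift 2 sketch of the vDSP route (the `channelMean` helper is hypothetical and assumes the interleaved BGRA `bytes` buffer from the question): `vDSP_vfltu8` widens one channel to Float, then `vDSP_meanv` takes its mean:

    import Accelerate

    // Mean of one BGRA channel; channelOffset 0 = blue, 1 = green, 2 = red.
    func channelMean(bytes: UnsafeBufferPointer<UInt8>, channelOffset: Int) -> CGFloat {
        let count = bytes.count / 4
        var floats = [Float](count: count, repeatedValue: 0)
        // Convert every 4th byte (input stride 4) of the interleaved data to Float.
        vDSP_vfltu8(bytes.baseAddress + channelOffset, 4, &floats, 1, vDSP_Length(count))
        var mean: Float = 0
        vDSP_meanv(floats, 1, &mean, vDSP_Length(count))
        return CGFloat(mean)
    }

    // e.g. bluemean = channelMean(bytes, channelOffset: 0)

Because the mean comes back as a Float, the fractional part is preserved rather than truncated to a whole number.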

MirekE