Taking CoreMLHelpers as inspiration, we can create a C function that does what you need. Given your pixel format requirements, I think this will be the most efficient option. I used an AVCaptureVideoDataOutput for testing.
I hope this helps!
AVCaptureVideoDataOutputSampleBufferDelegate implementation. Most of the work here is creating a centered cropping rectangle. AVMakeRectWithAspectRatioInsideRect is key (it does exactly what you want).
- (void)captureOutput:(AVCaptureOutput *)output didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
if (pixelBuffer == NULL) { return; }
size_t height = CVPixelBufferGetHeight(pixelBuffer);
size_t width = CVPixelBufferGetWidth(pixelBuffer);
CGRect videoRect = CGRectMake(0, 0, width, height);
CGSize scaledSize = CGSizeMake(299, 299);
// Create a rectangle that meets the output size's aspect ratio, centered in the original video frame
CGRect centerCroppingRect = AVMakeRectWithAspectRatioInsideRect(scaledSize, videoRect);
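CVPixelBufferRef croppedAndScaled = createCroppedPixelBuffer(pixelBuffer, centerCroppingRect, scaledSize);
// createCroppedPixelBuffer returns NULL on failure, so bail out before using the result
if (croppedAndScaled == NULL) { return; }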
// Do other things here
// For example
CIImage *image = [CIImage imageWithCVImageBuffer:croppedAndScaled];
// End example
CVPixelBufferRelease(croppedAndScaled);
}
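To make the centered-crop step concrete, here is a minimal plain-C sketch of the kind of calculation AVMakeRectWithAspectRatioInsideRect performs: the largest rectangle with a given aspect ratio that fits inside a bounding rectangle, centered in it. SketchRect and sketchAspectFitRect are hypothetical names I made up so this compiles without CoreGraphics; it is not Apple's implementation, and it makes no attempt to match any rounding the real function may apply.

```c
#include <assert.h>

/* Hypothetical stand-in for CGRect, so the sketch compiles without CoreGraphics. */
typedef struct {
    double x, y, width, height;
} SketchRect;

/* Largest rect with aspect ratio aspectWidth:aspectHeight that fits inside
 * `bounds`, centered in it. */
SketchRect sketchAspectFitRect(double aspectWidth, double aspectHeight, SketchRect bounds) {
    assert(aspectWidth > 0 && aspectHeight > 0);
    assert(bounds.width > 0 && bounds.height > 0);

    double targetRatio = aspectWidth / aspectHeight;
    double boundsRatio = bounds.width / bounds.height;
    SketchRect result = bounds;

    if (boundsRatio > targetRatio) {
        /* Bounds are wider than the target: keep the full height, trim the sides. */
        result.width = bounds.height * targetRatio;
        result.x = bounds.x + (bounds.width - result.width) / 2.0;
    } else {
        /* Bounds are taller (or equal): keep the full width, trim top and bottom. */
        result.height = bounds.width / targetRatio;
        result.y = bounds.y + (bounds.height - result.height) / 2.0;
    }
    return result;
}
```

For a 1920x1080 frame and a 299x299 target, this gives a 1080x1080 square starting at x = 420, y = 0, which is the region the delegate method above then scales down.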
Method 1: Data manipulation and Accelerate
The basic premise of this function is that it first crops to the specified rectangle and then scales to the final desired size. Cropping is achieved by simply ignoring the data outside the rectangle: the source pointer is offset to the rectangle's first pixel while the original row stride is kept. Scaling is done with Accelerate's vImageScale_ARGB8888 function. Again, thanks to CoreMLHelpers for the insight.
void assertCropAndScaleValid(CVPixelBufferRef pixelBuffer, CGRect cropRect, CGSize scaleSize) {
CGFloat originalWidth = (CGFloat)CVPixelBufferGetWidth(pixelBuffer);
CGFloat originalHeight = (CGFloat)CVPixelBufferGetHeight(pixelBuffer);
assert(CGRectContainsRect(CGRectMake(0, 0, originalWidth, originalHeight), cropRect));
assert(scaleSize.width > 0 && scaleSize.height > 0);
}
void pixelBufferReleaseCallBack(void *releaseRefCon, const void *baseAddress) {
if (baseAddress != NULL) {
free((void *)baseAddress);
}
}
// Returns a CVPixelBufferRef with +1 retain count
CVPixelBufferRef createCroppedPixelBuffer(CVPixelBufferRef sourcePixelBuffer, CGRect croppingRect, CGSize scaledSize) {
OSType inputPixelFormat = CVPixelBufferGetPixelFormatType(sourcePixelBuffer);
assert(inputPixelFormat == kCVPixelFormatType_32BGRA
|| inputPixelFormat == kCVPixelFormatType_32ABGR
|| inputPixelFormat == kCVPixelFormatType_32ARGB
|| inputPixelFormat == kCVPixelFormatType_32RGBA);
assertCropAndScaleValid(sourcePixelBuffer, croppingRect, scaledSize);
if (CVPixelBufferLockBaseAddress(sourcePixelBuffer, kCVPixelBufferLock_ReadOnly) != kCVReturnSuccess) {
NSLog(@"Could not lock base address");
return nil;
}
void *sourceData = CVPixelBufferGetBaseAddress(sourcePixelBuffer);
if (sourceData == NULL) {
NSLog(@"Error: could not get pixel buffer base address");
CVPixelBufferUnlockBaseAddress(sourcePixelBuffer, kCVPixelBufferLock_ReadOnly);
return nil;
}
size_t sourceBytesPerRow = CVPixelBufferGetBytesPerRow(sourcePixelBuffer);
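// Truncate the origin to whole pixels first; a fractional origin would otherwise produce an offset that lands mid-row or mid-pixel
size_t offset = (size_t)CGRectGetMinY(croppingRect) * sourceBytesPerRow + (size_t)CGRectGetMinX(croppingRect) * 4;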
vImage_Buffer croppedvImageBuffer = {
.data = ((char *)sourceData) + offset,
.height = (vImagePixelCount)CGRectGetHeight(croppingRect),
.width = (vImagePixelCount)CGRectGetWidth(croppingRect),
.rowBytes = sourceBytesPerRow
};
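// Use whole-pixel dimensions so the allocation matches the vImage_Buffer declared below
size_t scaledBytesPerRow = (size_t)scaledSize.width * 4;
void *scaledData = malloc((size_t)scaledSize.height * scaledBytesPerRow);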
if (scaledData == NULL) {
NSLog(@"Error: out of memory");
CVPixelBufferUnlockBaseAddress(sourcePixelBuffer, kCVPixelBufferLock_ReadOnly);
return nil;
}
vImage_Buffer scaledvImageBuffer = {
.data = scaledData,
.height = (vImagePixelCount)scaledSize.height,
.width = (vImagePixelCount)scaledSize.width,
.rowBytes = scaledBytesPerRow
};
/* The ARGB8888, ARGB16U, ARGB16S and ARGBFFFF functions work equally well on
* other channel orderings of 4-channel images, such as RGBA or BGRA.*/
vImage_Error error = vImageScale_ARGB8888(&croppedvImageBuffer, &scaledvImageBuffer, NULL, kvImageNoFlags);
CVPixelBufferUnlockBaseAddress(sourcePixelBuffer, kCVPixelBufferLock_ReadOnly);
if (error != kvImageNoError) {
NSLog(@"Error: vImageScale_ARGB8888 failed with code %ld", (long)error);
free(scaledData);
return nil;
}
OSType pixelFormat = CVPixelBufferGetPixelFormatType(sourcePixelBuffer);
CVPixelBufferRef outputPixelBuffer = NULL;
CVReturn status = CVPixelBufferCreateWithBytes(nil, scaledSize.width, scaledSize.height, pixelFormat, scaledData, scaledBytesPerRow, pixelBufferReleaseCallBack, nil, nil, &outputPixelBuffer);
if (status != kCVReturnSuccess) {
NSLog(@"Error: could not create new pixel buffer");
free(scaledData);
return nil;
}
return outputPixelBuffer;
}
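The "crop by ignoring data" trick can be hard to picture, so here is a small plain-C sketch of just that part, with no Accelerate or CoreVideo involved. SketchBuffer, sketchCropView and sketchFirstChannelAt are hypothetical names of mine; SketchBuffer only mirrors the fields of vImage_Buffer, and 4 bytes per pixel is assumed, as in the function above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto a 4-bytes-per-pixel image; mirrors the fields of vImage_Buffer. */
typedef struct {
    uint8_t *data;
    size_t width;    /* in pixels */
    size_t height;   /* in pixels */
    size_t rowBytes; /* bytes between the starts of consecutive rows */
} SketchBuffer;

/* Returns a view of the given region of `source`. No pixels are copied. */
SketchBuffer sketchCropView(SketchBuffer source, size_t cropX, size_t cropY,
                            size_t cropWidth, size_t cropHeight) {
    assert(cropX + cropWidth <= source.width);
    assert(cropY + cropHeight <= source.height);
    const size_t bytesPerPixel = 4;
    SketchBuffer view = {
        /* Skip cropY full rows, then cropX pixels into that row. */
        .data = source.data + cropY * source.rowBytes + cropX * bytesPerPixel,
        .width = cropWidth,
        .height = cropHeight,
        /* Unchanged: each row of the view still strides over a full source row. */
        .rowBytes = source.rowBytes
    };
    return view;
}

/* Reads the first byte (first channel) of pixel (x, y) in `buffer`. */
uint8_t sketchFirstChannelAt(SketchBuffer buffer, size_t x, size_t y) {
    assert(x < buffer.width && y < buffer.height);
    return buffer.data[y * buffer.rowBytes + x * 4];
}
```

Because the view keeps the source's rowBytes, pixel (0, 0) of the view is pixel (cropX, cropY) of the source, and stepping down one row skips everything outside the crop. This also shows why the view is only valid while the source buffer's base address is locked.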
Method 2: CoreImage
This method is much simpler to read, and has the benefit of being fairly agnostic to the pixel buffer format you pass in, which is a plus for certain use cases. Granted, you're limited to the formats Core Image supports.
CVPixelBufferRef createCroppedPixelBufferCoreImage(CVPixelBufferRef pixelBuffer,
CGRect cropRect,
CGSize scaleSize,
CIContext *context) {
assertCropAndScaleValid(pixelBuffer, cropRect, scaleSize);
CIImage *image = [CIImage imageWithCVImageBuffer:pixelBuffer];
image = [image imageByCroppingToRect:cropRect];
CGFloat scaleX = scaleSize.width / CGRectGetWidth(image.extent);
CGFloat scaleY = scaleSize.height / CGRectGetHeight(image.extent);
image = [image imageByApplyingTransform:CGAffineTransformMakeScale(scaleX, scaleY)];
// Due to the way -[CIContext render:toCVPixelBuffer:] works, we need to translate the image so the cropped section is at the origin
image = [image imageByApplyingTransform:CGAffineTransformMakeTranslation(-image.extent.origin.x, -image.extent.origin.y)];
CVPixelBufferRef output = NULL;
CVPixelBufferCreate(nil,
CGRectGetWidth(image.extent),
CGRectGetHeight(image.extent),
CVPixelBufferGetPixelFormatType(pixelBuffer),
nil,
&output);
if (output != NULL) {
[context render:image toCVPixelBuffer:output];
}
return output;
}
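If the translation step looks odd, it may help to follow the image extent through the transforms by hand. Below is a minimal plain-C sketch of that arithmetic; SketchExtent, sketchScaleExtent and sketchTranslateExtent are hypothetical helpers standing in for CGRect and the affine transforms, not Core Image API.

```c
#include <assert.h>

/* Hypothetical stand-in for CIImage.extent (a CGRect). */
typedef struct {
    double x, y, width, height;
} SketchExtent;

/* Scaling about the origin multiplies the position as well as the size. */
SketchExtent sketchScaleExtent(SketchExtent extent, double scaleX, double scaleY) {
    assert(scaleX > 0 && scaleY > 0);
    SketchExtent scaled = {
        extent.x * scaleX, extent.y * scaleY,
        extent.width * scaleX, extent.height * scaleY
    };
    return scaled;
}

/* Translation moves the position and leaves the size alone. */
SketchExtent sketchTranslateExtent(SketchExtent extent, double dx, double dy) {
    SketchExtent moved = { extent.x + dx, extent.y + dy, extent.width, extent.height };
    return moved;
}
```

Taking a 1080x1080 crop at x = 420 and scaling it by 299/1080 leaves the extent at roughly x = 116.3, not at 0. Translating by the negated origin, as sketchTranslateExtent(scaled, -scaled.x, -scaled.y) does here, puts it at (0, 0) so that it lines up with the output buffer.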
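The CIContext can be created at the call site, or created once and stored in a property. Since a context is relatively expensive to create, storing and reusing it is the better choice when you process every frame. For information about options, see the documentation.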
// Create a CIContext with default settings; this will
// typically use the GPU (via Metal) where supported
if (self.context == nil) {
self.context = [CIContext context];
}