I have a project that uses ScreenCaptureKit. For reasons outside the scope of this question, I configure ScreenCaptureKit to use kCVPixelFormatType_32BGRA -- I need the raw BGRA data, which gets manipulated later on.
When I construct a CGImage or NSImage from the data, displays and some windows look fine (full code is included at the bottom of the question -- this is just an excerpt of the conversion).
guard let cvPixelBuffer = sampleBuffer.imageBuffer else { return }

CVPixelBufferLockBaseAddress(cvPixelBuffer, .readOnly)
defer { CVPixelBufferUnlockBaseAddress(cvPixelBuffer, .readOnly) }

let vImageBuffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(cvPixelBuffer),
                                 height: vImagePixelCount(CVPixelBufferGetHeight(cvPixelBuffer)),
                                 width: vImagePixelCount(CVPixelBufferGetWidth(cvPixelBuffer)),
                                 rowBytes: CVPixelBufferGetWidth(cvPixelBuffer) * 4)

let cgImageFormat: vImage_CGImageFormat = vImage_CGImageFormat(
    bitsPerComponent: 8,
    bitsPerPixel: 32,
    colorSpace: CGColorSpaceCreateDeviceRGB(),
    bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
    renderingIntent: .defaultIntent
)!

if let cgImage = try? vImageBuffer.createCGImage(format: cgImageFormat) {
    let nsImage = NSImage(cgImage: cgImage, size: .init(width: CGFloat(cgImage.width), height: CGFloat(cgImage.height)))
    Task { @MainActor in
        self.image = nsImage
    }
}
The resulting image for displays looks reasonable (except for incorrect color, since the incoming data is BGRA and the CGImage expects RGBA -- that's dealt with elsewhere in my project).
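(For completeness, the channel swap I do elsewhere is roughly the following -- a sketch using Accelerate's vImagePermuteChannels_ARGB8888, where srcBuffer and destBuffer are placeholder names for 8-bit, 4-channel vImage_Buffers:)

```swift
import Accelerate

// Sketch: reorder BGRA -> RGBA.
// Each entry of permuteMap says which source channel feeds that
// destination channel: dest R,G,B,A come from src indices 2,1,0,3.
let permuteMap: [UInt8] = [2, 1, 0, 3]
var src = srcBuffer   // placeholder: BGRA input
var dest = destBuffer // placeholder: RGBA output
let error = vImagePermuteChannels_ARGB8888(&src, &dest, permuteMap, vImage_Flags(kvImageNoFlags))
assert(error == kvImageNoError)
```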
However, some windows (not all) get a very odd distortion and tearing effect. Here's Calendar.app for example:
Here is Mail.app, which is less broken:
As far as I can tell, the formats of the CVPixelBuffer are the same in each case. When I inspect the CVPixelBuffer in the debugger (instead of doing the conversion to CGImage/NSImage), it displays perfectly in Quick Look, so it's not that the actual data is damaged either -- there's just something about the format I don't understand.
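For reference, this is the kind of diagnostic I can drop into the sample handler to dump what CoreVideo reports about each buffer (these are all standard CVPixelBuffer accessors; nothing here is specific to my project):

```swift
import CoreVideo

// Print the geometry CoreVideo reports for a captured buffer.
func dumpGeometry(of pixelBuffer: CVPixelBuffer) {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let format = CVPixelBufferGetPixelFormatType(pixelBuffer)
    print("\(width) x \(height), bytesPerRow: \(bytesPerRow) (width * 4 = \(width * 4)), format: \(format)")
}
```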
Question:
How can I get the RGBA data reliably from these windows in the same way that it is always returned for displays?
Full, runnable sample code:
import Accelerate
import ScreenCaptureKit
import SwiftUI

class ScreenCaptureManager: NSObject, ObservableObject {
    @Published var availableWindows: [SCWindow] = []
    @Published var availableDisplays: [SCDisplay] = []
    @Published var image: NSImage?

    private var stream: SCStream?
    private let videoSampleBufferQueue = DispatchQueue(label: "com.sample.VideoSampleBufferQueue")

    func getAvailableContent() {
        Task { @MainActor in
            do {
                let availableContent: SCShareableContent = try await SCShareableContent.excludingDesktopWindows(true,
                                                                                                                onScreenWindowsOnly: true)
                self.availableWindows = availableContent.windows
                self.availableDisplays = availableContent.displays
            } catch {
                print(error)
            }
        }
    }

    func basicStreamConfig() -> SCStreamConfiguration {
        let streamConfig = SCStreamConfiguration()
        streamConfig.minimumFrameInterval = CMTime(value: 1, timescale: 5)
        streamConfig.showsCursor = true
        streamConfig.queueDepth = 5
        streamConfig.pixelFormat = kCVPixelFormatType_32BGRA
        return streamConfig
    }

    func startCaptureForDisplay(display: SCDisplay) {
        Task { @MainActor in
            try? await stream?.stopCapture()

            let filter = SCContentFilter(display: display, including: availableWindows)
            let streamConfig = basicStreamConfig()
            streamConfig.width = Int(display.frame.width * 2)
            streamConfig.height = Int(display.frame.height * 2)

            stream = SCStream(filter: filter, configuration: streamConfig, delegate: self)
            do {
                try stream?.addStreamOutput(self, type: .screen, sampleHandlerQueue: videoSampleBufferQueue)
                try await stream?.startCapture()
            } catch {
                print("ERROR: ", error)
            }
        }
    }

    func startCaptureForWindow(window: SCWindow) {
        Task { @MainActor in
            try? await stream?.stopCapture()

            let filter = SCContentFilter(desktopIndependentWindow: window)
            let streamConfig = basicStreamConfig()
            streamConfig.width = Int(window.frame.width * 2)
            streamConfig.height = Int(window.frame.height * 2)

            stream = SCStream(filter: filter, configuration: streamConfig, delegate: self)
            do {
                try stream?.addStreamOutput(self, type: .screen, sampleHandlerQueue: videoSampleBufferQueue)
                try await stream?.startCapture()
            } catch {
                print(error)
            }
        }
    }
}
extension ScreenCaptureManager: SCStreamOutput, SCStreamDelegate {
    func stream(_: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of _: SCStreamOutputType) {
        guard let cvPixelBuffer = sampleBuffer.imageBuffer else { return }
        print("PixelBuffer", cvPixelBuffer)

        CVPixelBufferLockBaseAddress(cvPixelBuffer, .readOnly)
        defer {
            CVPixelBufferUnlockBaseAddress(cvPixelBuffer, .readOnly)
        }

        let vImageBuffer = vImage_Buffer(data: CVPixelBufferGetBaseAddress(cvPixelBuffer),
                                         height: vImagePixelCount(CVPixelBufferGetHeight(cvPixelBuffer)),
                                         width: vImagePixelCount(CVPixelBufferGetWidth(cvPixelBuffer)),
                                         rowBytes: CVPixelBufferGetWidth(cvPixelBuffer) * 4)

        let cgImageFormat: vImage_CGImageFormat = vImage_CGImageFormat(
            bitsPerComponent: 8,
            bitsPerPixel: 32,
            colorSpace: CGColorSpaceCreateDeviceRGB(),
            bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
            renderingIntent: .defaultIntent
        )!

        if let cgImage = try? vImageBuffer.createCGImage(format: cgImageFormat) {
            let nsImage = NSImage(cgImage: cgImage, size: .init(width: CGFloat(cgImage.width), height: CGFloat(cgImage.height)))
            Task { @MainActor in
                self.image = nsImage
            }
        }
    }

    func stream(_: SCStream, didStopWithError error: Error) {
        print("JN: Stream error", error)
    }
}
struct ContentView: View {
    @StateObject private var screenCaptureManager = ScreenCaptureManager()

    var body: some View {
        HStack {
            ScrollView {
                ForEach(screenCaptureManager.availableDisplays, id: \.displayID) { display in
                    HStack {
                        Text("Display: \(display.width) x \(display.height)")
                    }.frame(height: 60).frame(maxWidth: .infinity).border(Color.black).contentShape(Rectangle())
                        .onTapGesture {
                            screenCaptureManager.startCaptureForDisplay(display: display)
                        }
                }
                ForEach(screenCaptureManager.availableWindows.filter { $0.title != nil && !$0.title!.isEmpty }, id: \.windowID) { window in
                    HStack {
                        Text(window.title!)
                    }.frame(height: 60).frame(maxWidth: .infinity).border(Color.black).contentShape(Rectangle())
                        .onTapGesture {
                            screenCaptureManager.startCaptureForWindow(window: window)
                        }
                }
            }
            .frame(width: 200)

            Divider()

            if let image = screenCaptureManager.image {
                Image(nsImage: image)
                    .resizable()
                    .aspectRatio(contentMode: .fit)
                    .frame(maxWidth: .infinity, maxHeight: .infinity)
            }
        }
        .frame(width: 800, height: 600, alignment: .leading)
        .onAppear {
            screenCaptureManager.getAvailableContent()
        }
    }
}
(Note: I know that displaying an NSImage of the captured content is not the most efficient way to preview it -- it's merely to demonstrate the issue here.)