I'm checking out an Apple project that demonstrates how to separate a person from the background using the TrueDepth front camera. You can see that here:
It works pretty well; however, when I rotate my face, the depth map sometimes loses data points and clips off parts of my face and ear. You can see an example here: https://streamable.com/cstex
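For reference, this is roughly how I'm pulling the live depth frames (a minimal sketch of the standard AVCaptureDepthDataOutput setup, not Apple's exact sample code; even with isFilteringEnabled smoothing over missing samples, the edges around my ear still get clipped):

```swift
import AVFoundation

final class DepthCapture: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    private let depthOutput = AVCaptureDepthDataOutput()
    private let depthQueue = DispatchQueue(label: "depth.queue")

    func configure() throws {
        // Front TrueDepth camera as the input device.
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video,
                                                   position: .front) else { return }
        session.beginConfiguration()

        let input = try AVCaptureDeviceInput(device: device)
        if session.canAddInput(input) { session.addInput(input) }

        // Depth output; isFilteringEnabled interpolates over missing samples,
        // which smooths the map but still loses edge detail (face/ear).
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
            depthOutput.isFilteringEnabled = true
            depthOutput.setDelegate(self, callbackQueue: depthQueue)
        }

        session.commitConfiguration()
    }

    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        // Convert to 32-bit disparity before thresholding it into a foreground mask.
        let disparity = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
        let pixelBuffer = disparity.depthDataMap
        // ... threshold pixelBuffer into a person mask here ...
        _ = pixelBuffer
    }
}
```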
Does anyone have any ideas on how to improve this? The AVPortraitEffectsMatte object is perfect if you use AVCapturePhotoOutput, but it doesn't seem usable for a live video feed because the processing time is too long to keep up with the frame rate.
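For comparison, this is roughly the still-capture path where the matte looks great (standard AVCapturePhotoOutput portrait-matte delivery; the class and method names are just my own sketch):

```swift
import AVFoundation

final class MatteCapture: NSObject, AVCapturePhotoCaptureDelegate {
    private let photoOutput = AVCapturePhotoOutput()

    // Call after the photo output has been added to a running session
    // whose input is the TrueDepth camera.
    func enableMatteDelivery() {
        photoOutput.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliverySupported
        photoOutput.isPortraitEffectsMatteDeliveryEnabled = photoOutput.isPortraitEffectsMatteDeliverySupported
    }

    func capture() {
        let settings = AVCapturePhotoSettings()
        // Portrait effects matte delivery requires depth delivery on the same capture.
        settings.isDepthDataDeliveryEnabled = true
        settings.isPortraitEffectsMatteDeliveryEnabled = true
        photoOutput.capturePhoto(with: settings, delegate: self)
    }

    func photoOutput(_ output: AVCapturePhotoOutput,
                     didFinishProcessingPhoto photo: AVCapturePhoto,
                     error: Error?) {
        // High-quality person matte, but the round trip through the photo
        // pipeline is far too slow to run once per video frame.
        guard let matte = photo.portraitEffectsMatte else { return }
        let mattePixelBuffer = matte.mattingImage
        _ = mattePixelBuffer
    }
}
```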
I noticed that Apple's Clips app gets perfect results, without clipping any of my face or ears, and it maintains a good frame rate: https://streamable.com/5n96h

Since their app does not lose detail, it either is not relying solely on depth data, or it is running the depth data through a model to improve it (maybe something similar to the proprietary model they use to generate AVPortraitEffectsMatte).
Any ideas on how to get a similar result, or how they achieved it?