I am developing a Windows application that displays a high-quality video feed, records it or takes photos from it, and lets the user edit them later (up to 4K, maybe 8K in the near future). I currently have a working product in WPF (C#), using the AForge.NET library for capturing and displaying video.
My problem is that the application is really slow, with the main performance hit coming from video rendering. Apparently the only way to do this is to register a callback with the AForge library that provides a new frame every time one is available. That frame is then converted and assigned as the image of a WPF Image element. I believe you can see where the performance hit comes from, especially for high-resolution imagery.
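To make the current pipeline concrete, here is roughly what it looks like (a simplified sketch: `PreviewImage` is an illustrative name for my XAML `Image` element, and I assume 24bpp BGR frames):

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Media.Imaging;
using AForge.Video;

// Called by AForge on its own capture thread for every new frame.
private void OnNewFrame(object sender, NewFrameEventArgs e)
{
    // e.Frame is a System.Drawing.Bitmap owned by AForge, so clone it
    // before the callback returns.
    using (var frame = (Bitmap)e.Frame.Clone())
    {
        // Copy the pixels into a new BitmapSource on the UI thread —
        // a full CPU copy per frame, which is where the cost is.
        Dispatcher.Invoke(() =>
        {
            var data = frame.LockBits(
                new Rectangle(0, 0, frame.Width, frame.Height),
                ImageLockMode.ReadOnly, frame.PixelFormat);
            try
            {
                var source = BitmapSource.Create(
                    frame.Width, frame.Height, 96, 96,
                    System.Windows.Media.PixelFormats.Bgr24, null,
                    data.Scan0, data.Stride * frame.Height, data.Stride);
                PreviewImage.Source = source;
            }
            finally
            {
                frame.UnlockBits(data);
            }
        });
    }
}
```

At 4K this means allocating and copying roughly 25 MB per frame on the UI thread, which is why the rendering dominates.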
My experience with WPF and these enormous libraries has made me rethink how I want to program in general; I do not want to make bad software that wastes everyone's time by being slow (see the Handmade Network for more on the "why").
The problem is that camera capture and display was hell in WPF/C#, but I do not seem to be better off anywhere else (on Windows, that is). One option would be to use mostly C++ and DirectShow. This is an okay-ish solution, but it feels outdated in terms of performance, and it is built on Microsoft's COM system, which I would prefer to avoid. There are options to render with hardware using Direct3D, but DirectShow and Direct3D do not play nicely together.
I have researched how other applications achieve this. VLC uses DirectShow, but in my tests it suffers from large latency; I assume this is because VLC was not intended for real-time purposes. OBS Studio uses whatever Qt uses, but I was unable to find out how it does it. OpenCV grabs frames and blits them to the screen, which is not efficient at all, but that suffices for the OpenCV audience. Lastly, there is the built-in Windows camera app. For some reason that app is able to record and play back in real time without a large performance hit. I was not able to figure out how it does this, nor did I find any other solution achieving comparable results.
TL;DR: My questions are:

1. How would I go about efficiently capturing and rendering a camera stream, preferably hardware accelerated?
2. Is it possible to do this on Windows without going through DirectShow?
3. Am I asking too much of commodity devices when I want them to process 4K footage in real time?
I have not found anyone doing this in a way that satisfies my needs, which makes me feel both desperate and guilty at the same time; I would have preferred not to bother Stack Overflow with this problem.
Many thanks in advance for an answer, or for advice on this topic in general.