
I am trying to create a Python application that needs to capture the contents of an application window whose image is rendered with the GPU.

I initially tried using ffmpeg with gdigrab to capture the window content, but the result is a gray screen. After some research, I believe this is because ffmpeg cannot capture content from a window that uses GPU acceleration.

I then tried the mss library in Python. I could capture the entire screen or a specified region without a problem, but when the target window is overlapped by other windows, the capture contains all of the overlapping windows, not just the target application.

Interestingly, OBS Studio can capture the window content correctly when the "Capture Method" is set to "Windows 10 (1903 and newer)", but not when it's set to "BitBlt (Windows 7 and newer)". However, OBS does not offer a native API for Python that could allow for easy integration with other applications.

Is there a way to capture the content of a window that uses GPU acceleration in Python, using ffmpeg or another library or application?

Any advice or direction would be appreciated.



EDIT: I found an option that manages to get the image directly from the application window: https://stackoverflow.com/a/76399855/14861684

I tried to implement this solution in my application, using the capture_win_alt() function to capture each frame. Its output is piped to ffmpeg, which builds a recording from the received frames. Unfortunately, the quality of the recording is not satisfactory. For example, a recording that should last 60 s at 60 FPS (3,600 frames in total) actually lasts only about 5 seconds and looks heavily sped up.

It is as if I wanted to record 10 seconds at 20 FPS but captured only one frame per second. That gives 10 frames in total (instead of 200), which are then played back at 20 FPS, so the recording lasts 10/20 = 0.5 s instead of 10 s. As a result, the recording is much shorter and looks very sped up.
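The arithmetic above can be checked with a short sketch. The helper names `expected_frames` and `playback_seconds` are hypothetical and only illustrate the relationship between frame count, frame rate, and duration.

```python
def expected_frames(duration_s: float, fps: float) -> int:
    """Number of frames a recording of duration_s seconds needs at a constant fps."""
    return round(duration_s * fps)


def playback_seconds(frames_written: int, output_fps: float) -> float:
    """Playback length when every written frame is shown for 1/output_fps seconds."""
    return frames_written / output_fps


# 10 s at 20 FPS needs 200 frames; only 10 captured frames play back in 0.5 s.
needed = expected_frames(10, 20)
actual_length = playback_seconds(10, 20)
```

The gap between `needed` and the number of frames actually written is what makes the recording shorter and faster than real time.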

My current implementation seems very inefficient. How can I synchronize the capture rate with the FPS of the recording? How can I guard against lost frames, for example by inserting a black frame or by using some form of variable frame rate?
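One possible way to keep the recording at a constant frame rate is to give every output slot (one per 1/FPS seconds) the most recent captured frame, repeating a frame when the capture was too slow and falling back to a black frame before the first capture arrives. The sketch below only shows the slot assignment, not the capture or the ffmpeg pipe. `assign_frames_to_slots` and its parameters are hypothetical names, and the capture timestamps are assumed to be seconds since the start of the recording, sorted in ascending order.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def assign_frames_to_slots(
    capture_times: Sequence[float], fps: float, duration_s: float
) -> List[Optional[int]]:
    """For each output slot, return the index of the captured frame to write.

    None means no frame had been captured yet, so the caller would write
    a placeholder (for example a black frame) for that slot.
    """
    n_slots = round(duration_s * fps)
    slots: List[Optional[int]] = []
    for k in range(n_slots):
        slot_time = k / fps
        # Index of the latest capture taken at or before this slot's time.
        i = bisect_right(capture_times, slot_time) - 1
        slots.append(i if i >= 0 else None)
    return slots


# Captures arrived only once per second, but the output runs at 2 FPS.
slots = assign_frames_to_slots([0.0, 1.0, 2.0], fps=2, duration_s=3)
```

Here each captured frame is reused for two consecutive slots, so the output still contains 6 frames and plays back for the full 3 seconds.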

PS: I need very accurate timing in the recording because my application simultaneously tracks various data and logs it with timestamps, so that each event can be matched to the corresponding moment in the recording.
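For matching logged events to the recording, one simple approach is to convert each event timestamp to a frame index. This is a sketch under two assumptions: the recording has a constant frame rate, and the recording start time is logged with the same clock as the events. `event_to_frame` is a hypothetical helper name.

```python
import math


def event_to_frame(recording_start_s: float, event_time_s: float, fps: float) -> int:
    """Index of the frame that is on screen at event_time_s."""
    offset = event_time_s - recording_start_s
    if offset < 0:
        raise ValueError("event happened before the recording started")
    # Frame k covers the interval [k / fps, (k + 1) / fps).
    return math.floor(offset * fps)


# An event logged 2.5 s after the start of a 60 FPS recording.
frame_index = event_to_frame(100.0, 102.5, 60)
```

This mapping only holds if no frames are dropped from the output, which is why the constant frame rate matters.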

Xsarin

1 Answer


You can capture the screen from Python with a library such as pyautogui. However, this doesn't work with hardware-accelerated apps (which use the GPU), because they render directly to the screen, bypassing the typical frame buffer that pyautogui is able to capture.

To capture such applications, you usually need more advanced methods, such as game capture software or graphics APIs like DirectX or Vulkan, which are much more complex and are not usually driven from Python.

Piepypye
  • Hi, thank you for your reply and explanation. I found some "simple" way to retrieve the frame from the application window but after implementation I am not satisfied with the result. I described it in the post. Maybe you will be able to help? – Xsarin Jun 20 '23 at 09:29