I'm currently trying to write a script to detect text in an OBS video stream using Python/OpenCV.
From every n-th frame, I need to detect text within several specific rectangular regions (an example can be found in the attachment). The coordinates of these regions are the same for every frame.
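To make the idea concrete, here is a rough sketch of the frame sampling and cropping I have in mind. It is only an illustration under assumptions: the region names and coordinates in `REGIONS` are made up, the step of 30 frames is arbitrary, and I am assuming (without having verified it) that the OBS output can be opened with `cv2.VideoCapture` as an ordinary capture device at index 0, for example through OBS's virtual camera. The text detection itself is left out.

```python
from itertools import islice

# Hypothetical region names and coordinates, given as (x, y, width, height) in pixels.
REGIONS = {
    "score": (50, 40, 200, 60),
    "timer": (600, 40, 120, 60),
}


def every_nth(frames, n):
    """Return an iterator over every n-th frame, starting with the first one."""
    return islice(frames, 0, None, n)


def crop_regions(frame, regions):
    """Cut each fixed region out of a frame (a NumPy array, as OpenCV returns it)."""
    return {
        name: frame[y:y + h, x:x + w]
        for name, (x, y, w, h) in regions.items()
    }


def read_frames(device_index=0):
    """Yield frames from a capture device until it stops delivering them."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    capture = cv2.VideoCapture(device_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield frame
    finally:
        capture.release()


if __name__ == "__main__":
    # Assumption: device 0 is the OBS output; the index may differ per machine.
    for frame in every_nth(read_frames(0), 30):
        crops = crop_regions(frame, REGIONS)
        # Each crop would then go to whichever text detector turns out to be suitable.
        print({name: crop.shape for name, crop in crops.items()})
```

Because the regions never change, plain array slicing seems sufficient for the cropping step, so the open part is really which detector to run on each crop.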
My questions:
- Is OpenCV the best tool for this task?
- Which OpenCV function should I use to restrict text detection to multiple bounding regions?
- Is there a way to use a video stream from OBS as the input to my script?
Thank you for your help!