I'm relatively new to C++, especially multi-threading, and have been playing with different options. I've gotten some things to work, but ultimately I'm looking to maximize performance, so I think it's better to ask what would be most effective and then go down that road.
I'm working on an application that will take a video stream and write an unmodified video file and a modified video file (there's some image processing that happens) to the local disk. There's also going to be some other threads to collect some other GPS data, etc, but nothing special.
The problem I'm running into is that the frame rate is limited mainly by the VideoWriter function in OpenCV. I know this can be greatly alleviated if I use a thread to write each frame to the VideoWriter object, so that the two VideoWriters can run simultaneously with each other and with the image processing functions.
I've successfully created this function:
void frameWriter(Mat writeFrame, VideoWriter *orgVideo)
{
(orgVideo->write(writeFrame));
}
And it is called from within an infinite loop like so:
thread writeOrgThread(frameWriter, modFrame, &orgVideo);
writeOrgThread.join();
thread writeModThread(frameWriter, processMatThread(modFrame, scrnMsg1, scrnMsg2), &modVideo);
writeModThread.join();
Now having the .join() immediately underneath defeats the performance benefits, but without it I immediately get the error "terminate called without an active exception". I thought it would do what I needed if I put the join() functions above, so on the next loop it'd make sure the previous frame was written before writing the next, but then it behaves as if the join is not there (perhaps by the time the main task has made the full loop and gotten to the join, the thread is already terminated?). Also, using detach I think creates the issue that the threads are unsynchronized and then I run into these errors:
[mpeg4 @ 0000000000581b40] Invalid pts (156) <= last (156)
[mpeg4 @ 00000000038d5380] Invalid pts (158) <= last (158)
[mpeg4 @ 0000000000581b40] Invalid pts (158) <= last (158)
[mpeg4 @ 00000000038d5380] Invalid pts (160) <= last (160)
[mpeg4 @ 0000000000581b40] [mpeg4 @ 00000000038d5380] Invalid pts (160) <= last
(160)
Invalid pts (162) <= last (162)
I'm assuming this is because multiple threads are trying to access the same resource? Finally, I tried using a mutex with detach to avoid the above, and I got a curious behavior where my sleep thread wasn't behaving properly and the frame rate was inconsistent.
void frameWriter(Mat writeFrame, VideoWriter *orgVideo, mutex *mutexVid)
{
(mutexVid->lock());
(orgVideo->write(writeFrame));
(mutexVid->unlock());
}
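For comparison, here is a minimal sketch of the same locking idea written with std::lock_guard instead of manual lock()/unlock() calls. It is only an illustration: FakeWriter is a hypothetical stand-in for VideoWriter (so the sketch builds without OpenCV), an int stands in for a Mat frame, and frameWriterGuarded and writeConcurrently are made-up names. Note that the mutex only stops two write() calls from overlapping; it does not by itself decide which thread's frame is written first.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::VideoWriter; it just records frame numbers.
struct FakeWriter {
    std::vector<int> frames;
    void write(int frame) { frames.push_back(frame); }
};

// Same shape as frameWriter above, but the mutex is released automatically
// when `lock` goes out of scope, even if write() throws.
void frameWriterGuarded(int writeFrame, FakeWriter *video, std::mutex *mutexVid)
{
    std::lock_guard<std::mutex> lock(*mutexVid);
    video->write(writeFrame);
}

// Starts `count` threads that all write to one writer, joins them all,
// and returns how many frames the writer received.
std::size_t writeConcurrently(int count)
{
    assert(count >= 0);
    FakeWriter video;
    std::mutex videoMutex;
    std::vector<std::thread> workers;
    for (int i = 0; i < count; ++i) {
        workers.emplace_back(frameWriterGuarded, i, &video, &videoMutex);
    }
    // Every joinable thread must be joined (or detached) before destruction.
    for (std::thread &t : workers) {
        t.join();
    }
    return video.frames.size();
}
```

Calling writeConcurrently(8) should leave eight frames in the writer, but the order in which they arrive is up to the scheduler.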
Obviously I'm struggling with thread synchronization and management of shared resources. I realize this is probably a rookie struggle, so if somebody tossed a tutorial link at me and told me to go read a book I'd be OK with that. I guess what I'm looking for right now is some guidance on which specific method is going to get me the best performance in this situation, and then I'll make that work.
Additionally, does anyone have a link to a very good tutorial that covers multithreading in C++ from a broader point of view (not limited to the Boost or C++11 implementation, and covering mutexes, etc.)? It could greatly help me out with this.
Here's the 'complete' code, I stripped out some functions to make it easier to read, so don't mind the extra variable here and there:
//Standard libraries
#include <iostream>
#include <ctime>
#include <sys/time.h>
#include <fstream>
#include <iomanip>
#include <thread>
#include <chrono>
#include <mutex>
//OpenCV libraries
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
//Other libraries
//Namespaces
using namespace cv;
using namespace std;
// Code for capture thread
void captureMatThread(Mat *orgFrame, VideoCapture *stream1){
//loop infinitely
for(;;){
//capture from webcam to Mat orgFrame
(*stream1) >> (*orgFrame);
}
}
Mat processMatThread(Mat inFrame, string scrnMsg1, string scrnMsg2){
//Fancify image
putText(inFrame, scrnMsg1, cvPoint(545,450),CV_FONT_HERSHEY_COMPLEX,
0.5,CvScalar(255,255,0,255),1,LINE_8,false);
putText(inFrame, scrnMsg2, cvPoint(395,470),CV_FONT_HERSHEY_COMPLEX,
0.5,CvScalar(255,255,0,255),1,LINE_8,false);
return inFrame;
}
void frameWriter(Mat writeFrame, VideoWriter *orgVideo, mutex *mutexVid)
{
//(mutexVid->lock());
(orgVideo->write(writeFrame));
//(mutexVid->unlock());
}
long usecDiff(long usec1, long usec2){
if (usec1>usec2){
return usec1 - usec2;
}
else {
return (1000000 + usec1) - usec2;
}
}
int main()
{
//Start video capture
cout << "Opening camera stream..." << endl;
VideoCapture stream1(0);
if (!stream1.isOpened()) {
cout << "Camera failed to open!" << endl;
return 1;
}
//Message incoming image size
cout << "Camera stream opened. Incoming size: ";
cout << stream1.get(CV_CAP_PROP_FRAME_WIDTH) << "x";
cout << stream1.get(CV_CAP_PROP_FRAME_HEIGHT) << endl;
//File locations
const long fileSizeLimitBytes(10485760);
const int fileNumLimit(5);
const string outPath("C:\\users\\nag1\\Desktop\\");
string outFilename("out.avi");
string inFilename("in.avi");
//Declare variables for screen messages
timeval t1;
timeval t2;
timeval t3;
time_t now(time(0));
gettimeofday(&t1,0);
gettimeofday(&t2,0);
gettimeofday(&t3,0);
float FPS(0.0f);
const int targetFPS(60);
const long targetUsec(1000000/targetFPS);
long usec(0);
long usecProcTime(0);
long sleepUsec(0);
int i(0);
stringstream scrnMsgStream;
string scrnMsg1;
string scrnMsg2;
string scrnMsg3;
//Define images
Mat orgFrame;
Mat modFrame;
//Start Video writers
cout << "Creating initial video files..." << endl;
//Identify outgoing size, comments use incoming size
const int frame_width = 640; //stream1.get(CV_CAP_PROP_FRAME_WIDTH);
const int frame_height = 480; //stream1.get(CV_CAP_PROP_FRAME_HEIGHT);
//Message outgoing image size
cout << "Outgoing size: ";
cout << frame_width << "x" << frame_height << endl;
VideoWriter orgVideo(outPath + inFilename,CV_FOURCC('D','I','V','X'),targetFPS,
Size(frame_width,frame_height),true);
mutex orgVideoMutex;
VideoWriter modVideo(outPath + outFilename,CV_FOURCC('D','I','V','X'),targetFPS,
Size(frame_width,frame_height),true);
mutex modVideoMutex;
//unconditional loop
cout << "Starting recording..." << endl;
//Get first image to prevent exception
stream1.read(orgFrame);
resize(orgFrame,modFrame,Size(frame_width,frame_height));
// start thread to begin capture and populate Mat frame
thread captureThread(captureMatThread, &orgFrame, &stream1);
while (true) {
//Time stuff
i++;
if (i%2==0){
gettimeofday(&t1,0);
usec = usecDiff(t1.tv_usec,t2.tv_usec);
}
else{
gettimeofday(&t2,0);
usec = usecDiff(t2.tv_usec,t1.tv_usec);
}
now = time(0);
FPS = 1000000.0f/usec;
scrnMsgStream.str(std::string());
scrnMsgStream.precision(2);
scrnMsgStream << std::setprecision(2) << std::fixed << FPS;
scrnMsg1 = scrnMsgStream.str() + " FPS";
scrnMsg2 = asctime(localtime(&now));
//Get image
//Handled by captureMatThread now!!!
//stream1.read(orgFrame);
//resize image
resize(orgFrame,modFrame,Size(frame_width,frame_height));
//write original image to video
//writeOrgThread.join();
thread writeOrgThread(frameWriter, modFrame, &orgVideo, &orgVideoMutex);
//writeOrgThread.join();
writeOrgThread.detach();
//orgVideo.write(modFrame);
//write modified image to video
//writeModThread.join();
thread writeModThread(frameWriter, processMatThread(modFrame, scrnMsg1, scrnMsg2), &modVideo, &modVideoMutex);
//writeOrgThread.join();
//writeModThread.join();
writeModThread.detach();
//modVideo.write(processMatThread(modFrame, scrnMsg1, scrnMsg2));
//sleep
gettimeofday(&t3,0);
if (i%2==0){
sleepUsec = targetUsec - usecDiff(t3.tv_usec,t1.tv_usec);
}
else{
sleepUsec = targetUsec - usecDiff(t3.tv_usec,t2.tv_usec);
}
this_thread::sleep_for(chrono::microseconds(sleepUsec));
}
orgVideo.release();
modVideo.release();
return 0;
}
This is actually running on a Raspberry Pi (adapted to use the Raspberry Pi camera), so my resources are limited. That's why I'm trying to minimize how many copies of the image there are and to implement parallel writing of the video files. You can see I've also experimented with placing both of the join() calls after "writeModThread", so at least the two files are written in parallel. Perhaps that's the best we can do, but I plan to add a thread with all the image processing that I'd like to run in parallel (right now it is called as a simple function that adds plain text).
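The "both joins after writeModThread" arrangement described above can be sketched roughly as follows. This is a simplified illustration, not the real program: FakeWriter is a hypothetical stand-in for VideoWriter, plain ints stand in for Mat frames, adding 1000 stands in for processMatThread, and recordBoth is a made-up helper name. Because each writer is only ever touched by one thread at a time and both threads are joined before the next iteration, no mutex is needed in this form.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::VideoWriter; it just records frame numbers.
struct FakeWriter {
    std::vector<int> frames;
    void write(int frame) { frames.push_back(frame); }
};

void frameWriter(int writeFrame, FakeWriter *video)
{
    video->write(writeFrame);
}

// Runs `frameCount` loop iterations. In each one, both writers run in
// parallel with each other, and both are joined before the next frame.
// Returns true if each writer received every frame, in order.
bool recordBoth(int frameCount)
{
    assert(frameCount >= 0);
    FakeWriter orgVideo;
    FakeWriter modVideo;
    for (int frame = 0; frame < frameCount; ++frame) {
        int modFrame = frame + 1000; // stand-in for the processed image
        std::thread writeOrgThread(frameWriter, frame, &orgVideo);
        std::thread writeModThread(frameWriter, modFrame, &modVideo);
        // Joining only after both threads have started lets the two
        // writes overlap, while still finishing before the next frame.
        writeOrgThread.join();
        writeModThread.join();
    }
    if (orgVideo.frames.size() != static_cast<std::size_t>(frameCount) ||
        modVideo.frames.size() != static_cast<std::size_t>(frameCount)) {
        return false;
    }
    for (int frame = 0; frame < frameCount; ++frame) {
        if (orgVideo.frames[frame] != frame ||
            modVideo.frames[frame] != frame + 1000) {
            return false;
        }
    }
    return true;
}
```

In this arrangement the main loop still waits for the slower of the two writes on every frame, so it only overlaps the two files with each other, not with capture or processing of the next frame.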