
I recently started working with OpenCV. While I made a lot of progress in the beginning, I'm now close to losing my mind over what appears to be a simple task: recording a video with the Raspberry Pi camera. The problem is that the resulting video appears fast-forwarded, and the reason seems to be that not even half of the necessary frames are actually written during recording.

After many hours of experimenting with codecs and timing my code to find the bottleneck, I discovered that the problem appears to be related to the OpenCV VideoCapture class, whose instance in my code delivers far fewer frames than expected.

So I wrote a small piece of code that counts the number of frames delivered by VideoCapture in five seconds. Setting the capture's properties to 640x480 at 30 fps works fine and delivers around 150 frames. But dialing it up to 1920x1080 at 30 fps (which is a valid camera mode according to the specs and works fine in other applications) yields only around 15 frames in 5 seconds.

There is probably a very obvious solution, but I'm totally blanking. Can anyone help me? Thanks!

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <ctime>

float getElapsedCPUTime(std::clock_t begin){
    return float(std::clock() - begin) / CLOCKS_PER_SEC;
}

std::time_t getCurrentWallTime(){
    return std::time(nullptr);
}

int main (){ 
//  int cols(640);
//  int rows(480);
    int cols(1920);
    int rows(1080);
    cv::Mat currentFrame;

    // set capture properties
    cv::VideoCapture cap(0);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, rows);
    cap.set(cv::CAP_PROP_FRAME_WIDTH, cols);
    cap.set(cv::CAP_PROP_FPS, 30);
    cap.set(cv::CAP_PROP_FOURCC, 0x21);

    // control capture properties
    int rows_c(cap.get(cv::CAP_PROP_FRAME_HEIGHT));
    int cols_c(cap.get(cv::CAP_PROP_FRAME_WIDTH));
    int fps(cap.get(cv::CAP_PROP_FPS));
    std::cout << "rows: " << rows_c << ", cols " << cols_c << ", fps " << fps << ", CPS: " << CLOCKS_PER_SEC << std::endl;

    int cnt(0);
    std::time_t loopExecution_begin(getCurrentWallTime());
    while(1){
        std::string msg("");

        // capture frame
        std::clock_t capture_begin(std::clock());
        cap >> currentFrame;
        float time_for_capture = getElapsedCPUTime(capture_begin);
        ++cnt;

        // get elapsed wall time
        std::time_t loopRunTime = getCurrentWallTime() - loopExecution_begin;

        // output message
        msg += "#: " + std::to_string(cnt);
        msg += "\tTicks begin: " + std::to_string(capture_begin);
        msg += "\tCapturetime: " + std::to_string(time_for_capture) + "s";
        msg += "\tLoop Runtime: " + std::to_string(loopRunTime) + "s";
        std::cout << msg << std::endl;

        // break after 5s
        if (loopRunTime > 5.0) break;

    }   
}

Edit: This is the output:

rows: 1080, cols 1920, fps 30, CPS: 1000000
#: 1    Ticks begin: 362378 Capturetime: 0.055826s  Loop Runtime: 1s
#: 2    Ticks begin: 418543 Capturetime: 0.022631s  Loop Runtime: 1s
#: 3    Ticks begin: 441338 Capturetime: 0.022695s  Loop Runtime: 1s
#: 4    Ticks begin: 464196 Capturetime: 0.023302s  Loop Runtime: 2s
#: 5    Ticks begin: 487659 Capturetime: 0.022729s  Loop Runtime: 2s
#: 6    Ticks begin: 510551 Capturetime: 0.022631s  Loop Runtime: 2s
#: 7    Ticks begin: 533349 Capturetime: 0.022663s  Loop Runtime: 2s
#: 8    Ticks begin: 556176 Capturetime: 0.023194s  Loop Runtime: 3s
#: 9    Ticks begin: 579535 Capturetime: 0.022640s  Loop Runtime: 3s
#: 10   Ticks begin: 602337 Capturetime: 0.023267s  Loop Runtime: 3s
#: 11   Ticks begin: 625789 Capturetime: 0.022741s  Loop Runtime: 3s
#: 12   Ticks begin: 648694 Capturetime: 0.023210s  Loop Runtime: 3s
#: 13   Ticks begin: 672069 Capturetime: 0.022487s  Loop Runtime: 4s
#: 14   Ticks begin: 694721 Capturetime: 0.023162s  Loop Runtime: 4s
#: 15   Ticks begin: 718051 Capturetime: 0.022611s  Loop Runtime: 4s
#: 16   Ticks begin: 740822 Capturetime: 0.023602s  Loop Runtime: 4s
#: 17   Ticks begin: 764600 Capturetime: 0.022555s  Loop Runtime: 5s
#: 18   Ticks begin: 787321 Capturetime: 0.022532s  Loop Runtime: 5s
#: 19   Ticks begin: 810019 Capturetime: 0.022626s  Loop Runtime: 5s
#: 20   Ticks begin: 832813 Capturetime: 0.023161s  Loop Runtime: 5s
#: 21   Ticks begin: 856138 Capturetime: 0.022543s  Loop Runtime: 6s
  • Which camera are you using and how is it connected - CSI/USB? What is that FOUR_CC code you are using and where did you get it from? – Mark Setchell Nov 08 '18 at 10:01
  • @MarkSetchell: I'm using the [Raspberry Camera Module v2](https://www.raspberrypi.org/products/camera-module-v2/). It connects to its own port on the Pi's board. The FOURCC 0x21 apparently is H.264, and I'm using it as a result of another battle I'm fighting with OpenCV's VideoWriter. It took me a while to find out that the file extension needs to match the FOURCC. In search of a mapping of FOURCCs to valid extensions (you don't happen to have one?), I came across this [SO thread](https://stackoverflow.com/questions/8727992/ffmpeg-fourcc-avi-codec-support-list) and the 0x21 FOURCC (see the VideoWriter sketch after these comments). – TonySoprano Nov 08 '18 at 19:53
  • This doesn't make sense. Your frames take 25 ms to acquire, yet you only acquire about 4 per second. You should be able to acquire 40 per second if they take 25 ms each. What is your Raspberry Pi doing for the other 900 ms of each second? It might be informative to output the time in ticks at the start of each frame. – Mark Setchell Nov 09 '18 at 10:37
  • @MarkSetchell: If only I knew that. I also put the ticks in the output, but I'm not getting any more sense out of it. Any ideas? – TonySoprano Nov 09 '18 at 18:24
  • If the last frame starts at 850,000 ticks and the first frame starts at 350,000 ticks, the whole thing takes 500,000 ticks. If you have 1,000,000 CLOCKS_PER_SEC, surely your whole thing only lasts 0.5 seconds? Does it run for 5 seconds? Or 0.5s? – Mark Setchell Nov 09 '18 at 19:39
  • Runs for five seconds according to my phone's stopwatch. – TonySoprano Nov 10 '18 at 14:16
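
Regarding the FOURCC/extension pairing discussed in the comments, here is a minimal cv::VideoWriter sketch for reference. The output.avi filename and the MJPG FOURCC are assumptions chosen because that combination commonly works out of the box; they are not what the question uses.

#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>

int writerSketch(){
    cv::VideoCapture cap(0);
    if (!cap.isOpened()) return 1;

    int cols = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_WIDTH));
    int rows = static_cast<int>(cap.get(cv::CAP_PROP_FRAME_HEIGHT));

    // The container extension (.avi) has to match a codec the installed backend supports.
    cv::VideoWriter writer("output.avi",
                           cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                           30.0, cv::Size(cols, rows));
    if (!writer.isOpened()) return 1;

    cv::Mat frame;
    for (int i = 0; i < 150; ++i) {   // roughly 5 s at 30 fps
        cap >> frame;
        if (frame.empty()) break;
        writer.write(frame);
    }
    return 0;
}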

1 Answer


It seems like you want an H.264 stream from the camera.

OpenCV is unbeatable when you want to do some processing on the camera frames, but I would not necessarily use it for a task that only requires getting an H.264 stream, since the Pi has a hardware H.264 encoder and I am not sure it gets used in your case.

To get H.264 output into a file, or to stream it somewhere, you can either use the standard raspivid application and pipe its output where you want, or, if you need more control, take its source code and modify it.

I got lazy and did the former. The code below streams the H.264 output to a TCP socket. If there is congestion (i.e., the data can't be written to the socket fast enough), the code closes the socket, after which my client reconnects. The approach may not be extremely elegant, but I have been using it for a while and it seems to work OK.

#include "CameraRelay.h"
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <sys/select.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <thread>
#include <mutex>
#include <memory>
#include <stdlib.h>
#include <string>
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>

#define READ 0
#define WRITE 1

// Fork a child running `command` with `args` and return pipes connected to its stdin/stdout.
pid_t popen2(const char *command, char * const args[], int *infp, int *outfp) {
  int p_stdin[2], p_stdout[2];
  pid_t pid;

  if (pipe(p_stdin) != 0 || pipe(p_stdout) != 0)
    return -1;

  pid = fork();

  if (pid < 0)
    return pid;
  else if (pid == 0)
  {
    close(p_stdin[WRITE]);
    dup2(p_stdin[READ], READ);
    close(p_stdout[READ]);
    dup2(p_stdout[WRITE], WRITE);

    execv(command, args);
    perror("execl");
    exit(1);
  }

  if (infp == NULL)
    close(p_stdin[WRITE]);
  else
    *infp = p_stdin[WRITE];

  if (outfp == NULL)
    close(p_stdout[READ]);
  else
    *outfp = p_stdout[READ];

  return pid;
}

namespace CameraRelay {
  static std::unique_ptr<std::thread> s_thread;
  static std::mutex s_startStopMutex;
  static bool s_bRun;
  int s_listensock = -1;

  void RelayThread() {
    signal(SIGPIPE, SIG_IGN);
    s_listensock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s_listensock < 0) {
      perror("socket() for camera relay failed");
      return;
    }

    int on = 1;
    setsockopt(s_listensock, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof(on));
    ioctl(s_listensock, FIONBIO, (char *)&on);

    unsigned int bufsz = 1024; // small buffer to control latency
    setsockopt(s_listensock, SOL_SOCKET, SO_SNDBUF, (void *)&bufsz, sizeof(bufsz));

    struct sockaddr_in sa;
    memset((char *)&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(7124);
    if (bind(s_listensock, (struct sockaddr *)&sa, sizeof(struct sockaddr)) < 0) {
      perror("bind() for camera relay failed");
      return;
    }
    bufsz = 1024; // small buffer to control latency
    setsockopt(s_listensock, SOL_SOCKET, SO_SNDBUF, (void *)&bufsz, sizeof(bufsz));
    if (listen(s_listensock, 1)) {
      perror("listen() for camera relay failed");
      return;
    }
    while (s_bRun) {
      sockaddr_in sa;
      memset((char *)&sa, 0, sizeof(sa));
      socklen_t len = sizeof(sa);
      struct pollfd pfd;
      memset(&pfd, 0, sizeof(pfd));
      pfd.fd = s_listensock;
      pfd.events = POLLIN;
      int ret = poll(&pfd, 1, 500);
      if (ret > 0) {        
        int sock = accept(s_listensock, (struct sockaddr *)&sa, &len);
        unsigned int bufsz = 16384; // small buffer to control latency
        setsockopt(sock, SOL_SOCKET, SO_SNDBUF, (void *)&bufsz, sizeof(bufsz));
        fcntl(sock, F_SETFL, fcntl(sock, F_GETFL, 0) | O_NONBLOCK);
        printf("camera relay socket got new connection.\n");
        char buffer[512];
        std::string cmd = "/usr/bin/raspivid";
        const char *args[] = { "/usr/bin/raspivid", "-o", "-", "-n" , "-s", "-t" ,"0", "-fl", "-hf", "-b", "5000000", "-w", "1280", "-h",  "720", "-fps", "25", nullptr };
        int pin = 0;
        int pout = 0;
        pid_t pid = popen2(cmd.c_str(), const_cast<char **>(args), &pin, &pout);
        if (pid <= 0) {
          perror("popen() failed!");
          break;
        }
        fcntl(pout, F_SETPIPE_SZ, 1024); // small buffer to control latency
        bool bFailed = false;
        int nWait = 0;
        while (s_bRun && !bFailed) {
          struct pollfd pfd2[2];
          memset(&pfd2, 0, sizeof(pfd2));
          pfd2[0].fd = sock;
          pfd2[0].events = POLLIN;
          pfd2[1].fd = pout;
          pfd2[1].events = POLLIN;
          poll(pfd2, 2, 500);

          if (pfd2[0].revents) {
            char rcv;
            if (recv(sock, &rcv, 1, 0)) {
            }
          }

          if (pfd2[1].revents) {
            int nRead = read(pout, buffer, sizeof(buffer));
            if (nRead <= 0) {
              break;
            }
            int nSent = 0;
            while (nSent < nRead) {
              int ret = send(sock, buffer + nSent, nRead - nSent, 0);
              if ((ret < 0 && errno != EWOULDBLOCK) || !s_bRun) {
                perror("camera relay send() failed");
                bFailed = true;
                break;
              }
              if (!nSent && ret == nRead) {
                nWait = 0;
              }
              if (ret > 0)   // a would-block send returns -1; don't let it corrupt the byte count
                nSent += ret;
              if (nSent < nRead) {
                memset(&pfd, 0, sizeof(pfd));
                pfd.fd = sock;
                pfd.events = POLLOUT;                
                poll(&pfd, 1, 500);
                nWait++;
                if (nWait > 2) {
                  bFailed = true;
                  break;
                }
              } 
            }
          }
        }
        if (pout > 0)
          close(pout);
        if (pin > 0)
          close(pin);
        if (pid > 0) {                    
          printf("Killing raspivid child process %d with SIGKILL...\n", pid);
          kill(pid, SIGINT);
          sleep(1);
          kill(pid, SIGKILL);
          int status = 0;
          printf("Waiting for process to end...\n");
          waitpid(pid, &status, 0);
          printf("Killed raspivid child process\n");
        }
        if (sock > 0)
          close(sock);        
      }
    }    
  }

  bool Open() {
    if (s_thread)
      return false;
    system("killall raspivid");
    s_bRun = true;
    s_thread = std::unique_ptr<std::thread>(new std::thread(RelayThread));
    return true;
  }

  void Close() {
     if (!s_thread)
      return;
    s_bRun = false;
    s_thread->join();
  }
} // namespace CameraRelay
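
For completeness, a minimal usage sketch, assuming CameraRelay.h declares Open() and Close() exactly as defined above. A client (for example on another machine) would then connect to TCP port 7124 and read the raw H.264 elementary stream from the socket.

#include "CameraRelay.h"
#include <unistd.h>

int main(){
    if (!CameraRelay::Open())   // spawns the relay thread, which listens on TCP port 7124
        return 1;

    sleep(60);                  // stream for a minute; replace with your own lifetime handling

    CameraRelay::Close();       // signals the thread to stop and joins it
    return 0;
}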
Sami Sallinen
  • That's a pretty interesting approach! I won't be able to work on it during the next two days, but I'll have a go with it afterwards. As for the codec, I'm not bound to H.264; it was simply the only one I was able to get to work with the OpenCV VideoWriter (see comment above). Is there any insight or document about which codecs work well with OpenCV? – TonySoprano Nov 10 '18 at 14:21
  • H.264 is the only compression format that the Pi can both encode and decode in hardware, so if you want to store or stream compressed video data, H.264 is the way to go. – Sami Sallinen Nov 12 '18 at 07:03