
I'm currently trying to implement a compression algorithm (frame prediction) for an assignment. I am not looking for thumbnail files, or even just a shell command to generate something for me. My problem is specifically integrating it with a Go program.

I just started and I'm already stuck. I'm supposed to extract each frame from a video, divide the frames into I, P and B frames, perform intra-coding (compressing within the frame itself), then perform inter-coding (prediction between frames).

Right now I can't even get started on the above problems, because I have no idea how to read the video as something I can use in code. The only library I can think of is FFmpeg. FFmpeg can extract separate frames, apparently even I, P and B frames:

ffmpeg -i <inputfile> -vf '[in]select=eq(pict_type\,B)[out]' b.frames.mp4

But this just produces another video output, which I don't know how to open either. What I was thinking of was outputting frames as bitmaps(?), then reading each bitmap separately to reconstruct three 3D matrices: one of I frames, one of P frames and one of B frames. However, this seems like quite a feat. Someone, somewhere has definitely tried to parse a video into a 3D matrix and has found a better solution than what I'm thinking of.

To be concise: I have a video, and I need a 3D matrix. The 3D matrix is a matrix of 2D matrices, each of which represents one frame of the video. Each point in the 3D matrix is a pixel (or whatever the equivalent is in videos).


Nephilim
  • Use this to break the video into images: https://stackoverflow.com/questions/34786669/extract-all-video-frames-as-images-with-ffmpeg and then use this to make the matrix: https://stackoverflow.com/questions/33186783/get-a-pixel-array-from-from-golang-image-image Make a slice of matrices and it should be what you want. – Mikael Jun 19 '19 at 17:20
  • @Mikael so a theoretical X amounts of jpegs, created, X amount of readers then reading each jpeg, then piecing them together. This sounds very expensive, but I'll look into it, I'm getting desperate here. – Nephilim Jun 19 '19 at 21:02
  • As mentioned below, an option would be to make ffmpeg output YUV or RGB, so that you can then pipe the output and read it in your application. Some resources that might help you: [1](https://stackoverflow.com/a/42989472/1633924), [2](http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/), [3](https://batchloaf.wordpress.com/2017/02/12/a-simple-way-to-read-and-write-audio-and-video-files-in-c-using-ffmpeg-part-2-video/). – mcont Jun 19 '19 at 21:20

1 Answer


I/P/B frames only exist in the encoded bitstream. Once the video is decoded, the frame types are gone: every frame is a full raw image. You probably want to use ffmpeg to decode to something like yuv4mpegpipe, then parse that output in your Go program.

szatmary
  • Mind explaining what yuv4mpegpipe does and why it would help me? All I'm finding is docs meant for people who know what they're looking for. Reminder, literally all I'm looking for is a way to get every pixel (luminance value, since it's grayscale video) without dealing with metadata. – Nephilim Jun 19 '19 at 21:01
  • It is a method for encoding raw YUV data into a stream of frames. https://wiki.multimedia.cx/index.php/YUV4MPEG2 – szatmary Jun 19 '19 at 21:42