I'm trying to get started with the VP8 library, I'm not building it in the standard way they tell you to, I just loaded all of the main files and the "encoder" folder into a new Visual Studio C++ DLL project, and just included the C files in an extern "C" dll export function, which so far builds fine etc., I just have no idea where to start with the C++ API to encode, say, 3 frames of ARGB data into a very basic video, just to get started
The only example I could find is in the examples folder called simple_encoder.c
, although their premise is that they are loading in another file already and parsing its frames then converting it, so it seems a bit complicated, I just want to be able to pass in a byte array of a few ARGB frames and have it output a very simple VP8 video
I've seen How to encode series of images into VP8 using WebM VP8 Encoder API? (C/C++) but the accepted answer just links to the build instructions and references the general specification of the vp8 format, the closest I could find there is the example encoding parameters but I just want to do everything from C++ and I can't seem to find any other examples, besides for the default one simple_encoder.c
?
Just to cite some of the relevant parts I think I understand, but still need more help on
//in int main...
...
vpx_image_t raw;
if (!vpx_img_alloc(&raw, VPX_IMG_FMT_I420, info.frame_width,
info.frame_height, 1)) {
//"Failed to allocate image." error
}
So that part I think I understand for the most part, VPX_IMG_FMT_I420 is the only part that's not made in this file itself, but its in vpx_image.h, first as
#define VPX_IMG_FMT_PLANAR
//then after...
typedef enum vpx_img_fmt {
VPX_IMG_FMT_NONE,
VPX_IMG_FMT_RGB24, /**< 24 bit per pixel packed RGB */
///some other formats....
VPX_IMG_FMT_ARGB, /**< 32 bit packed ARGB, alpha=255 */
VPX_IMG_FMT_YV12 = VPX_IMG_FMT_PLANAR | VPX_IMG_FMT_UV_FLIP | 1, /**< planar YVU */
VPX_IMG_FMT_I420 = VPX_IMG_FMT_PLANAR | 2,
} vpx_img_fmt_t; /**< alias for enum vpx_img_fmt */
So I guess part of my question is answered already just from writing this, that one of the formats is VPX_IMG_FMT_ARGB, although I don't where where it's defined, but I'm guessing in the above code I would replace it with
const VpxInterface *encoder = get_vpx_encoder_by_name("v8");
vpx_image_t raw;
VpxVideoInfo info = { 0, 0, 0, { 0, 0 } };
info.frame_width = 1920;
info.frame_height = 1080;
info.codec_fourcc = encoder->fourcc;
info.time_base.numerator = 1;
info.time_base.denominator = 24;
bool didIt = vpx_img_alloc(&raw, VPX_IMG_FMT_ARGB,
info.frame_width, info.frame_height/*example width and height*/, 1)
//check didIt..
vpx_codec_enc_cfg_t cfg;
vpx_codec_ctx_t codec;
vpx_codec_err_t res;
res = vpx_codec_enc_config_default(encoder->codec_interface(), &cfg, 0);
//check if !res for error
cfg.g_w = info.frame_width;
cfg.g_h = info.frame_height;
cfg.g_timebase.num = info.time_base.numerator;
cfg.g_timebase.den = info.time_base.denominator;
cfg.rc_target_bitrate = 200;
VpxVideoWriter *writer = NULL;
writer = vpx_video_writer_open(outfile_arg, kContainerIVF, &info);
//check if !writer for error
bool startIt = vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0);
//not even sure where codec was set actually..
//check !startIt for error starting
//now the next part in the original is where it reads from the input file, but instead
//I need to pass in an array of some ARGB byte arrays..
//thing is, in the next step they use a while loop for
//vpx_img_read(&raw, fopen("path/to/YV12formatVideo", "rb"))
//to set the contents of the raw vpx image allocated earlier, then
//they call another program that writes it to the writer object,
//but I don't know how to read the actual ARGB data directly into the raw image
//without using fopen, so that's one question (review at end)
//so I'll just put a placeholder here for the **question**
//assuming I have an array of byte arrays stored individually
//for simplicity sake
int size = 1920 * 1080 * 4;
uint8_t imgOne[size] = {/*some big byte array*/};
uint8_t imgTwo[size] = {/*some big byte array*/};
uint8_t imgThree[size] = {/*some big byte array*/};
uint8_t *images[] = {imgOne, imgTwo, imgThree};
int framesDone = 0;
int maxFrames = 3;
//so now I can replace the while loop with a filler function
//until I find out how to set the raw image with ARGB data
while(framesDone < maxFrames) {
magicalFunctionToSetARGBOfRawImage(&raw, images[framesDone]);
encode_frame(&codec, &raw, framesDone, 0, writer);
framesDone++;
}
//now apparently it needs to be flushed after
while(encode_frame(&codec, 0, -1, 0, writer)){}
vpx_img_free(&raw);
bool isDestroyed = vpx_codec_destroy(&codec);
//check if !isDestroyed for error
//now we gotta define the encode_Frames function, but simpler
//(and make it above other function for reference purposes
//or in header
static int encode_frame(
vpx_codex_ctx_t *coydek,
vpx_image_t pic,
int currentFrame,
int flags,
VpxVideoWriter *koysayv/*writer*/
) {
//now to substitute their encodeFrame function for
//the actual raw calls to simplify things
const DidIt = vpx_codec_encode(
coydek,
pic,
currentFrame,
1,//duration I think
flags,//whatever that is
VPX_DL_REALTIME//different than simlpe_encoder
);
if(!DidIt) return;//error here
vpx_codec_iter_t iter = 0;
const vpx_codec_cx_pkt_t *pkt = 0;
int gotThings = 0;
while(
(pkt = vpx_codec_get_cx_data(
coydek,
&iter
)) != 0
) {
gotThings = 1;
if(
pkt->kind
== VPX_CODEC_CX_FRAME_PKT //don't exactly
//understand this part
) {
const
int
keyframe = (
pkt
->
data
.frame
.flags
&
VPX_FRAME_IS_KEY
) != 0; //don'texactly understand the
//& operator here or how it gets the keyframe
bool wroteFrame = vpx_video_writer_write_frame(
koysayv,
pkt->data.frame.buf
//I'm guessing this is the encoded
//frame data
,
pkt->data.frame.sz,
pkt->data.frame.pts
);
if(!wroteFrame) return; //error
}
}
return gotThings;
}
Thing is though, I don't know how to actually read the
ARGB data into the RAW image buffer itself, as mentioned
above, in the original example, they use
vpx_img_read(&raw, fopen("path/to/file", "rb"))
but if I'm starting off with the byte arrays themselves
then what function do I use for that instead of the file?
I have a feeling it can be solved by the source code for the vpx_img_read found in tools_common.c function:
int vpx_img_read(vpx_image_t *img, FILE *file) {
int plane;
for (plane = 0; plane < 3; ++plane) {
unsigned char *buf = img->planes[plane];
const int stride = img->stride[plane];
const int w = vpx_img_plane_width(img, plane) *
((img->fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? 2 : 1);
const int h = vpx_img_plane_height(img, plane);
int y;
for (y = 0; y < h; ++y) {
if (fread(buf, 1, w, file) != (size_t)w) return 0;
buf += stride;
}
}
return 1;
}
although I personally am not experienced enough to necessarily know how to get a single frames ARGB data in, I think the key part is fread(buf, 1, w, file)
which seems to read parts of file
into buf
which represents img->planes[plane];
, which I think then by reading into buf that automatically reads into img->planes[plane];
, but I'm not sure if that is the case, and also not sure how to replace the fread from file to just take in a bye array that is alreasy loaded into memory...