I am currently writing a decoder for an H.264 video stream. The target platform is Android, so I am using the MediaCodec API (Android OS >= 6.0).
I've tested the same code on 4 devices:
- It works nicely on the Xiaomi Redmi 5 Plus (it's actually quite fast there).
- It is painfully slow on the Nexus 7 and the Samsung Galaxy Tab A.
- It fails on the Samsung Galaxy Tab S2 with the mysterious error code -10000 from AMediaCodec_dequeueOutputBuffer (configure and start both return AMEDIA_OK).
So my questions are:
- Can I optimize it somehow? I timed each MediaCodec API call, and AMediaCodec_dequeueOutputBuffer looks like a huge bottleneck (80%-90% of the time spent on each frame).
- Is there anything I can do about this -10000 error on the Galaxy Tab S2? I read the MediaCodec docs and it's not described there. I only found, in VLC's sources (modules/codec/omxil/mediacodec_ndk.c), that const AMEDIA_ERROR_UNKNOWN = -10000 (question 2.b: where did they find this constant?).
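While debugging, it may help to turn the raw return value of dequeueOutputBuffer into a name. This is a minimal sketch: the numeric values are local copies that I assume match the NDK headers (NdkMediaCodec.h for the info codes, NdkMediaError.h for the error base), and describe_dequeue_result is a hypothetical helper, not an NDK function.

```c
#include <string.h>

/* Local copies of the NDK values (assumed to match NdkMediaCodec.h and
 * NdkMediaError.h); a real build should include those headers instead. */
enum {
    SKETCH_INFO_TRY_AGAIN_LATER        = -1,
    SKETCH_INFO_OUTPUT_FORMAT_CHANGED  = -2,
    SKETCH_INFO_OUTPUT_BUFFERS_CHANGED = -3,
    SKETCH_ERROR_UNKNOWN               = -10000
};

/* Hypothetical helper: name the value returned by dequeueOutputBuffer. */
static const char* describe_dequeue_result(long idx)
{
    if (idx >= 0) {
        return "output buffer index";
    }
    switch (idx) {
    case SKETCH_INFO_TRY_AGAIN_LATER:        return "INFO_TRY_AGAIN_LATER";
    case SKETCH_INFO_OUTPUT_FORMAT_CHANGED:  return "INFO_OUTPUT_FORMAT_CHANGED";
    case SKETCH_INFO_OUTPUT_BUFFERS_CHANGED: return "INFO_OUTPUT_BUFFERS_CHANGED";
    case SKETCH_ERROR_UNKNOWN:               return "AMEDIA_ERROR_UNKNOWN";
    default:                                 return "other negative code";
    }
}
```

Logging describe_dequeue_result(oBufIdx) next to the raw number makes it easier to tell an info code from a real error.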
Device specifications (decoders from /etc/media_codecs.xml):
- Xiaomi Redmi 5 Plus: Android 7.1.2, "video/avc" decoders: OMX.qcom.video.decoder.avc, OMX.qcom.video.decoder.avc.secure
- Nexus 7 (tablet): Android 6.0.1, "video/avc" decoders: OMX.qcom.video.decoder.avc, OMX.qcom.video.decoder.avc.secure
- Samsung Tab A: Android 7.1.1, "video/avc" decoders: OMX.qcom.video.decoder.avc, OMX.qcom.video.decoder.avc.secure, OMX.SEC.avc.sw.dec
- Samsung Tab S2: Android 7.0, "video/avc" decoders: OMX.Exynos.avc.dec, OMX.Exynos.avc.dec.secure, OMX.SEC.avc.sw.dec
I can see that all the devices where the code runs properly (even if slowly) have a Qualcomm decoder in common.
My code:
// Initialization (error checks omitted; every call below succeeds).
// f contains pointers to functions from libmediandk.so
const char mime[] = "video/avc";
mDecoder = f.createDecoderByType(mime);
AMediaFormat* mFormat = f.createMediaFormat();
const int colorFormat = 19; // COLOR_FormatYUV420Planar
f.setString(mFormat, c.keyMime, mime);
f.setInt32(mFormat, c.keyWidth, width);
f.setInt32(mFormat, c.keyHeight, height);
f.setInt32(mFormat, c.keyColorFormat, colorFormat);
f.setInt32(mFormat, "encoder", 0);
f.setInt32(mFormat, "max-input-size", 0);
// Both sps and pps are extracted from the stream.
// sizeof() gives the parameter-set length only if sps and pps are arrays, not pointers.
f.setBuffer(mFormat, "csd-0", sps, sizeof(sps));
f.setBuffer(mFormat, "csd-1", pps, sizeof(pps));
media_status_t status = f.configure(mDecoder, mFormat, NULL, NULL, 0);
status = f.start(mDecoder);
f.deleteMediaFormat(mFormat);
lastOutputBufferIdx = -1;
// This is executed for every frame.
// data -> char* with this frame's H.264 encoded data (dataSize bytes)
// Error checks omitted for clarity.
const int TIMEOUT_US = -1; // -1 -> blocking mode
AMediaCodecBufferInfo info;
size_t bufsize = 0;
char* buf = NULL;
// Give the previous frame's output buffer back to the codec.
if (lastOutputBufferIdx != -1) {
    f.releaseOutputBuffer(mDecoder, lastOutputBufferIdx, false);
    lastOutputBufferIdx = -1;
}
ssize_t iBufIdx = f.dequeueInputBuffer(mDecoder, TIMEOUT_US);
if (iBufIdx >= 0) {
    buf = f.getInputBuffer(mDecoder, iBufIdx, &bufsize);
    int usedBufSize = 0;
    // Copy only if the frame fits into the codec's input buffer.
    if (buf && (size_t)dataSize <= bufsize) {
        usedBufSize = dataSize;
        memcpy(buf, data, usedBufSize);
    }
    media_status_t res = f.queueInputBuffer(mDecoder, iBufIdx, 0, usedBufSize, getTimestamp(), 0);
}
// Here's my nemesis (this line is both the bottleneck and the source of -10000):
ssize_t oBufIdx = f.dequeueOutputBuffer(mDecoder, &info, TIMEOUT_US);
// I am not interested in processing the info codes -1, -2, -3
// (INFO_TRY_AGAIN_LATER, INFO_OUTPUT_FORMAT_CHANGED, INFO_OUTPUT_BUFFERS_CHANGED).
while (oBufIdx == -1 || oBufIdx == -2 || oBufIdx == -3) {
    oBufIdx = f.dequeueOutputBuffer(mDecoder, &info, TIMEOUT_US);
}
if (oBufIdx >= 0) {
    buf = f.getOutputBuffer(mDecoder, oBufIdx, &bufsize);
    AMediaFormat* format = f.getOutputFormat(mDecoder);
    f.getInt32(format, "width", &width);
    f.getInt32(format, "height", &height);
    f.deleteMediaFormat(format);
    // yuv_ is the struct returned by my function.
    // The layout below assumes stride == width and no padding between planes.
    yuv_.data = buf + info.offset; // info.offset is applied exactly once
    yuv_.size = info.size;         // number of valid bytes in this buffer
    yuv_.width = width;
    yuv_.height = height;
    yuv_.yPlane = yuv_.data;
    yuv_.uPlane = yuv_.yPlane + height * width;
    yuv_.vPlane = yuv_.uPlane + (height * width) / 4;
    yuv_.yStride = width;
    yuv_.uStride = width / 2;
    yuv_.vStride = width / 2;
    // Remember the index so the buffer is released on the next call.
    lastOutputBufferIdx = oBufIdx;
}
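The plane pointers above assume the decoder packs the planes tightly, with a row length equal to width. Some decoders pad rows and plane heights, so here is a hedged sketch of the same arithmetic driven by a stride and a slice height. I am assuming these would come from the "stride" and "slice-height" entries of the output format when the decoder reports them (falling back to width and height otherwise), and that the output really is planar I420. I420View and i420_view are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-only view of one decoded I420 frame. */
typedef struct {
    const uint8_t* yPlane;
    const uint8_t* uPlane;
    const uint8_t* vPlane;
    size_t yStride;
    size_t cStride; /* stride shared by the U and V planes */
} I420View;

/* Compute plane pointers for a planar YUV 4:2:0 buffer.
 * offset      - info.offset of the dequeued buffer, applied once
 * stride      - bytes per luma row (>= width)
 * sliceHeight - luma rows stored per plane (>= height) */
static I420View i420_view(const uint8_t* buf, size_t offset,
                          size_t stride, size_t sliceHeight)
{
    I420View v;
    const size_t chromaStride = (stride + 1) / 2;      /* half-width rows */
    const size_t chromaRows   = (sliceHeight + 1) / 2; /* half-height planes */

    v.yPlane  = buf + offset;
    v.uPlane  = v.yPlane + stride * sliceHeight;
    v.vPlane  = v.uPlane + chromaStride * chromaRows;
    v.yStride = stride;
    v.cStride = chromaStride;
    return v;
}
```

With stride == width and sliceHeight == height this reduces to the layout used in my code, so it can be dropped in without changing behaviour on devices that do not pad.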
I've seen that MediaCodec can be run in asynchronous mode (which could be a bit faster), but I am not sure I can use it, since I am decoding a live video stream rather than an .mp4 file from local storage. What I mean is that there is (probably) no way to run the decoding in parallel.
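One synchronous alternative I could try is to stop waiting for each frame's own output: queue the input, then collect whatever output is ready using a timeout of 0. Below is a minimal sketch of such a drain loop. The function-pointer types are hypothetical stand-ins for dequeueOutputBuffer and releaseOutputBuffer (the real calls take more arguments), and FakeCodec exists only to exercise the loop; whether this actually helps on the slow devices is untested.

```c
#include <stddef.h>

enum {
    TRY_AGAIN_LATER        = -1,
    OUTPUT_FORMAT_CHANGED  = -2,
    OUTPUT_BUFFERS_CHANGED = -3
};

/* Hypothetical stand-ins for the two codec calls used while draining. */
typedef long (*dequeue_output_fn)(void* codec, long timeoutUs);
typedef void (*release_output_fn)(void* codec, long index);

/* Drain every output buffer that is ready right now, without blocking.
 * Returns the number of frames drained, or the negative error code. */
static long drain_ready_outputs(void* codec, dequeue_output_fn dequeue,
                                release_output_fn release)
{
    long frames = 0;
    for (;;) {
        long idx = dequeue(codec, 0); /* timeout 0: return immediately */
        if (idx >= 0) {
            /* A real decoder would read the frame here before releasing. */
            release(codec, idx);
            ++frames;
        } else if (idx == TRY_AGAIN_LATER) {
            return frames;            /* nothing more is ready yet */
        } else if (idx == OUTPUT_FORMAT_CHANGED ||
                   idx == OUTPUT_BUFFERS_CHANGED) {
            continue;                 /* informational, ask again */
        } else {
            return idx;               /* real error, e.g. -10000 */
        }
    }
}

/* Scripted fake codec, used only to exercise the loop. */
typedef struct {
    const long* script; /* values returned by successive dequeue calls */
    size_t len;
    size_t pos;
    long released;      /* how many buffers were released */
} FakeCodec;

static long fake_dequeue(void* codec, long timeoutUs)
{
    FakeCodec* fc = (FakeCodec*)codec;
    (void)timeoutUs;
    return fc->pos < fc->len ? fc->script[fc->pos++] : TRY_AGAIN_LATER;
}

static void fake_release(void* codec, long index)
{
    (void)index;
    ((FakeCodec*)codec)->released++;
}
```

Calling drain_ready_outputs once per incoming frame would let the decoder keep several frames in flight, at the cost of the output lagging the input by a few frames.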