Regarding Android's MediaCodec speed concerns and bottlenecks

Question

Currently, encoding a 30 FPS video consisting of 90 frames with MediaCodec takes 2100-2400 ms. I'm using the code found here, except that the generateSurfaceFrame(i) part is replaced with:

private void generateFrame(Bitmap bitmap, Rect source)
{
    long drawFrameStartTime = System.currentTimeMillis();
    Canvas canvas = mInputSurface.lockCanvas(null);
    canvas.drawRect(0, 0, mSquareDim, mSquareDim, clearPaint);
    //Process the canvas below
    try
    {
        canvas.drawBitmap(bitmap, source, new Rect(0, 0, mSquareDim, mSquareDim), antiAliasPaint);
    }
    //Process the canvas above
    catch (Exception e) { Log.e("renderExc", e.toString()); }
    finally { mInputSurface.unlockCanvasAndPost(canvas); }
    long drawFrameEndTime = System.currentTimeMillis();
    Log.i("frame_draw_time", (drawFrameEndTime - drawFrameStartTime) + "");
}

The part that feeds the frames into the MediaMuxer uses code adapted from here, the version using the CircularBuffer class from Grafika. With that code, the muxer had to be released independently from the rest.

I'm still concerned about MediaCodec's other speed bottlenecks, though, and I'm targeting API 18 (minimum) at the moment. My questions are:

Should I start using Asynchronous mode, and how much faster can it be than Synchronous mode?
Is drawing the frames with OpenGL faster than the Surface-Canvas method described above?
Are there other bottlenecks in MediaCodec I should be concerned about?

Full source code will be provided when asked.

score 2 · Answer 1 · edited May 23 '17 at 12:31

@mstorsjo hit the high points. You can find an example of GLES-based video generation in Grafika, e.g. MovieEightRects uses the GeneratedMovie helper class.

If you measure the end-to-end video encoding time you will be measuring both throughput and latency. MediaCodec talks to a separate process (mediaserver) through IPC, which has to allocate hardware resources through an OMX driver. It takes a little time for this to warm up, and there's some amount of latency shoving frames through the codec.

Generating frames faster won't affect the overall encoding speed so long as you're generating as fast as the encoder can encode. The occasional stall when sending data to MediaMuxer will plug up the pipeline, hence the Horizon Camera blog post, so it's reasonable to worry about that (especially if your source drops frames when the encoding pipeline stalls).
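To make the throughput/latency distinction concrete, here is a minimal sketch in plain Java (no Android classes) that splits one end-to-end timing into first-frame latency and steady-state rate. The class name, method names, and timestamps are all hypothetical and made up for illustration; they are not measurements from the question, and in a real app the output timestamps would have to be recorded wherever encoded buffers are drained.

```java
// Hypothetical helper, not part of MediaCodec or Grafika: separates warm-up
// latency from steady-state throughput given the wall-clock time (ms) at which
// each encoded frame was received.
class EncodeTimingSketch {

    // Time from starting the encoder until the first encoded frame came out.
    static long firstFrameLatencyMs(long startMs, long[] outputTimesMs) {
        return outputTimesMs[0] - startMs;
    }

    // Frames per second between the first and last output, excluding warm-up.
    static double steadyStateFps(long[] outputTimesMs) {
        int n = outputTimesMs.length;
        long spanMs = outputTimesMs[n - 1] - outputTimesMs[0];
        return (n - 1) * 1000.0 / spanMs;
    }

    // Frames per second over the whole run, which is what one stopwatch reports.
    static double endToEndFps(long startMs, long[] outputTimesMs) {
        int n = outputTimesMs.length;
        return n * 1000.0 / (outputTimesMs[n - 1] - startMs);
    }

    public static void main(String[] args) {
        // Made-up data: 300 ms of warm-up, then one encoded frame every 20 ms.
        long[] out = new long[90];
        for (int i = 0; i < out.length; i++) {
            out[i] = 300 + 20L * i;
        }
        System.out.println("first-frame latency (ms): " + firstFrameLatencyMs(0, out));
        System.out.println("steady-state fps: " + steadyStateFps(out));
        System.out.println("end-to-end fps: " + endToEndFps(0, out));
    }
}
```

With a short clip the warm-up cost is spread over few frames, so the end-to-end rate comes out lower than the steady-state rate; the longer the clip, the closer the two numbers get.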

answered Mar 03 '16 at 17:49

fadden

I reduced the TIMEOUT_USEC in the drainEncoder function to 1000 after your answer in the third link regarding the codec latency, and now it takes 1000-1400 ms. Why would dequeueOutputBuffer need a timeoutUs? Is it out of worry some frames would be dropped? And thanks for the prompt reply. – Gensoukyou1337 Mar 04 '16 at 03:20
I believe the issues with timeouts motivated the introduction of asynchronous mode. When single-threaded, you either want to wait forever (because the input queue is full and you're just waiting for output to arrive) or not wait at all (because the input queue isn't full, but it's important to drain the output quickly). The movie-generation code works like this because it's creating content, encoding it, and draining it all on a single thread. If you look at CircularEncoder you'll see it uses a timeout of zero, because the threads are set up differently. – fadden Mar 04 '16 at 05:08
Indeed, if decreasing the timeout makes it run faster, it sounds like a case where the synchronous mode isn't used optimally. Ideally, one would use a zero timeout for both input and output, and once neither end returns any free buffers, do a blocking wait with a nonzero timeout for the output. With Surface input, though, one doesn't call dequeueInputBuffer manually, so one can't know quite as easily whether to proceed to feed another input frame or wait for output. In this case, I guess the asynchronous mode has some clear benefits. – mstorsjo Mar 04 '16 at 07:19
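
The polling order described in these comments (feed without blocking, drain without blocking, block on output only when neither side makes progress) can be sketched in plain Java. This is a model of the control flow only, under stated assumptions: the Encoder interface, FakeEncoder, and their methods are hypothetical stand-ins invented for this sketch, not the MediaCodec API, and the fake assumes buffer-style input where a full input side is observable (which, as noted above, is not the case with Surface input).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the polling strategy; none of these types are Android APIs.
class DrainLoopSketch {

    // Hypothetical stand-in for an encoder with buffer input.
    interface Encoder {
        boolean offerInput(int frame);      // false when the input side is full
        Integer pollOutput(long timeoutUs); // null when nothing arrived in time
    }

    // Deterministic fake: holds at most `capacity` frames, and a zero-timeout
    // poll only succeeds while more than `latency` frames are in flight.
    static class FakeEncoder implements Encoder {
        private final ArrayDeque<Integer> inFlight = new ArrayDeque<>();
        private final int capacity;
        private final int latency;
        int blockingPolls = 0; // how often the caller had to block

        FakeEncoder(int capacity, int latency) {
            this.capacity = capacity;
            this.latency = latency;
        }

        @Override
        public boolean offerInput(int frame) {
            if (inFlight.size() >= capacity) return false;
            inFlight.add(frame);
            return true;
        }

        @Override
        public Integer pollOutput(long timeoutUs) {
            if (timeoutUs > 0) {
                blockingPolls++;
                return inFlight.poll(); // models waiting until the oldest frame is done
            }
            return inFlight.size() > latency ? inFlight.poll() : null;
        }
    }

    static List<Integer> encodeAll(Encoder enc, int frameCount) {
        List<Integer> encoded = new ArrayList<>();
        int next = 0;
        while (encoded.size() < frameCount) {
            // 1. Try to feed one frame without blocking.
            boolean fed = next < frameCount && enc.offerInput(next);
            if (fed) next++;

            // 2. Drain everything that is already available, still without blocking.
            boolean drained = false;
            Integer out;
            while ((out = enc.pollOutput(0)) != null) {
                encoded.add(out);
                drained = true;
            }

            // 3. Only when neither side made progress, block on the output.
            if (!fed && !drained) {
                out = enc.pollOutput(10_000);
                if (out != null) encoded.add(out);
            }
        }
        return encoded;
    }

    public static void main(String[] args) {
        FakeEncoder fake = new FakeEncoder(3, 2);
        System.out.println(encodeAll(fake, 10));
        System.out.println("blocking polls: " + fake.blockingPolls);
    }
}
```

In this model the loop blocks only at the end of the stream, when the remaining in-flight frames have to be flushed; while input is still being accepted it never waits on output.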

score 1 · Answer 2 · answered Mar 03 '16 at 14:08

Asynchronous mode in itself isn't faster than synchronous mode (used correctly), but the asynchronous interface makes it clearer how it is supposed to be used. In particular, whichever interface you're using, do not wait for an output buffer directly after passing one buffer/surface to the encoder. You can (and should) check for output, but don't block waiting for it; proceed to provide the next input buffer/surface instead (as long as the input doesn't block).
Rendering with OpenGL instead of Canvas should most probably be faster. See e.g. Android Graphics architecture, which says:

Note in particular that while the Canvas provided to a View's onDraw() method may be hardware-accelerated, the Canvas obtained when an app locks a Surface directly with lockCanvas() never is.

Thanks for the prompt reply. I myself am not experienced in OpenGL - the only experience I have is overlaying video frames with a bitmap by drawing a transparent textured rectangle over it. If I were to edit the video frames with gifs and text, should I do the same thing? — Gensoukyou1337, Mar 04 '16 at 03:24
By the way, how many milliseconds is the difference between drawing a 640x640 bitmap on a Canvas and rendering a 640x640 bitmap as a GL Texture on the Surface? — Gensoukyou1337, Mar 04 '16 at 07:38
