I'm trying to understand how the surface-to-surface approach works with MediaCodec. In a ByteBuffer-only approach, decoded data is placed in OutputBuffers. This non-encoded data can be processed manually and then passed to the InputBuffers of an Encoder.
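For reference, that ByteBuffer-only hand-off might look roughly like this (a simplified sketch: codec setup is omitted, TIMEOUT_US is a placeholder, and real code would have to make sure the two codecs agree on a color format):

    MediaCodec.BufferInfo decInfo = new MediaCodec.BufferInfo();
    int decStatus = decoder.dequeueOutputBuffer(decInfo, TIMEOUT_US);
    if (decStatus >= 0) {
        // Raw YUV frame pulled from the decoder's OutputBuffers.
        ByteBuffer decodedFrame = decoder.getOutputBuffer(decStatus);
        int encIndex = encoder.dequeueInputBuffer(TIMEOUT_US);
        if (encIndex >= 0) {
            ByteBuffer encInput = encoder.getInputBuffer(encIndex);
            encInput.clear();
            encInput.put(decodedFrame); // manual processing would happen here
            encoder.queueInputBuffer(encIndex, 0, decInfo.size,
                    decInfo.presentationTimeUs, decInfo.flags);
        }
        decoder.releaseOutputBuffer(decStatus, false /* no Surface to render to */);
    }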
If we look at an example from the Android MediaCodec CTS that uses a surface-to-surface approach to pass data between a decoder and an encoder, we configure the Decoder to output its decoded data onto a Surface called outputSurface, and we configure the Encoder to receive data on a Surface called inputSurface.
In the documentation, createInputSurface and the use of this surface in the Encoder's configuration are described as follows:
createInputSurface(): Requests a Surface to use as the input to an encoder, in place of input buffers.
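Concretely, the wiring in the CTS example looks roughly like this (a sketch: OutputSurface and InputSurface are helper classes from the CTS test code, wrapping a SurfaceTexture and an EGL window surface respectively; "video/avc" and the format variables are placeholders):

    // Encoder side: createInputSurface() must be called after configure()
    // and before start().
    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(encoderFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    InputSurface inputSurface = new InputSurface(encoder.createInputSurface());
    inputSurface.makeCurrent(); // EGL context used by the drawImage() calls later
    encoder.start();

    // Decoder side: decoded frames are rendered onto the
    // SurfaceTexture-backed Surface instead of being read back on the CPU.
    OutputSurface outputSurface = new OutputSurface();
    MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
    decoder.configure(decoderFormat, outputSurface.getSurface(), null, 0);
    decoder.start();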
In other words, and this is visible in the CTS example's ByteBuffer declarations: there are simply no InputBuffers for the Encoder. You have:
- DecoderInputBuffers (receive the video track samples from the MediaExtractor)
- DecoderOutputBuffers (output to pull decoded yuv frames)
- EncoderInputBuffers: nothing. (Well... the input Surface.)
- EncoderOutputBuffers (output to pull the re-encoded data to pass to a muxer; a drain sketch follows below)
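For completeness, draining those EncoderOutputBuffers into a MediaMuxer looks roughly like this (another sketch; it assumes muxer was already created for MP4 output and that trackIndex is kept across loop iterations):

    MediaCodec.BufferInfo encInfo = new MediaCodec.BufferInfo();
    int encStatus = encoder.dequeueOutputBuffer(encInfo, TIMEOUT_US);
    if (encStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        // The muxer track can only be added once the encoder reports its
        // final output format (including the codec-specific data).
        trackIndex = muxer.addTrack(encoder.getOutputFormat());
        muxer.start();
    } else if (encStatus >= 0) {
        ByteBuffer encodedData = encoder.getOutputBuffer(encStatus);
        boolean isConfig = (encInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0;
        if (!isConfig && encInfo.size > 0) {
            muxer.writeSampleData(trackIndex, encodedData, encInfo);
        }
        encoder.releaseOutputBuffer(encStatus, false);
    }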
Instead of enqueuing data into the Encoder's InputBuffers, you have these lines of code:
    outputSurface.awaitNewImage(); // block until the decoded frame arrives on the SurfaceTexture
    outputSurface.drawImage();     // draw that texture with GLES onto the current EGL surface
    inputSurface.setPresentationTime(videoDecoderOutputBufferInfo.presentationTimeUs * 1000); // µs -> ns
    inputSurface.swapBuffers();    // eglSwapBuffers() submits the rendered frame to the encoder
How is the outputSurface content of the Decoder passed to the inputSurface of the Encoder? What is concretely happening behind the scenes?