
I am looking for an example of decoding video on Raspberry Pi directly, without using OpenMAX.

This explains the different layers of multimedia software:

(Image: Raspberry Pi Architecture)

There is an additional layer not shown here: the "MMAL" layer, which is (I believe) a Broadcom wrapper around OpenMAX. (If not, it would be an OpenMAX alternative, sitting on top of the kernel driver.) raspivid and raspistill, for example, are written using MMAL.

I want an example of video decode where the input is raw H.264 and the output is either video in memory or video on screen. I want to do this using VCHIQ directly, not through OpenMAX, mainly for performance and flexibility reasons.

This GitHub repository: https://github.com/raspberrypi/userland/ contains the source for everything shown above (the orange and green boxes): the source for VCHIQ itself, the OpenMAX IL implementation on top of VCHIQ, and the OpenGL and EGL implementations, among others. So in theory it should be enough to get started. The problem is that it is highly non-obvious how to use it, even for someone very familiar with OpenMAX and with multimedia frameworks in general.

For example: vchiq_bulk_transmit() seems to be the function one would use to send video to the decoder. But how does one initialize its first argument, of type VCHIQ_SERVICE_HANDLE_T? And where do the results go: into the framebuffer, into a result handle, or somewhere else?
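For reference, vchiq_bulk_transmit is declared roughly as follows in the vchiq_if.h header (the exact spelling varies between kernel and userland trees, so treat this as an approximation):

VCHIQ_STATUS_T vchiq_bulk_transmit(VCHIQ_SERVICE_HANDLE_T service,
    const void *data, unsigned int size, void *userdata,
    VCHIQ_BULK_MODE_T mode);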

EDIT: The bounty can be collected by providing a working example of video decode using VCHIQ, an API walkthrough that shows the calling sequence (even if not a working example), or a pointer to documentation sufficient to write one. A working example will get a hefty extra bounty :)

Alex I
  • Is there a particular reason for not using OpenMAX? – drahnr Dec 13 '13 at 14:05
  • @drahnr: I want an API in which I get my decoded data immediately. OpenMAX IL has a bunch of buffers, and there are no particular constraints on what an implementation may do with them: it may potentially buffer multiple frames, and there is no way in the API to control that. I've seen implementations that are quite, quite slow to return data (throughput is still high; the data is just delayed). ... I suppose if anyone wanted to have a shot at an answer that shows how to get decoded frames back using OpenMAX in less than 1/60th of a second on the RPi, that would be fine too :) – Alex I Dec 13 '13 at 19:57
  • This is probably not what you want to hear, but implementing an H.264 decoder is very challenging, and I don't know a single soul who would code that for free (even if you had placed a 500-rep bounty). – karlphillip Dec 17 '13 at 02:47
  • @karlphillip: There is already a decoder. This is just a matter of calling a few functions in the VCHIQ API to set it up, then pass it data and get the results. It is many orders of magnitude simpler than implementing a decoder. Thank you for looking! – Alex I Dec 17 '13 at 07:31
  • In theory the VCHIQ DMA could be avoided to remove some latency; however, except at very low resolutions I doubt the RPi ARM could consume the data at a sustained 60 fps. It would be helpful to know what you are trying to do. Most of the high-performance use cases for multimedia tend to involve configuring a pipeline between GPU hardware blocks with as few copies and format conversions as possible. – Tim Gover Jan 01 '14 at 13:22

2 Answers


I don't have a working example, but I have an API walkthrough. Sort of...

Link to the full source code

I found the following function, which demonstrates how you can call vchiq_bulk_transmit:

int32_t vchi_bulk_queue_transmit(VCHI_SERVICE_HANDLE_T handle,
    void *data_src,
    uint32_t data_size,
    VCHI_FLAGS_T flags,
    void *bulk_handle)
{
    SHIM_SERVICE_T *service = (SHIM_SERVICE_T *)handle;
    ..
    /* in the elided code, the VCHI flags are translated into a
       VCHIQ_BULK_MODE_T value named mode, which is what is
       passed down to vchiq_bulk_transmit */
    status = vchiq_bulk_transmit(service->handle, data_src,
        data_size, bulk_handle, mode);
    ..
    return vchiq_status_to_vchi(status);
}
EXPORT_SYMBOL(vchi_bulk_queue_transmit);

There is a function to create a VCHI_SERVICE_HANDLE_T:

int32_t vchi_service_create(VCHI_INSTANCE_T instance_handle,
    SERVICE_CREATION_T *setup,
    VCHI_SERVICE_HANDLE_T *handle)
{
    VCHIQ_INSTANCE_T instance = (VCHIQ_INSTANCE_T)instance_handle;
    SHIM_SERVICE_T *service = service_alloc(instance, setup);

    *handle = (VCHI_SERVICE_HANDLE_T)service;
    ..
    return (service != NULL) ? 0 : -1;
}
EXPORT_SYMBOL(vchi_service_create);

But you need a VCHI_INSTANCE_T, which can be initialized here:

int32_t vchi_initialise(VCHI_INSTANCE_T *instance_handle)
{
    VCHIQ_INSTANCE_T instance;
    VCHIQ_STATUS_T status;

    status = vchiq_initialise(&instance);

    *instance_handle = (VCHI_INSTANCE_T)instance;

    return vchiq_status_to_vchi(status);
}
EXPORT_SYMBOL(vchi_initialise);
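
Putting the three pieces together, the user-space calling sequence would look roughly like the sketch below. This is a sketch under stated assumptions, not a working decoder: 'DEMO' is a placeholder four-character service code (not a real VideoCore service), the callback is a stub, and the SERVICE_CREATION_T fields follow the interface/vchi/vchi.h header in the userland tree, whose field names may differ between versions.

/* Hypothetical sketch of the VCHI calling sequence, built against
 * the userland VCHI/VCHIQ headers. 'DEMO' is a placeholder fourcc. */
#include <string.h>
#include "interface/vchi/vchi.h"

/* local helper: pack four characters into a service id */
#define FOURCC(a, b, c, d) \
    ((int32_t)((a) << 24 | (b) << 16 | (c) << 8 | (d)))

/* invoked by VCHI on events such as an incoming message;
   a real client would pick up decoder output here */
static void service_callback(void *param,
    const VCHI_CALLBACK_REASON_T reason, void *msg_handle)
{
    (void)param; (void)reason; (void)msg_handle;
}

int main(void)
{
    VCHI_INSTANCE_T instance;
    VCHI_SERVICE_HANDLE_T service;
    static unsigned char h264_data[4096]; /* raw H.264 input */

    /* 1. Create a VCHI instance (wraps vchiq_initialise). */
    if (vchi_initialise(&instance) != 0)
        return 1;

    /* 2. Bring up the connection to the VideoCore. */
    if (vchi_connect(NULL, 0, instance) != 0)
        return 1;

    /* 3. Open a service. The fourcc below is a placeholder. */
    SERVICE_CREATION_T setup;
    memset(&setup, 0, sizeof setup);
    setup.service_id = FOURCC('D', 'E', 'M', 'O'); /* placeholder */
    setup.callback = service_callback;
    setup.callback_param = NULL;
    if (vchi_service_open(instance, &setup, &service) != 0)
        return 1;

    /* 4. Queue a bulk transfer to the service
       (wraps vchiq_bulk_transmit, as shown above). */
    if (vchi_bulk_queue_transmit(service, h264_data,
            sizeof h264_data,
            VCHI_FLAGS_BLOCK_UNTIL_OP_COMPLETE, NULL) != 0)
        return 1;

    return 0;
}

Note that on its own this only demonstrates the transport plumbing. The real difficulty, as the question suggests, is knowing which service the decoder sits behind and what message format it expects; the IL component service used by the OpenMAX IL implementation (see interface/vmcs_host in the userland tree) is probably the closest thing to documentation for that.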
Khaled.K

I think OpenMAX gives better performance for multimedia processing. You can compare the performance of the two alternatives simply by running the respective pipelines in GStreamer; no programming is needed for that, since you can use gst-launch, for example with a pipeline like the one shown below. The OpenMAX plugins for GStreamer start with the 'omx' prefix. Encoding and decoding run perfectly through omx while the main CPU carries no load. Implementing H.264 encoding or decoding yourself is a very difficult problem, and without using libraries you could spend many years on it.
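For example (a sketch, assuming gst-omx is installed; the exact element names and the placeholder file name test.h264 depend on your setup and GStreamer version):

gst-launch-1.0 filesrc location=test.h264 ! h264parse ! omxh264dec ! autovideosink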

Xuch
  • Xuch, thank you, but I think you misunderstand the question. I do want to decode using VideoCore acceleration, but without going through the OpenMAX API layer; I want to go directly to the lower layer, the VCHIQ API. CPU load is not an issue. – Alex I Jan 11 '14 at 15:50