
I have a video file, and nearly 3 years ago I dumped its stream info to a txt file with ffmpeg.

...
Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16, 256 kb/s
Stream #0:2[0x1c1]: Audio: mp2, 48000 Hz, stereo, s16, 256 kb/s

But I found the format changed when I used an updated ffprobe (ffprobe version N-78046-g46f67f4 Copyright (c) 2007-2016 the FFmpeg developers).

...
Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 256 kb/s
Stream #0:2[0x1c1]: Audio: mp2, 48000 Hz, stereo, s16p, 256 kb/s

For the same video, the reported sample format has changed to s16p.

I implemented a simple video player that uses ffmpeg. It could play video 3 years ago, but it failed to output the correct PCM stream after I switched to the updated ffmpeg. I spent a lot of time and finally found that the audio should have been s16 instead of s16p. The decoded audio stream works after I add this line before calling avcodec_decode_audio4:

audio_codec_ctx->sample_fmt = AV_SAMPLE_FMT_S16;

but it is just a hack. Has anyone else encountered this issue? How can I make ffmpeg work correctly? Any hint is appreciated. Thanks!

Arton
  • See http://stackoverflow.com/q/18888986/5726027 – Gyan Feb 05 '16 at 14:30
  • I know the difference between s16 and s16p. My question is about ffmpeg reporting different audio info between the old and new versions. My test video is s16, but the new ffmpeg says it is s16p. – Arton Feb 05 '16 at 15:26
  • ffplay can play the video fine, so I think the interface may have changed a lot; I will trace ffplay to find the root cause. – Arton Feb 05 '16 at 17:31

1 Answer


The output format changed. The reason for this is fairly convoluted and technical, but let me try explaining it anyway.

Most audio codecs are structured such that the output of each channel is best reconstructed individually, and the merging of channels (interleaving of a "left" and "right" buffer into an array of samples ordered left0 right0 left1 right1 [etc]) happens at the very end. You can probably imagine that if the encoder wants to deinterleave again, then transcoding of audio involves two redundant operations (interleaving/deinterleaving). Therefore, all decoders where it makes sense were switched to output planar audio (so s16 changed to s16p, where p means planar), where each channel is its own buffer.

So: nowadays, interleaving is done using a resampling library (libswresample) after decoding instead of as an integral part of decoding, and only if the user explicitly wants to do so, rather than automatically/always.

You can indeed set the request sample format to S16 to force decoding to s16 instead of s16p. Consider this a compatibility hack that will at some point be removed for the few decoders for which it does work, and also one that will not work for new decoders. Instead, consider adding libswresample support to your application to convert between whatever is the native output format of the decoder, and the format you want to use for further data processing (e.g. playback using sound card).
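A minimal libswresample sketch along those lines might look like this (error handling omitted; `audio_codec_ctx` and `frame` are assumed to be the asker's AVCodecContext and a decoded AVFrame, and this uses the swr_alloc_set_opts() API of ffmpeg versions from that era):

```c
#include <libswresample/swresample.h>
#include <libavutil/samplefmt.h>

/* Set up once: convert from whatever the decoder outputs (e.g. s16p)
 * to packed s16 at the same rate and channel layout. */
SwrContext *swr = swr_alloc_set_opts(NULL,
    audio_codec_ctx->channel_layout, AV_SAMPLE_FMT_S16,
    audio_codec_ctx->sample_rate,
    audio_codec_ctx->channel_layout, audio_codec_ctx->sample_fmt,
    audio_codec_ctx->sample_rate,
    0, NULL);
swr_init(swr);

/* Per decoded frame: out_buf must be allocated large enough, e.g. with
 * av_samples_alloc() for frame->nb_samples packed s16 samples. */
int out_samples = swr_convert(swr, &out_buf, frame->nb_samples,
                              (const uint8_t **)frame->extended_data,
                              frame->nb_samples);
```

The resulting packed buffer can then be fed to the sound card as before, regardless of which native format the decoder happens to use.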

Ronald S. Bultje
  • But why does the current ffprobe not correctly report the packing scheme used in the old file? – Gyan Feb 05 '16 at 19:15
  • The packing doesn't depend on the file (or codec, e.g. mp2), it depends on the decoder (implementation). Current ffprobe uses current ffmpeg mp2 decoder, which natively outputs s16p for all files. Old ffprobe uses old ffmpeg mp2 decoder, which natively outputs s16 for all files. – Ronald S. Bultje Feb 05 '16 at 19:41
  • This helps me a lot! The description is very clear. Thank you very much! – Arton Feb 05 '16 at 23:55
  • this man blessed us with his knowledge, very good information – cs guy Nov 20 '20 at 21:16