Some Thought on Future FFmpeg Audio API « Kostya's Boring Codec World

Some Thought on Future FFmpeg Audio API

After some discussions on IRC that I’ve participated in, I’d like to present a few thoughts here for future discussion.

The audio API should mirror the video API as much as possible. Currently a decoder outputs 16-bit native-endian audio into a raw buffer.
Introduce audio formats. I’d like to be able to decode an old 8-bit codec into bytes, newer 24-bit audio into 32-bit ints, floats for other codecs if they need them, etc.
Planar formats for multichannel codecs. They will simplify downmixing and channel reordering. (This is not my idea, but it is worth mentioning.)
A swscaler-like structure for format handling and negotiation between audio filters.
Block-based audio processing. Audio should be processed as a whole number of blocks, each with a fixed number of samples (like video is processed by frames and rarely by slices). Why not always one block at a time? Because some formats deliver chunks containing multiple blocks to decode (Monkey’s Audio, Musepack SV8), and some have blocks so small that processing them one at a time causes too much overhead (most speech codecs and (AD)PCM). This is just a bit stricter than the current scheme.
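
To make the points above a bit more concrete, here is a rough C sketch of how sample formats, planar buffers and block-wise processing could fit together. Nothing here is an existing FFmpeg API: every `sketch_`/`SKETCH_` name, the callback signature and the peak-meter example are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample formats (illustrative names, not a real FFmpeg enum). */
enum SketchSampleFormat {
    SKETCH_FMT_U8,   /* old 8-bit codecs, one byte per sample   */
    SKETCH_FMT_S16,  /* today's 16-bit native-endian output     */
    SKETCH_FMT_S32,  /* 24-bit audio carried in 32-bit integers */
    SKETCH_FMT_FLT   /* codecs that naturally produce floats    */
};

/* Storage size of one sample in the given format. */
size_t sketch_bytes_per_sample(enum SketchSampleFormat fmt)
{
    switch (fmt) {
    case SKETCH_FMT_U8:  return 1;
    case SKETCH_FMT_S16: return 2;
    case SKETCH_FMT_S32: return 4;
    case SKETCH_FMT_FLT: return 4;
    }
    return 0;
}

/* Convert interleaved 16-bit samples (L R L R ...) into one plane per
 * channel, which makes reordering or dropping channels a pointer swap. */
void sketch_deinterleave_s16(const int16_t *src, int16_t **planes,
                             int channels, size_t nb_samples)
{
    for (size_t i = 0; i < nb_samples; i++)
        for (int ch = 0; ch < channels; ch++)
            planes[ch][i] = src[i * channels + ch];
}

/* Hypothetical per-block callback: sees block_size samples of every
 * channel, starting at sample index offset. */
typedef void (*sketch_block_fn)(int16_t **planes, int channels,
                                size_t offset, size_t block_size,
                                void *opaque);

/* Walk a decoded chunk block by block. Returns the number of blocks
 * processed; a chunk is expected to hold a whole number of blocks. */
size_t sketch_process_blocks(int16_t **planes, int channels,
                             size_t nb_samples, size_t block_size,
                             sketch_block_fn fn, void *opaque)
{
    assert(block_size > 0 && nb_samples % block_size == 0);
    size_t nb_blocks = nb_samples / block_size;
    for (size_t b = 0; b < nb_blocks; b++)
        fn(planes, channels, b * block_size, block_size, opaque);
    return nb_blocks;
}

/* Example callback: track the peak absolute sample value in an int. */
void sketch_peak_block(int16_t **planes, int channels,
                       size_t offset, size_t block_size, void *opaque)
{
    int *peak = opaque;
    for (int ch = 0; ch < channels; ch++)
        for (size_t i = 0; i < block_size; i++) {
            int v = planes[ch][offset + i];
            if (v < 0)
                v = -v;
            if (v > *peak)
                *peak = v;
        }
}
```

A caller would deinterleave a decoded chunk once with `sketch_deinterleave_s16()` and then hand the planes to `sketch_process_blocks()` together with a callback such as `sketch_peak_block`. A chunk holding several blocks (the Monkey’s Audio case) and a chunk holding exactly one block go through the same loop.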

Now, who wants to implement this?

This entry was posted on Thursday, November 22nd, 2007 at 12:57 pm and is filed under CEmpeg.

3 Responses to “Some Thought on Future FFmpeg Audio API”

Multimedia Mike says:

November 23, 2007 at 5:55 pm

Can you perhaps give an overview on what a successful implementation would entail?
Kostya says:

November 24, 2007 at 11:44 am

Well, mostly it will require a new structure, AVAudioFrame, à la AVFrame, holding channel data, downmixing coefficients, channel positions, and maybe the number of blocks.
It will also need a common API which takes those frames and processes them block by block, changing format, up/downmixing and resampling (like swscaler does).

It’s important to preserve 16-bit short mono/stereo passthrough though.
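
For illustration only, a frame structure along the lines described in this comment might look roughly like the C sketch below. The name `SketchAudioFrame`, all of its fields, the fixed channel limit and the mono downmix helper are assumptions, not an existing or proposed FFmpeg definition.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CHANNELS 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical audio frame, loosely modelled on the comment above. */
typedef struct SketchAudioFrame {
    float *data[SKETCH_MAX_CHANNELS];        /* one plane per channel          */
    int    channel_pos[SKETCH_MAX_CHANNELS]; /* speaker position id per plane  */
    float  downmix[SKETCH_MAX_CHANNELS];     /* weight of each channel in mono */
    int    channels;                         /* number of planes in use        */
    size_t block_size;                       /* samples per block              */
    size_t nb_blocks;                        /* blocks held by this frame      */
} SketchAudioFrame;

/* Downmix one block of the frame to mono using the stored coefficients.
 * dst must have room for f->block_size samples. */
void sketch_downmix_block(const SketchAudioFrame *f, size_t block, float *dst)
{
    assert(block < f->nb_blocks);
    size_t off = block * f->block_size;
    for (size_t i = 0; i < f->block_size; i++) {
        float acc = 0.0f;
        for (int ch = 0; ch < f->channels; ch++)
            acc += f->downmix[ch] * f->data[ch][off + i];
        dst[i] = acc;
    }
}
```

With coefficients stored in the frame itself, a generic conversion layer could downmix any decoder's output without knowing which codec produced it. A 16-bit mono/stereo fast path would simply bypass this step.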
Libsndfile Survey; CAFF | Breaking Eggs And Making Omelettes says:

September 19, 2009 at 12:55 am

[…] multichannel (more-than-stereo) audio very robustly in its present incarnation. Yeah, that’s another item on the TODO list. Check out the complete specs for CAFF, however. I think if we made it a goal to support CAFF to […]