FFhistory: audio « Kostya's Boring Codec World

Today I’d like to talk about the people responsible for specific audio components. For example, FFmpeg had an AC-3 encoder right from the start, but decoding had to be done via a third-party GPLed library, so somebody had to write a native decoder. Or who is responsible for the majority of speech codec support in libavcodec? And who made a terrible audio encoder that was still in use by a major video hosting service (not me)? Read on to find out.

As I said in the beginning, initially FFmpeg had an AC-3 encoder but no native decoder. Decoding had to be done via the GPLed liba52, whose author(s) was/were against relicensing the code (for the same reason libswscale, which used YUV2RGB code from their other project, libmpeg2, had to stay GPLed until that component was rewritten from scratch). So who wrote the native decoder? Justin Ruggles did (based on some Summer of Code students’ work), and he later extended it to support E-AC-3 as well (also with some SoC student help). He also created probably the best open-source AC-3 encoder, called aften, and used some ideas from it to enhance the performance of the libavcodec encoder. Another significant work of his is the FLAC encoder (along with improvements to the FLAC decoder). And once again he went on to create probably the best open-source FLAC encoder, called flake. Let’s not forget about his libavresample for libav (I told its story before).

I’ve mentioned Dénes Balatoni (the author of the native Vorbis decoder) in a previous post, but did you know there was somebody who went and created an encoder as well? It was an Israeli named Oded Shimon. He came from MPlayer and contributed some things, mostly this encoder and some parts of the AAC decoder. It was more of an exercise, and his encoder was not as good as the standard libvorbis, let alone the one with aoTuV tunings. And yet by some oversight it was used in production by the big Y for a while.

Speaking of the AAC decoder, we should not forget Alex Converse. While the initial AAC decoder is the work of many people, he’s the one who brought it into shape and added many important features like support for SBR or ELD. He also tried to improve the AAC bitstream writer to turn it into a proper encoder but failed (not as badly as the person responsible for its creation, but his effort could not improve it enough either).

And I should also mention Vladimir Voroshilov. He may not be responsible for immediately recognizable decoders like AC-3 or AAC, but he is one of the very few people who understand speech codecs. So his legacy is not merely the G.729 and Sipro ACELP decoders but also a whole framework of various DSP routines used by all speech codecs (AMR-NB/WB, QCELP, WMA Voice and so on). Also compare the Sipro ACELP decoder with e.g. the TrueSpeech one: while both were reverse engineered, in the former all parts have clear names, while the latter leaves the impression that it was REd by somebody who had no idea how speech codecs work (and you would be right about that).

Nowadays only the names of Justin and Alex are remembered, I suppose. Other people have been inactive for too long but somebody has to make sure they’re not completely forgotten.

This entry was posted on Wednesday, January 11th, 2023 at 11:31 am and is filed under FFhistory.