The problem with opensource encoders « Kostya's Boring Codec World

The problem with opensource encoders

Disclaimer: this post is about the general situation with existing (and even more, with non-existing) opensource encoders (for both audio and video) and not about the flaws in those encoders.

When I was developing my toy(ish) VP6 encoder, I got questions about it and general encoding technologies from many people (as in “one, two, many” but still it’s above the expected amount of zero). And then I remembered the reasons why there was no opensource VP6 encoder before I wrote one.

The main problem with opensource encoders is the shortage of talented people and the lack of an environment to grow more of them. As a result, those who know how to write or tune encoders keep doing that or move to some other stuff (nowadays most of those who remain active seem to be sucked into rav1e), and those who don’t know how to write encoders have a very hard time learning how it should be done.

Here is a list of people and projects I can remember that are not coming from a company (so e.g. faac, faad and x265 are out) and are for a complex format and with decent quality (e.g. Paul Mahol wrote an SMC encoder but it simply encodes data and you can’t affect quality/bitrate; hopefully he’ll be able to do something more complex after the Bink2 decoder). Oh, and various lossless codecs do not belong here because they’re a dime a dozen (with a couple of notable exceptions):

first of all, there’s Monty with his Ogg Vorbis format (which became excellent after aoTuV psychoacoustic tuning); the Theora encoder was a rather good encoder for a rather bad format, and anything else was either forgotten (Ogg Squish and Ogg Tarkin) or has not materialised (remember the Ghost codec?). And I have no idea what he does nowadays;
there are other people and projects associated with Xiph, I guess most of them are working on rav1e now that Opus is done;
then there’s Michael Niedermayer with a family of H.263-based encoders (based on the work of Fabrice Bellard of course), Snow (see the Theora description above, except this one is based on wavelets) and FFV1 (a really good lossless video codec). IMO he has not done any prominent work after the H.264 decoder even if he remains active in the project;
there’s Justin Ruggles who wrote the best FLAC encoder and the best opensource AC3 encoder (and unlike my stuff, it’s not the best because there’s no competition). I have no idea what he does nowadays;
there’s Rostislav Pehlivanov who wrote AAC and Opus encoders for FFmpeg (whatever was there before deserves the name of AAC bitstream writer). After a while he left the project for a scientific career IIRC;
there’s the XviD project which still gets updates occasionally but I know only Peter Ross from it (who delivers other stuff about as often but it’s worth waiting for);
there’s LAME which seems to have reached maturity years ago;
there’s x264 which is kept alive but its main developers have left for something else;
it’s hard to say anything about exhale since its author seems to work at Fraunhofer on that kind of stuff;
from what I heard the original MPEG+ author left the project for something else (and MPEG+ SV8 did not become popular);
WavPack author David Bryant is still working on improving it and adding more features (like a new interesting way to pack DSD).

I’ve likely missed some people and projects but the pattern seems clear to me: people come, work on it, and then either leave for a completely different activity or life takes over and the project keeps existing but not progressing. There are very few exceptions to that. And the sad thing is there are not enough people to replace them. Why? Because there is no environment for those people to rise from, and that is caused by the lack of accessible knowledge.

Why are people writing their own general-purpose compressors and lossless codecs all the time? Because the information on how to do that is easily accessible: if you want to do it, you can easily find a list of e.g. general compression methods with guides on how to implement them and pointers on optimisation tricks. After all, I was able to do that back in the XXth century, when search engines returned about as many good results as now (but no ads) and I browsed in batch mode (coming to a cybercafe for an hour once per week with a sheet of paper with URLs to check written on it).

Now consider my failed attempt to write an AAC encoder. I had a good idea of how the AAC bitstream is structured, I had some ideas on how audio compression works in this case in general, and there was an annex in the standard describing how encoding is done in the reference encoder, but that’s all. And it turned out to be not enough to write something decent (even with some papers from mp3-tech.org and help from Gabriel Bouvigne himself).

And what if I want to write a modern video encoder? There is no good book or article to start from (or am I missing something?). The books on the topic I could find were mostly collections of methods for different parts of the codec (like motion estimation or transforms) but did not describe how it all should work together. What I could do so far with the VP6 and VP7 encoders is based on the bits I picked up during my work on various decoders, papers on some stuff I knew that encoders use (like particular motion estimation methods) and the stuff referenced from there. I guess you can learn much more from studying the reference encoders and accompanying materials (MPEG or ITU membership may be required though) as well as the referenced papers.

Nevertheless, I consider a book or a paper describing the design of a sufficiently advanced encoder as a whole, with brief explanations of how and why some parts should be done, to be beneficial for everybody: people who want to learn more would at least know where to start, then they could either help existing projects with something beyond fixing typos or go and create their own formats (who knows, maybe we’ll have something competitive then). Currently I feel most such people are frustrated by the entrance barrier, so the only people who get inside are those with too much determination and free time on their hands.

Sadly I don’t expect rav1e developers (as supposedly the most competent people in this area) to write such a guide (let alone x264 developers coming back to do that). With audio encoders I don’t even know who would be a good candidate for such work.

And that’s a part of the problem.

This entry was posted on Sunday, February 20th, 2022 at 1:50 pm and is filed under Useless Rants.

3 Responses to “The problem with opensource encoders”

Paul says:

February 20, 2022 at 2:27 pm

Interesting that you failed to mention my CineForm HD encoder.

BTW I’m busy fixing bugs in decoders like wavpack 32 int depth support, etc. Sloppy programmers make mistakes always.

SMC is toy encoder, RLE one from mighty company you are afraid of.
I’m not so much into writing more useful encoders beside toy ones.

Kostya says:

February 20, 2022 at 2:41 pm

You do a lot of stuff, I mentioned the recent one that I still remember (and yes, the CFHD encoder is a better example but it’s still not advanced enough).

And I write only toy encoders myself (partly because I don’t know how to write a proper one, partly because I don’t care).

Attila says:

February 21, 2022 at 12:49 pm

I think your blog posts, however ranty and lightly worded they might be sometimes, are excellent pieces to help even out that entry barrier into some kind of nicer slope. 🙂
Not to mention NihAV itself! So, thanks for all this, again.