FFhistory: the most annoying format

Looks like the series was misunderstood by the public, especially by those who did not read the prologue and were disappointed by the conclusion. Oh well, I can still post random bits of FFhistory with some inconvenient truths even if nobody is going to read them.

There are many codecs and container formats that are annoying to support: “industrial” formats like MXF have their own Internet of documentation (i.e. lots of various documents referring to other documents, most of them paywalled as well), while other formats are so flexible or so bloated that it’s next to impossible to implement support for all possible features (e.g. JPEG-2000 or the H.264 scalable and multi-view extensions). There are formats that are abused to death (MPEG-TS and MP3 come to mind), and there are formats that are annoying to reverse engineer and attract too little interest (you would not believe how much time it took to support Windows Media 3, from the basic decoder to interlaced mode support; something tells me we won’t see a completed Bink 2 decoder any time soon either). There are formats that require writing an emulator for some system (like the CGDI codec that recorded GDI commands). But there’s yet another candidate that I consider the most annoying one in the whole FFhistory, since it annoyed the project in many different ways.

I’m talking about Macromedia/Ad*be Flash of course. The container format by itself is simple, but its popularity made it an annoyance, there were some annoying hacks for it, and the codecs it supports were annoying in multiple ways as well.

For all you whippersnappers who are hearing about this format for the first time, here’s a bit of general history: back in the mid-2000s, when the first video hosting platforms appeared (two or three of them are still around, by the way), the only way to play video online was with some plugin (this was still before HTML5 and widespread MP4 adoption), and Flash was the most popular one for a number of reasons. Besides offering interactivity it could also play video files in its own format called FLV. So when broadband Internet access became widespread and there was finally a market for streaming videos out there, Flash was a good candidate and the tools that could be used to encode content into FLV became popular as well… It was also powering a lot of annoying animated banners back in the day but that’s a different story.

Before listing the kinds of annoyances it gave FFmpeg I’d like to give a very short technical overview of FLV; those details will come in handy soon. So, FLV by itself is a simple format which can potentially support up to 16 different video codecs (in reality only six were defined) and the same number of audio codecs (well, some tags are used to signal different flavours of the same codec so in reality it’s even fewer). Moving on!
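For the curious, the limit of 16 comes from the codec being signalled in a 4-bit field of the first byte of each audio or video tag body. Here is a minimal sketch of pulling those fields apart. The function names are made up for illustration, and the lookup tables are a partial list as far as I recall from the published FLV specification, so treat them as an assumption rather than a reference.

```python
# Sketch only: helper names are hypothetical, and the tables below are a
# partial list recalled from the published FLV specification.

VIDEO_CODECS = {
    2: "Sorenson H.263",
    3: "Screen Video",
    4: "On2 VP6",
    5: "On2 VP6 with alpha",
    6: "Screen Video 2",
    7: "H.264/AVC",
}

# Several values are just flavours of one codec (see the Nellymoser entries).
AUDIO_FORMATS = {
    2: "MP3",
    4: "Nellymoser 16 kHz mono",
    5: "Nellymoser 8 kHz mono",
    6: "Nellymoser",
    10: "AAC",
    11: "Speex",
    14: "MP3 8 kHz",
}


def split_video_byte(b):
    """Return (frame_type, codec_id) from the first byte of a video tag body."""
    return (b >> 4) & 0x0F, b & 0x0F


def split_audio_byte(b):
    """Return (sound_format, rate_index, size_bit, stereo_bit) from the first byte of an audio tag body."""
    return (b >> 4) & 0x0F, (b >> 2) & 0x03, (b >> 1) & 0x01, b & 0x01


def video_codec_name(b):
    """Look up the codec name for a video tag byte, or report the raw value."""
    codec_id = split_video_byte(b)[1]
    return VIDEO_CODECS.get(codec_id, "unknown/unused (%d)" % codec_id)


def audio_format_name(b):
    """Look up the format name for an audio tag byte, or report the raw value."""
    sound_format = split_audio_byte(b)[0]
    return AUDIO_FORMATS.get(sound_format, "unknown/unused (%d)" % sound_format)
```

With these, `video_codec_name(0x17)` gives `"H.264/AVC"` (a keyframe with codec 7) and `audio_format_name(0xAF)` gives `"AAC"`. Any value that falls through to "unknown/unused" is exactly the kind of free slot people kept trying to claim for their own codecs.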

The popularity of FLV was the first and most obvious annoyance. I witnessed how the #ffmpeg-devel channel on IRC was created exactly because too many clueless users came to the original #ffmpeg channel and asked the same question over and over again: “how do I recode X to FLV?”. Then there was the annoyance of some codecs in FLV not being supported yet, so users constantly asked for them to be supported or when they would be (more on those codecs later). And finally there were people who tried to abuse the unused codec tag values for their own needs, like adding another audio codec or several H.265 flavours. Actually there were two or three attempts (all by Chinese developers for some reason) to push H.265 support in FLV, one of them even before the bitstream format was frozen.

But what about the codecs defined in the specification?

On2 VP6 was rather popular at the time, exactly because of its inclusion among the Flash-supported codecs. So people asking for support were an annoyance, until somebody took the Java applet from the On2 site (it was used to showcase VP6 decoding there), reverse engineered it and put a decoder on SourceForge. Some time later Aurelien wrote a proper VP5 and VP6 decoder using that work as a starting point and FFmpeg finally got VP6 decoding support. On2 people tried to get it removed once but gave up (still, that was annoying). And finally, until H.264 support was added to FLV, people came with requests for an opensource VP6 encoder (fun fact: IIRC On2’s own solution was essentially mencoder with their own encoder DLL, and they sold it for money).

Speaking of codecs with a similar story, there is NellyM*ser ASAO. This is a speech codec that was reverse engineered by some people who preferred to leave MD5 hashes of their names in the copyright notice. Initially it was dumped as a stand-alone project which was then adapted into an FFmpeg decoder (and later an encoder as well, based on work by some other people). Unlike the situation with VP6, employees of that company tried to get support for that codec removed from FFmpeg at least three times, both by public requests and via private talks with the admins. And unlike the On2 folks, who were more worried about plagiarism of the source code, those people did not like the fact that an opensource program could decode and encode their format, but used laughable excuses (like trademark violation IIRC) to get it removed.

There’s also the Flash Screen Video 2 codec, which was annoying to reverse engineer. Back in those days Multimedia Mike worked at Ad*be on the Flash player and even helped edit the SWF/FLV specification, so I asked him about that codec. He replied that it’s easier to look at the compiled code to figure out what it does (and judging by what I heard from other people who had the misfortune to work with the Flash codebase, he’s absolutely correct). Eventually somebody produced an opensource encoder for this format and a decoder was derived from it.

And finally there’s Speex, the only codec in Flash that lacks native support. Yes, all other codecs, standard or proprietary, have been implemented inside the libavcodec framework, but not Speex. This one, despite being opensource, has only one implementation (source translations into a similar programming language do not count). It has been supplanted by Opus, another codec also from Xiph that is also insanely hard to implement without resorting to source code translation (people who have attempted to do that usually implement the Silk decoder and fail at the CELT part; the former was contributed by Skype and the latter is from Xiph, see the pattern?). Having to resort to an external library for an old and (presumably) not particularly complex codec is annoying. Implementing it yourself would be even worse. Update: it turns out that a native Speex decoder was added to FFmpeg about a year and a half ago, which is a bit too late, isn’t it?

I said in the beginning that there are many annoying formats, but as you can see nothing can beat Flash in the sheer variety of the annoyances it produced.

P.S. And there’s the related RTMP* family of streaming protocols, which essentially sends Flash bytecode messages over the network (some of them happen to contain audio and video packets). It has its own share of annoyances because of quirks in different implementations but let’s stop here.

6 Responses to “FFhistory: the most annoying format”

  1. Paul says:

    Speex is native in libavcodec, but whatever do as you please.

  2. Kostya says:

    I missed that, thanks for pointing it out.

  3. Peter says:

    What about the least annoying format? I vote for Sun AU.

  4. Kostya says:

    If the format is not annoying it’s hard to remember it. Your candidate is as good as some others (which I have trouble thinking of for the reason stated above).

  5. Daemon404 says:

    You’re aware a consortium of mega-corps has banded together now to resurrect the corpse of FLV and RTMP (via its new spec owner)?

    The changes are actually even more half assed and poorly thought out than you’re imagining, too… there are even comments like ‘// Used once in a fork of FFmpeg’ or somesuch.

  6. Kostya says:

    You can be envious of me, but I have not heard about any such protocols being developed. Actually after I’d made a working RTMP receiver implementation during SoC I didn’t have to care about any streaming protocol really. At least you enjoy such eldritch things.