On Opensource Projects Support

April 22nd, 2016

Today I’d like to talk about how opensource projects are supported by the “community”, using FFmpeg/Libav as an example.

Obviously I’ve chosen this example because I know some facts about its internal politics and because libavcodec is the de facto multimedia decoding standard, used on every platform and by most multimedia processing tools out there.

What good did it bring to the project(s)? Some fame but that’s probably it: the largest users don’t even bother to acknowledge in public that they use it (cough, BaidUTube, cough). There’s an enormous amount of code (that serves as a good compiler test suite too) but it’s maintained mostly by volunteers and people who have to use it at work (as they were hired because they worked on it in the first place). That’s it: the best material gain is employment because you’ve shown off your skills, or occasional consulting work with varying quality of tasks and pay. Some people are paid to improve or write a new decoder. Some are hired to work on improving protocol support and hardly paid at all (true story). Some simply sigh at yet another “please implement this for my app” mail. That’s good but where is the money to pay for the tasks the project itself needs (e.g. refactoring old horrible code, adding tests, implementing a new feature or fixing some old problem)?

There was an attempt to set up a foundation to gather money and use it for the project but it didn’t work well even before the split and got completely derailed after; and of course it was not a good idea to set it up in the USA since the IRS refused to recognize it as a non-profit organisation, as other open-source projects have experienced (including X.org). The best part is how much money it could raise: I don’t remember the actual sum but it was relatively low, like less than $20,000 (please correct me if I’m wrong), and it came mostly from caught (L)GPL violators IIRC.

Let’s take a successful opensource project that uses libavcodec, that would be VLC. For example, the last VideoLAN Developer Days definitely cost tens and tens of thousands ${proper_currency}: accommodating about a hundred people, reimbursing (at least some of) their travel costs, food etc etc; the event was sponsored by the largest Internet advertising company, the largest French advertising company and the largest French VideoLAN advertising company. I remember talks that they could even employ one developer full-time. And would VLC be any useful without libavcodec? And while they probably are the biggest opensource supporter of FFmpeg (they host their Git repository after all) and their developers write some code from time to time, they hardly do anything else: there is a bounty program but it’s a complete joke since it lists mostly done tasks nobody will claim a reward for (and despite me pointing them to that fact nothing has been done). And obviously it doesn’t have tasks that would be beneficial for FFmpeg/Libav but not for VLC directly.

Let’s take a look at some commercial user. All those video hosting sites are available mostly because there’s enough bandwidth to stream video and because there is free code to support most of the formats uploaded by users so they don’t have to bother about details. And that free code is… (no points for guessing). And the biggest video hosting site was mentioned above. They don’t provide any support nor even acknowledge that they use certain opensource projects. Of course one can point to the Baidu Summer of Code program but it’s for involving students in opensource (and probably finding fresh meat for themselves) and it doesn’t work that well for projects: you mostly get students wanting money and/or a line for their résumé. They tend to do the task and disappear completely. I’m sure that projects would prefer to have the financial freedom to sponsor specific tasks that could be undertaken by anyone (not just students) and last whatever time is needed (not just a summer). If the project gets some support from large companies it’s usually because a developer from that project working in that company convinced the management to do so.

Even worse is the situation with distributions because they tend to demand free technical support from you: fixing your own bugs, fixing other projects that use your code and such. What do you get in return? Nothing. The recent Debian and Xscreensaver messup is not some special example, it’s too typical.

It’s funny how the best sponsor for Libav might be Lu_minem.it, a small Italian company whose founder is a Libav developer and thus knows what the project needs.

How do I fit into all of this? I was an FFmpeg and then a Libav developer; later multimedia-related work found me because of my work on decoders (I’d been trying to find a good job myself but failed and had to accept the best offer I got). I was around the core developers of the projects and thus could learn the facts written above. What have I got besides development experience, acquaintances and a pessimistic outlook on life? BSoC 2006-2009 participation (where I really managed to finish only one project in time, despite formally passing) plus some smaller sum of money for rewriting a component of swscale to make it LGPL. And some free dinners from the VideoLAN foundation. So I’m fine but seeing that the project cannot afford to pay a developer for some internal project (like writing a new RealMedia demuxer) is still very sad.

On multimedia player names

April 17th, 2016

Warning: if you do not recognize names mentioned here you might be too young.

I’ve been using a computer for about nineteen years and during that time I tried various players for various formats. My curiosity for internal format design made me search for information about compression methods, source code for decoders and such. So it led me to the current state (doing nothing). Yet for the many multimedia players I know there are some naming issues and that’s what I want to talk about.

My first versatile player was PLAYSND.COM by Yuri Tumarin. This 13kB DOS program could play a lot of various sound formats like WAV, VOC, MIDI variants and Adlib tracker music (RAD, HSC and lots of other variants). The best part is that it played some compressed WAV files too (various ADPCM variants and more). Excellent tool but the name is too bland and hard to search for.

Speaking of Adlib tracker formats, there’s an opensource player with an Adlib emulator supporting lots and lots of them. The problem with it (besides being outdated now)? It’s called adplug. A good name to be blocked by a generic rule!

Let’s move to video players.

My first Linux player for VideoCDs was MpegTV. It was a commercial program but again, it was a country where nobody bothered about piracy and it was the only player on Linux I knew that could decode VideoCDs without stuttering. The player was doing its task fine but its name is rather cringeworthy.

Then I found out about DVD-oriented players like Ogle and Xine. Good names. And I still use Xine sometimes when I need to play DVDs.

And the golden standard of multimedia on Unix systems: XAnim. The only bad thing is that the last time I checked it didn’t work correctly in 32-bit X11 mode. But it did its job well and I’m still grateful it exists (and also its binary plugins were a good binary specification for missing codecs).

And there’s MPlayer. It was fast, it had many useful features and codecs supported (I still use it as a testbed for running some 32-bit VfW/DMO codecs when I cannot write a decoder without debugging) but its codebase was horrible (in some cases legendarily horrible) and the name is both bland and reminds me of mplayer32.exe (which crashed and hung a lot too).

And one of its forks is named after one of the horrible chunks in libavcodec and its author operates under a pseudonym. So was it really worth it to name the player MPV?

And I conclude this review with a well-known multimedia player that I won’t use. When I think about VLC the first meaning coming to my mind is variable length codes. And when the only good thing about your player name is the number of puns you can make, I’d use something with a more decent name, thank you very much.

TwilightMotion Saga: The End

April 17th, 2016

I’ve finally documented what I know about VP4 in the wiki and I should unload it from my memory. Implementing decoders and such is left as an exercise for the TrueMotion-loving reader.

Probably I’ll look at ClearVideo (for the N-th time) or some speech codec suite. Funny thing is that even if they market it as a single speech codec you have a good chance to find several codecs for different bitrates (like for Lernout & Hauspie you have CELP for 4.8 kbps and SBC with different parameters for 8, 12 and 16 kbps) and don’t get me started on VoxWare MetaSpeech (don’t confuse it with MetaSound—that one is not a speech codec or with MetaAudio—that one doesn’t exist), that’s the rant for another day.

TwilightMotion Saga: Random pre-VP3 Bits

April 16th, 2016

TrueMotion 1 was licensed and has several variants outside the usual TM1. There’s allegedly Horizons PowerEZ but only j-b would know anything about it (because it’s vintage and used to code content he’s interested in, of course). The other version was used for the intro and victory cutscenes in the 3DO version of Star Control II: Ur-Quan Masters; the source code is available so any Mike Melanson out there can have a look at it. To me it looked like the same coding algorithm but with custom delta tables and codebooks provided. Oh, and the data is split between several files (global header, codebook, frame data and offsets to individual frames).

TrueMotion 2 Realtime seems to be really TrueMotion 1.2 Realtime Edition. Its header format is quite similar to TrueMotion 1 (same obfuscation even) but with some values that would make a TM1 decoder bail out on error, and it was released before the actual TrueMotion 2.

TrueMotion 2X seems to return to the coding method from TM1 as well since there’s a suspicious similarity between its inverse Huffman coding method (they call it “string encoder” which sounds even more confusing) and the codebook used in TM1, except that in TM2X they use 0x80 as the end of data flag instead of 0x01.

P.S. I should really move to VP4 and then away from this codec family altogether.

A Quick Look on IMM4

April 10th, 2016

So I’ve spent an hour or so looking at IMM4.

What do you know, it’s a very simple IDCT codec with interframes. Intraframes have only DCT with the usual run-level VLC coding; interframes have a per-macroblock flag telling whether the macroblock is skipped, coded as a difference to the previous frame, or coded as an intra block. See, no motion vectors, quantisation is a single value per block (except for DC in intra blocks), and there seems to be no zigzagging either. You cannot get much simpler than that.
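To show how little machinery that leaves, here is a rough Python sketch of such a scheme. It is not the actual IMM4 bitstream: the function names, the (run, level) pair representation and the mode names are all made up for illustration, the VLC reading and the inverse DCT are left out, and the special DC handling in intra blocks is ignored.

```python
def place_run_level(pairs, quant, size=64):
    """Expand (run, level) pairs into a coefficient block.

    Coefficients go into plain raster order (no zigzag table) and a
    single quantiser is applied to the whole block.
    """
    coeffs = [0] * size
    pos = 0
    for run, level in pairs:
        pos += run                      # skip `run` zero coefficients
        if pos >= size:
            raise ValueError("run-level data overruns the block")
        coeffs[pos] = level * quant     # dequantise with the block quantiser
        pos += 1
    return coeffs


def clip8(value):
    """Clamp a reconstructed sample to the 8-bit range."""
    return max(0, min(255, value))


def reconstruct(mode, prev_pixels, decoded_pixels=None):
    """Rebuild one block of pixels in an interframe.

    `mode` is a hypothetical name for the three cases: "skip" copies the
    previous frame, "diff" adds a decoded difference to it, "intra"
    replaces it. `decoded_pixels` stands for the output of the inverse
    DCT, which is not shown here.
    """
    if mode == "skip":
        return list(prev_pixels)
    if mode == "diff":
        return [clip8(p + d) for p, d in zip(prev_pixels, decoded_pixels)]
    if mode == "intra":
        return [clip8(d) for d in decoded_pixels]
    raise ValueError("unknown block mode: %r" % (mode,))
```

Note there is no motion vector anywhere: every mode works on the co-located block of the previous frame.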

TwilightMotion Saga 2X

April 9th, 2016

Okay, now it should be the last post about TM2X.

It’s hard to believe but it looks like there were at least five versions of this codec that can be distinguished by the chunk ID where frame information is stored (I have a decoder for versions 1-5 and all known samples are version 4). So in version 5 they’ve added coding of motion vectors for 8×8 blocks in various forms including quadtree (and that’s what confused me). It looks like there are tile dimensions stored in the configuration chunk (0xA0000109) and the codec operates on those.

Again, it looks like the decoder first calls a function to determine what to do with a row of blocks and then the corresponding functions to decode (sub)block data. And I was confused by those too: some of the functions read luma and chroma, some functions read only chroma and some read luma, chroma and two other unidentified values of different types (so it’s not a motion vector). They always have 2 luma samples (if present) and 1/2/4/8 chroma samples. Or is it the other way round with two chroma samples and 1-8 luma samples?

What the Duck, On2, couldn’t you opensource TR20 and TM2X/TM2A along with TM1, TM2 and TM VP3 (and they were all in the same package, mind you)?

In any case I’ll try to forget it again, there’s still VP4 (aka AOM codec -5).

How the codecs should emerge (hint: without .ebuilds)

April 6th, 2016

So it has come to this, some events and discussions made me write this post.

How do I imagine the perfect process for new codecs? It’s a rather simple model: you have some places where ideas and enthusiasts swarm, and from their work and from selecting the best ideas a new candidate codec is born.

There are such places for all codec types: audio enthusiasts can find testers at Hydrogenaudio, video enthusiasts can talk at Doom9, general and image compression people seem to be present at encode.ru. To a first approximation it works as expected: people propose ideas, test new compression programs and report benchmarks, suggest improvements. What can be wrong there? Just one thing: people making software incompatible with anything else (custom containers/archive formats) and trying to push it on everybody. After you invent some format make sure it works in some standard environment (for compressors it’s usually single file compression mode, .tar.xz seems to be more popular than .7z even if they use the same LZMA algorithm; for codecs it should be a standard container, even Matroska would do). And document the format too, properly instead of at the usual “bug off” level.

There are standardised codecs that undergo a similar process: various companies or researchers submit their work, a base for a new standard is chosen, new proposals try to improve it. And then companies start to push their patented shit there and that’s where the system goes wrong (QMF in MPEG Audio Layer III anyone?). It’s not better when some company tries to push its product as a standard without any evaluation (and thus we get the wonderful line of SMPTE VC-x codecs for instance).

And there’s OggXiph. This is again a community that designs codecs mostly because they can and pushes them mostly because they’re Free™ and OpenSource™, and they mostly suck otherwise: the Ogg format is for streaming and not good for anything else, most people still don’t know that it’s Ogg/FLAC because it was developed outside (and has a horrible raw stream format), Speex has no readable specification and is easier understood by disassembling the library than by reading the source code, Theora is an outdated enterprise grade code, Opus has its issues (but it’s rather good, one cannot deny that), Daala will probably never happen.

And what do I see in recent news? Alliance for Open Media plans to release the first draft of their codec soon and it is:

  • hosted on baidusource.com;
  • for now just libvpx with some name changes;
  • everything else about it screams Baidu too.

If it looks like Duck, produces codecs like Duck and has the same source code as Duck, then it probably is DuckOn2Baidu.

At least in the old times there was some competition of ideas in codecs so one could choose between different codecs giving good results, and in some cases they were available for various ecosystems too (e.g. Indeo was present in AVI and MOV, ClearVideo managed to get into AVI, MOV and RM). Now it’s just a foam of lossless codecs that even their authors forget about by next year and one or two companies pushing their stuff on everybody. And that makes me sad.

TM2X Woes

April 3rd, 2016

I don’t know what I should write about this codec.

TM2X (or TM2A, they are really identical) differs in design from TM2 Vanilla. The main principle seems to stay the same for TM2, TM2X and TM2RT: they all operate on delta coding from the previous delta and top neighbour. But while for TM2 it’s always 4×4 blocks and for TM2RT it’s the whole plane, for TM2X it seems to be a variable block size (i.e. it can be an 8×8 block or even larger). TM2 uses classical Huffman coded data (with tree description and such), one per each block type; TM2RT uses fixed size deltas (2-, 3- or 4-bit); TM2X uses inverse Huffman lists (i.e. each byte codes a list of values which you’re supposed to read sequentially). And for TM2 there was source code (horrible C but source code nevertheless), TM2RT had a compact and rather sane binary specification, TM2X has only an insane binary specification. How insane? For starters, it uses obfuscation for some chunks that’s tedious to undo by hand (unlike TM2RT), its internal design relies on calls through an array of virtual functions, and those seem to treat esp as “Eh, Structure Pointer” which will confuse any decompiler.
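For those who haven’t met the inverse Huffman idea before, here is a minimal Python sketch of it as described above: the input symbol has a fixed size (one byte) and expands to a variable-length list of values, the opposite of ordinary Huffman coding. The codebook contents and all names are invented for the example; the real TM2X tables, its obfuscation and its end-of-data flag are not reproduced here.

```python
# Hypothetical codebook: one list of values per input byte value.
# The real tables are not reproduced, these entries are made up.
CODEBOOK = {
    0x00: (0,),
    0x01: (1, -1),
    0x02: (0, 0, 0, 0),
}


def inverse_huffman_values(data, codebook):
    """Yield decoded values one at a time.

    Every input byte has the same size but expands to a whole list of
    values, which the caller then consumes sequentially.
    """
    for byte in data:
        yield from codebook[byte]


values = inverse_huffman_values(bytes([0x01, 0x02, 0x00]), CODEBOOK)
first_three = [next(values) for _ in range(3)]  # e.g. a block wanting 3 values
rest = list(values)                             # whatever is left in the stream
```

The point of the generator is that block decoding functions can pull as many values as they need without caring where one byte’s list ends and the next begins.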

Thanks to all that insanity I was unable to reconstruct all the decoding logic but at least some facts seem to be more or less clear:

  • decoding seems to vary greatly depending on decoder configuration provided in corresponding chunks (since those values are used to build function pointer arrays);
  • there are lots and lots of block decoding functions that read different numbers of deltas per 8 or 16 pixels, e.g. there can be 3 or 5 deltas per 8 pixels;
  • all decoding functions use the same inverse Huffman list but there are different ways to remap its output: there are delta value mapping tables for luma and chroma, generic value decoding uses special escape value to signal that its decoding is not done yet etc;
  • motion compensation indeed uses halfpel precision.
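The first two points can be pictured with a small Python sketch: configuration values select entries from a set of block readers, and each reader pulls a different number of deltas from the shared value stream. This is only my guess at the general shape, not reconstructed decoder logic; the mode numbers, the function names and the choice of 3 and 5 deltas as the only readers are assumptions made for the example.

```python
def make_block_reader(deltas_per_block):
    """Build a reader that pulls a fixed number of deltas per block."""
    def read_block(values):
        return [next(values) for _ in range(deltas_per_block)]
    return read_block


# Hypothetical mode numbers mapped to readers taking 3 or 5 deltas.
READERS = {
    0: make_block_reader(3),
    1: make_block_reader(5),
}


def build_dispatch(config_modes):
    """Turn configuration values into a list of block decoding functions."""
    return [READERS[mode] for mode in config_modes]


def decode_row_of_blocks(dispatch, values):
    """Run each configured reader in turn over one shared value stream."""
    stream = iter(values)
    return [reader(stream) for reader in dispatch]


# Two blocks, the first reading 3 deltas and the second reading 5.
dispatch = build_dispatch([0, 1])
blocks = decode_row_of_blocks(dispatch, range(8))
```

With a layout like this the bitstream alone tells you nothing about how many values belong to a block, which is exactly why the configuration chunks matter so much.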

So I’ll probably just forget about this codec and move to VP4 and then forget about all these turkeyduck codecs. I fear that ClearVideo will be abandoned at a similar level too. Well, at least there are a lot of speech codecs to talk about.

TrueMotion 2 RealTime

March 30th, 2016

I’ve been reminded that this variant of TrueMotion exists too. What do you know, it’s actually somewhat like TrueMotion 2 NoModifiers.

Essentially it’s just another fixed packing scheme like Creative YUV, Cirrus Logic CLJR or Aura. You have left prediction, deltas coded with nibbles, the usual stuff (at least blocks in TM2 were coded similarly). The only peculiar thing is that it codes data by planes with chroma planes being coded first.
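A minimal Python sketch of that kind of packing, for one row of one plane: each nibble picks a delta from a table and the delta is added to the pixel on the left. The delta table, the nibble order (high nibble first) and the starting predictor value are assumptions made for the example, not values taken from the codec, and the plane ordering is not covered.

```python
# Hypothetical 16-entry delta table indexed by a 4-bit code.
DELTAS = [0, 1, -1, 2, -2, 4, -4, 8, -8, 16, -16, 32, -32, 64, -64, 128]


def unpack_nibbles(data):
    """Split every byte into two 4-bit codes, high nibble first (assumed)."""
    for byte in data:
        yield byte >> 4
        yield byte & 0x0F


def decode_row(data, width, start=128):
    """Decode one row using left prediction with nibble-coded deltas.

    `start` is the assumed predictor for the first pixel of the row.
    """
    nibbles = list(unpack_nibbles(data))
    if len(nibbles) < width:
        raise ValueError("not enough data for a row of %d pixels" % width)
    pred = start
    row = []
    for code in nibbles[:width]:
        pred = (pred + DELTAS[code]) & 0xFF   # wrap around to 8 bits
        row.append(pred)
    return row
```

For example, `decode_row(bytes([0x13, 0x20]), 4)` applies the deltas 1, 2, -1 and 0 to the starting value 128.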

I hope to add a detailed description of this codec to Multimedia Wiki by the end of this week and then forget about it again.

OptimFROG

March 26th, 2016

You know, the greatest reverse engineer I know is Derek B. He’s managed to RE such codecs as Canopus HQX and Cineform HD in the most efficient manner ever—saying he’ll do it and patiently waiting until somebody else does it.

So here are some words about his favourite lossless audio codec. The most interesting thing about it is that it was actively developed in 2001-2006 and then it was suddenly resurrected in 2015. Also it’s one of the few non-standard codecs (i.e. not made into a standard) that has several articles written about it.

The codec actually consists of two different formats, seemingly an old one and a newer one (that looks like it supports the whole range of sample types). The former is notable for having a signal reconstruction stage using floating point math (a thing you don’t see in codecs every day), the latter seems to employ various parameter reading and reconstruction methods. Coding is done using a low-precision range coder (large values are decoded using chunks of 8 or 12 bits). So nothing really interesting there.

P.S. I’m definitely not going to write a decoder for it. There are too many lossless audio codecs already, let all proprietary ones (in custom containers too) die in peace.