April 13th, 2014

Finally I’ve found some time to play with the i.MX6-based Utilite, which I intended to use as a home box for various stuff (like running fetchmail, irssi, a simple web server etc.; in other words, not a desktop). So here’s a quick review:

  • does not work with my display (1920×1200, DVI input)
  • does not allow logging in via SSH (it refuses passwords) and the same problem with sudo later
  • does not have IPv6 enabled (not a grave problem but my provider has moved to IPv6 already)
  • serial port works as well as a telegraph in a magnetic storm (honestly, it gives all types of characters on the terminal except the ones you can read, let alone want; typing one character per minute is somewhat better)

I might be really old but this is not a development board (at least it’s positioned as a desktop) so I expect it to work. And unlike the previous product by the same company one cannot blame it on hardware: it’s i.MX6, not Tegra2.

So I’ve ordered Cubietruck already (I have Cubieboard2 at work and it has been running fine right from the start).

P.S. Raspberry Pi can go to hell.

A Bit More about UnclearVideo

April 12th, 2014

So, in the course of cleaning-up program I’ve looked at ClearVideo again.

Intraframe decoding was REd a long time ago: it’s merely DCT-coded blocks. Intraframes occur quite often, so it’s possible to watch such files somehow.

Interframes coding is what makes this codec unique. And if you expected real fractal coding then you’re wrong (and that’s expected—processing power needed to restore even 320×240 frame from seed by applying iterations might be too much even for today). Instead you have quadtree with square blocks. Do you know what information it stores there? Subtree flags (i.e. if you need to divide it further), motion vector and brightness control. So the block is copied and its brightness might be adjusted too, that’s all. There’s no residue coding. And tree information can be coded in four different ways (I’ve seen four out of about six different tree decoding functions being employed), properties are coded with variable-length codes that I fear may be autogenerated.
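The leaf operation of that quadtree (copy a block, optionally adjust brightness) can be sketched in C as below. This is only an illustration under assumptions: the names `clip_u8` and `copy_block_bias`, the single 8-bit plane layout and the additive bias with clamping are all hypothetical and not taken from the codec itself.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Copy a size x size block from the previous frame, displaced by the
 * motion vector (mx, my), into the current frame at (x, y) and add a
 * brightness bias to every pixel.
 * Assumption: the caller guarantees that both the source and the
 * destination blocks lie fully inside their planes.
 */
void copy_block_bias(uint8_t *dst, const uint8_t *prev, int stride,
                     int x, int y, int size, int mx, int my, int bias)
{
    assert(size > 0 && stride >= size);
    for (int j = 0; j < size; j++) {
        const uint8_t *src = prev + (y + my + j) * stride + x + mx;
        uint8_t *out = dst + (y + j) * stride + x;
        for (int i = 0; i < size; i++)
            out[i] = clip_u8(src[i] + bias);
    }
}
```

Since there is no residue, a decoder built this way would do nothing more per leaf than this copy; all the complexity sits in parsing the tree.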

So what prevents me from finishing it? A very horrible design. It’s written in Very Objective C++ dialect. Every piece of code is wrapped into its own class with possible overloading, so instead of calling a function you retrieve class pointer from some subclass of current context and invoke function from vtable there. Which will not be the real function but rather a thunk that will jump to the real function (even better, some functions calling themselves do it through that thunk too). And yet they use global variables! This is impossible to analyse with static analysis (like I mostly do), it’s nightmare for debugger as well (but a hack to MPlayer loader that displayed calls and indirect jumps was very useful here) and I suspect it won’t get much better even if I get original source code (which I’ll never get of course).

A small example of the design: tree decoding uses several GetBitContexts stored in a more generic TreeDecodingContext, which is a subclass of the FrameType2 context, which is a subclass of DecoderContext. And I’m pretty sure I forgot some levels of indirection in between.

Time to give up? Time to give up.

A Bit More on Security

March 27th, 2014

This is a translation of this page by an unknown author. It’s rather old but recently I remembered it for some reason and decided to share.

Day One

A hacker comes to a diner and sees that salt shaker can be opened by anyone and anything can be put inside. The hacker comes home and writes a letter to the diner manager: “I, meG@Duc, have found a salt shaker vulnerability in your diner. A malicious person can open it and put poison there! Please fix it!”

Day Two

Diner manager gets that mail along with other correspondence and shrugs: “What an imagination”

Day Five

The hacker comes to diner and puts some poison into every salt shaker. Three hundred people are dead, a criminal case against manager is closed after three months because there was no crime from his side. The hacker writes a letter “see now?”.

Day Ninety Six

Manager orders special salt shakers with a combination lock. Diner guests feel that they don’t get something.

Day Ninety Seven

Hacker discovers that holes in salt shaker pass salt in both directions (and other substances too). He writes a letter and pisses into every salt shaker. Three hundred people don’t come there ever again, thirty people go to hospital with poisoning. Hacker sends manager an SMS “How d’ya like it?”. Manager spends three months being interrogated and a year on probation.

Day 188

Manager vows never to work at any diner ever again and be a lumberjack instead. Engineers are working on one-way salt shaker design. Meanwhile waiters remove all old salt shakers and give salt on demand.

Day 190

The hacker steals a salt shaker and researches it at home. Then he writes a new letter to the manager: “I, meG@Duc, stole a salt shaker and find this outrageous! Anyone can steal a salt shaker from your diner!” The hitherto abstinent manager goes home and drinks vodka.

Day 193

The hacker discovers that all salt shakers are chained to the tables. He talks about his achievements at the next hacker conference and receives an award for protecting society and customers’ needs. At least the manager doesn’t find this out.

Day 194

All hackers from the conference make a devious plan. They go to the diner and take all salt from shakers. meG@Duc then writes another complaint about low customer service and that anyone can deprive everyone else of salt.

Thus a new salt shaker design is needed. Engineers are working on it while waiters still give salt on demand. Manager goes abroad and uses room service only: no cafes, bars or restaurants.

Day 200

Customers discover that in order to get salt they have to call waiter, show their ID and get special 8-digit one-time code for a salt shaker. Repeat the same for pepper.

All Containers Suck

March 25th, 2014

It’s pretty obvious but I got requests to write this nevertheless.

All known containers suck, some of them suck gloriously, some of them plainly suck. And there’s Ogg Matroska Ogg.

There are several features that distinguish container usefulness:

  • flexibility (supporting various codecs and number of streams);
  • easy to parse;
  • well-defined specification (there must be a format with such thing);
  • metadata support;
  • low overhead (bytes needed to define frame size and other properties);
  • advanced features for insane people.

Now let’s review containers grouped by design.

Raw or raw with header. Those are the simplest and codec-specific. Besides being designed (usually) for only one stream and one codec, they often decide to save bits on frames and as a result you have a hard time implementing seeking (say hello to FLAC or Moosepack SV7). Some have a seek table at least (old Monkey’s Audio has two: for byte and bit position).

Your favourite FLV belongs to this category — it has one audio and one video stream with no headers (and that’s why it has its own flavour of VP6 with frame dimensions stored at every frame) though one can abuse it to add a data stream. And of course some Chinese used it to store HEVC too in the stupidest way possible (for starters they have introduced half a dozen of different video codec IDs for it).

Chunk-based. The most popular category that refuses to go away. The best representative is RIFF (M$ ripoff of EA IFF format, there are many specific RIFF variants known — AVI, RMF, WAV, WebP) and runner-up is MOV/MP4. AVI is verily the pinnacle — flexible, extensible, every frame is its own chunk. What can go wrong with it? The usual thing: abuse. Too many idiots implemented their own AVI writers with whatever bugs they could introduce and it got even worse when codecs started to employ B-frames. Intel worked around by adding combined I+B-frame and dummy frame afterwards so decoder would handle it internally (you can see it both in Indeo 4 and their I.263). DiVX on the other tentacle… And variable framerate is not for AVI either (unless you simply use zero frames to define skips).
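To show why chunk-based formats look so easy to parse (and how little stands between a reader and a buggy writer), here is a minimal sketch of walking RIFF-style chunks: a 4-byte tag, a 4-byte little-endian payload size, the payload and a pad byte after odd-sized payloads. The function names are hypothetical, and the sketch deliberately treats any chunk that runs past the buffer as the end of usable data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value. */
static uint32_t rl32(const uint8_t *p)
{
    return p[0] | (p[1] << 8) | (p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk the chunks in buf[0..len) and return how many complete chunks
 * were found. The walk stops at the first chunk whose declared size
 * (including the pad byte for odd sizes) does not fit in the buffer.
 */
size_t riff_count_chunks(const uint8_t *buf, size_t len)
{
    size_t pos = 0, count = 0;

    assert(buf != NULL || len == 0);
    while (len - pos >= 8) {
        uint32_t size = rl32(buf + pos + 4);
        uint64_t padded = (uint64_t)size + (size & 1);

        if (padded > len - pos - 8)
            break;                  /* truncated or lying size field */
        pos += 8 + (size_t)padded;
        count++;
    }
    return count;
}
```

A real AVI demuxer additionally has to descend into LIST chunks and cope with writers that omit the pad byte or store wrong sizes, which is exactly where the abuse mentioned above starts to hurt.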

As for MOV/MP4 there seems to be a problem with parsing custom atoms (there are too many atom types around). And of course you have nice abuse like ASF packets stored inside MOV packets if you use Flip4Mac.
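Skipping atoms you do not understand is at least mechanically simple, since every atom starts with a 32-bit big-endian size followed by a 4-byte type (a size of 1 means a 64-bit size follows, a size of 0 means the atom runs to the end of data). A minimal sketch of a top-level scan, with the hypothetical names `rb32` and `mov_find_atom`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value. */
static uint32_t rb32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
}

/*
 * Return the offset of the first top-level atom of the given type,
 * or -1 if there is none or the data is malformed.
 */
int64_t mov_find_atom(const uint8_t *buf, size_t len, const char *type)
{
    size_t pos = 0;

    assert(buf != NULL || len == 0);
    while (len - pos >= 8) {
        uint64_t size = rb32(buf + pos);
        size_t hdr = 8;

        if (size == 1) {            /* 64-bit size follows the type */
            if (len - pos < 16)
                break;
            size = ((uint64_t)rb32(buf + pos + 8) << 32) | rb32(buf + pos + 12);
            hdr = 16;
        } else if (size == 0) {     /* atom extends to the end of data */
            size = len - pos;
        }
        if (size < hdr || size > len - pos)
            break;                  /* malformed atom, give up */
        if (!memcmp(buf + pos + 4, type, 4))
            return (int64_t)pos;
        pos += (size_t)size;        /* unknown atoms are simply skipped */
    }
    return -1;
}
```

The hard part is not the skipping but knowing which of the many atom types actually matter for a given file.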

And if you replace chunks with an unholy mix of tags and UIDs you get MXF. That format doesn’t have a specification but rather a swarm of them so you don’t know which ones you’ll need to demux some file.

There’s NUT, probably the only format out there with two specifications and three or four implementations, none of them agreeing with the others.

MPEG-TS inspired. MPEG-TS is so overengineered a container format that nothing in this world would be able to demux a TS file with all possible features. And forget about seeking (unless you have an external index or build an index yourself).
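For what building an index yourself involves at the very lowest level, here is a rough sketch that scans fixed 188-byte packets (sync byte 0x47, 13-bit PID) and records where payload units start for one PID. It assumes the buffer begins on a packet boundary, ignores the 192- and 204-byte packet variants and does no resynchronisation; the name `ts_build_index` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_PACKET_SIZE 188

/*
 * Store the byte offsets of packets that belong to the given PID and
 * have the payload_unit_start flag set. Returns the number of offsets
 * written (at most max_offsets).
 */
size_t ts_build_index(const uint8_t *buf, size_t len, unsigned pid,
                      size_t *offsets, size_t max_offsets)
{
    size_t n = 0;

    assert(offsets != NULL || max_offsets == 0);
    for (size_t pos = 0; len - pos >= TS_PACKET_SIZE; pos += TS_PACKET_SIZE) {
        const uint8_t *p = buf + pos;
        unsigned cur_pid;

        if (p[0] != 0x47)
            break;                  /* lost sync; resync is not handled */
        cur_pid = ((p[1] & 0x1Fu) << 8) | p[2];
        if (cur_pid == pid && (p[1] & 0x40) && n < max_offsets)
            offsets[n++] = pos;
    }
    return n;
}
```

Note that a payload unit start is not a keyframe: a usable seek index would still need timestamps and codec-level knowledge on top of this, which is the whole complaint.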

Of course such design inspired a lot of other formats that have some features of it but often those features are used without understanding why they are there. But result is good for streaming!!!1one

There’s ASF with crazy GUIDs for everything and fixed packet size (which means there’s no direct correspondence between ASF packet and stream packet anymore).

And there’s Ogg. Read this if you still haven’t.

Matroska. That’s a cancer — when you design container that should be able to contain everything and support any feature possible and it gets out of control you get Matroska. It’s based on binary XML, it can have any feature. And it stores every codec in its unique way — see what they call codec specs. So they save bytes here and there and demuxer should put them back, which is not nice, especially if you believe that demuxers and decoders do not need to know about each other.

If you wonder why I haven’t mentioned RealMedia, it’s because this format is an unholy mix of all categories:

  • Old RealAudio is rather simple raw + header;
  • RealMedia in general is a chunk-based format (with a hack for B-frames even);
  • Video frame can be split into several packets or several frames can be merged into a single packet, a lot like MPEG-TS inspired formats;
  • And they had mangled audio streams long before Matroska was here. Actually only some audio codecs’ data is stored as is, the rest is XORed or has permuted subpackets.

How Projects Wither and Die

March 22nd, 2014

For the last few years I have felt some disappointment with my work building up, and now I’ll try to explain why.

What kept me working on FFmpeg and later Libav?

Money? Well, I admit that it brought me ~$20000 during all those years of work and it was very helpful in my student years but it’s not that much really and wouldn’t be enough for living even in Ukraine.

Of course the main driving reason was fun of writing code and joy of being useful. I still remember being proud for a week for this commit. I still remember how it was fun (sometimes) to reverse engineer a codec and warm feeling when it’s done. I remember users thanking for the work done and asking for features.

Where did it all go?

The project matured and now the situation got different. Previously you mostly had millions of clueless users asking how to transcode something to FLV that were tiring but easy to deal with, now you have more enterprise users that use our code often without acknowledging or contributing back (in the old times Picsearch gave us a database of audio and video files in Internet that used codecs we didn’t support — that’s one of the most valuable contributions ever). But that’s not what kills the fun, “security holes” do.

With the advance of automatic fuzzing tools it’s easy to generate millions of damaged files that crash your decoder, and yet there are no tools for generating correct patches. Fixing those crashes is tedious and requires a lot of thinking (should I disable it? will it affect decoding correct files? etc.); in other words, it’s not fun at all. You have to balance between having decent code, the ability to handle corrupted files and being robust, and in order to account for all possible corner cases in the code from the very beginning you should be more paranoid than the FFmpeg leader. And somehow you cannot avoid it, you’re expected to fix it or else. This is like being on a maintenance contract but without any form of compensation: you just get a mountain of corrupt samples and “have you fixed it yet?” every week. Or you get some “security vulnerability” reports with the same effect. I repeat, this is not fun, so why should I do it for free?

There is only one exception around called VideoLAN. Those guys really show (and not only show) some care and they give back to us. Just in my case I gave them all they wanted and I could give them.

As for the rest, world domination is not my goal, I don’t see fun in maintenance and no one is paying me to do it. Why should I continue?

So I’ll try to finish whatever projects I still have around and end it all. I’ve been around for 9.5 years after all, that’s long enough.

P.S. Maybe I should move to Oljonsbyn.

A Glance at Mobiclip HD

January 27th, 2014

For no particular reason I’ve decided to look at it and the codec is quite interesting despite being of somewhat WTFy design.

First, it uses 6 buffers so motion compensation can reference any of them while decoding (except that buffer 0 is the current frame and it’s a skip block).

Second, it seems to operate on 16×16 macroblocks that can be split further into smaller subblocks or encoded as the whole. Yes, it’s essentially a quadtree. And motion coefficients seem to be read in the beginning for all macroblocks and then deltas from overall macroblock value are coded for all subblocks when they occur.

Third, it seems to have halfpel motion compensation and some very simple transform that depends on block size and number of coefficients coded. While it’s no DCT I’ve seen that it decodes (last, skip, level) triplets, unquantises them and calls transform function depending on the transform size and position of last nonzero coefficient. And I’m not completely sure but looks like some kind of spatial prediction like H.264 is invoked too.

Truly there are interesting codecs no-one cares about (including me).

Some Notes on Some Intermediate Codec Family

January 27th, 2014

A friend of mine, Mario, asked me to look at DNxHD 444. It turned out to be quite easy to support in the libavcodec decoder (at least for CID 1256, for which I have a sample) after I looked at the binary decoder. And I was curious what formats were there.

Here is the list of internal IDs supported by Avid decoder with a family they belong to, image parameters (width x height @ bitdepth) and other properties.

  • 1233: Avid_HD (1920×1080@10) interlaced (marked as debug format)
  • 1234: Avid_HD (1920×1080@10) interlaced (marked as debug format)
  • 1235: Avid_HD (1920×1080@10) progressive
  • 1236: Avid_HD (1920×1080@10) progressive (marked as debug format)
  • 1237: Avid_HD (1920×1080@8) progressive
  • 1238: Avid_HD (1920×1080@8) progressive
  • 1239: Avid_HD (1920×1080@8) interlaced (marked as debug format)
  • 1240: Avid_HD (1920×1080@8) interlaced (marked as debug format)
  • 1241: Avid_HD (1920×1080@10) interlaced
  • 1242: Avid_HD (1920×1080@8) interlaced
  • 1243: Avid_HD (1920×1080@8) interlaced
  • 1244: Avid_HD (1440×1080@8) interlaced
  • 1250: Avid_HD (1280×720@10) progressive
  • 1251: Avid_HD (1280×720@8) progressive
  • 1252: Avid_HD (1280×720@8) progressive
  • 1253: Avid_HD (1920×1080@8) progressive
  • 1254: Avid_HD (1920×1080@8) interlaced
  • 1256: DNx444 (1920×1080@10) progressive
  • 1257: DNx444 (1920×1080@10) interlaced
  • 1258: DNx100 (960×720@8) progressive
  • 1259: DNx100 (1440×1080@8) progressive
  • 1260: DNx100 (1440×1080@8) interlaced
  • 32768: AHD-DBG-1 Avid_HD (64×32@8) interlaced
  • 32769: AHD-DBG-2 Avid_HD (128×128@8) interlaced
  • 32770: AHD-DBG-3 Avid_HD (480×320@8) interlaced
  • 32771: AHD-DBG-4 Avid_HD (64×32@10) interlaced
  • 32772: AHD-DBG-5 Avid_HD (128×128@10) interlaced
  • 32773: AHD-DBG-6 Avid_HD (480×320@10) interlaced
  • 36864: AHD-DBG-3 Avid_HD (720×512@8) interlaced

If you look at this table you can see more formats than libavcodec currently supports. The unsupported formats are the debug ones, the interlaced ones and those not belonging to the Avid_HD family.

While I fully approve not having interlaced formats support, the rest can be supported (especially if samples are provided).

Sigh, too many intermediate codecs I have looked at.

Bink2: Inter Block Residue

January 18th, 2014

Inter block residue decoding is not different from intra block decoding except that DCs are expected to be in -1023…1023 range instead of 0…2047 and quantisation matrix for luma is different.

Posts about reconstruction process might follow.

Bink2: Intra Block — Chroma

January 18th, 2014

Chroma block coding is similar to luma but with some changes since there are only 4 blocks coded here.

Thus, CBP is coded as two nibbles (real CBP and VLC switch) and it does not try to reuse nibbles from last CBP in code.

There are only 4 DCs here but they are decoded the same way.

AC block decoding is completely the same.

Bink2: Intra Block — Luma

January 18th, 2014

Intra luma block in Bink2 contains the following elements: CBP, quantiser, DCs and ACs.

CBP is coded as 32-bit bitmask depending on the previous CBP value. Internally top half is coded depending on bottom one and the whole bitmask is coded in nibbles starting from LSB.
Lower half decoding depends on the control bits:

  • 11 — simply return last CBP
  • 10 — use low 16 bit from last CBP
  • 0 — decode 4 low nibbles of CBP. Initial nibble value is set to (last_cbp >> 4) & 0xF, if the next bit is 1 then don’t change it, otherwise read new value from the bitstream (4 bits of course).
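The three cases above can be sketched as follows. Everything not in the description is an assumption: the `BitReader` type and its LSB-first bit order, the function names, the `whole` output flag, and my reading that a nibble kept by a 1 bit carries over to the next nibble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit reader; LSB-first order is an assumption. */
typedef struct BitReader {
    const uint8_t *buf;
    size_t size_bits;
    size_t pos;
} BitReader;

static unsigned get_bit(BitReader *br)
{
    unsigned bit;

    assert(br->pos < br->size_bits);
    bit = (br->buf[br->pos >> 3] >> (br->pos & 7)) & 1;
    br->pos++;
    return bit;
}

static unsigned get_bits(BitReader *br, int n)
{
    unsigned v = 0;

    for (int i = 0; i < n; i++)
        v |= get_bit(br) << i;
    return v;
}

/*
 * Decode the low half of the luma CBP. *whole is set to 1 when the
 * whole previous 32-bit CBP is reused and no high half has to be
 * decoded afterwards.
 */
uint32_t decode_cbp_low(BitReader *br, uint32_t last_cbp, int *whole)
{
    uint32_t cbp = 0;
    unsigned nibble;

    *whole = 0;
    if (get_bit(br)) {
        if (get_bit(br)) {          /* 11: reuse the last CBP as is */
            *whole = 1;
            return last_cbp;
        }
        return last_cbp & 0xFFFF;   /* 10: reuse only the low 16 bits */
    }
    /* 0: decode four nibbles starting from the least significant one */
    nibble = (last_cbp >> 4) & 0xF;
    for (int i = 0; i < 4; i++) {
        if (!get_bit(br))           /* 1 keeps the nibble, 0 reads a new one */
            nibble = get_bits(br, 4);
        cbp |= (uint32_t)nibble << (i * 4);
    }
    return cbp;
}
```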

Now we can use these low 16 bits of CBP to restore high 16 bits. This is also done by nibbles and decoding depends on them (why nibbles? Because blocks are coded in groups of four).

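pat4 = (last_cbp >> 20) & 0xF;          // start from the previous high-half nibble
ref = cbp;                              // low 16 bits decoded above
for (i = 0; i < 4; i++) {
  if (!ones_count[ref & 0xF]) {
   pat4 = 0;                            // no bits set in this low nibble
  // NOTE: as written this test is always true at this point, so the
  // final else is never reached; possibly "== 1" was meant for the count
  } else if (ones_count[ref & 0xF] || getbit()) {
   pat4 = 0;
   for (bit = 1; bit < 0x10; bit <<= 1) // one flag per set bit of the low nibble
    if ((ref & bit) && getbit())
     pat4 |= bit;
  } else {
   pat4 &= ref & 0xF;                   // keep old pattern, masked by the low nibble
  }
  cbp |= pat4 << (16 + i * 4);
  ref >>= 4;
}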

Essentially it decides which set bits to copy from the first part. And the top 16 bits are not really a coded block pattern; they just tell the decoder to use an alternative set of VLC codes in AC decoding.

Quantiser is coded with static VLC (plus sign bit for nonzero value) as a difference to the previous quantiser.

Quantisation table for DC: 4 4 4 4 4 6 7 8 10 12 16 24 32 48 64 128

16 DCs (coded with the same scheme as motion vector described in the previous post)

16 blocks of AC coefficients coded in groups of four. Each AC block is coded as (value, skip) pairs where the value is coded with a static VLC that gives either a small level (0-3) or the number of bits to read for a raw value. Skips have one peculiarity too: the value coded with a static VLC defines either a skip (for values 0-10), an escape value (when you get it you need to read 6 bits with the real skip value), an end-of-block value, or a signal that the following 8 AC coefficients won’t have skip values coded after them.

Scan order is strange, here are the first 15 indices from it: 0, 2, 1, 8, 9, 17, 10, 16, 24, 3, 18, 25, 32, 11, 33