On H.264 Coding Schemes Names

Continuing the theme set by the previous post, let’s talk more about confusing names introduced by H.264. I mean CAVLC and CABAC.

CAVLC stands for Context-based Adaptive Variable Length Coding. While technically true because it employs variable-length codes and the code set is selected based on context it’s nothing special (and I’ve not spotted anything there that would make it “adaptive”). Again, it’s a trivial thing less exercised before because they had less ROM for codebooks. The idea of “let’s select codebook depending on top and/or left decoded values” it too trivial to get an own name IMO.

CABAC stands for Context-based Adaptive Binary Arithmetic Coding and the name is partly stupid and partly misleading. But before I explain why I want to present some history and terminology.

Arithmetic coding was developed in late sixties to early seventies but mostly known by work of Rissanen and Langdon that resulted in many IBM patents. The idea is that you can assign probabilities to various symbols, send them to the coder and the coding result is a long fraction belonging to the range obtained by multiplying the ranges in sequence. I.e. if we have probabilities for A, B and C as ranges [0; 1/3), [1/3; 2/3) and [2/3; 1) then AB is coded in [1/9; 2/9) range and BA is coded in [1/3; 4/9) range. It’s the ideal coding method since it codes probabilities in the minimum possible amount of bits (unless you remember it’s real world and we don’t have infinite-precision arithmetic; still, the losses are very small and there’s no better coding method).

And in 1979 G.*.*. Martin (no, not the writer known for Tuf Voyaging) introduced range coding. Which is absolutely the same thing except that (de)coder maintains low and range values instead of low and high values in a conventional arithmetic coder (hence the name). Since it was kinda not covered by patents it got more popularity over the years.

And because dealing with arbitrary probabilities usually involves division by an arbitrary integer (and maybe increased coder precision) the further improvements were for sacrificing efficiency for speed until it boiled down to coding just two symbols and creating more elaborate models to code input that takes more than one bit. Arithmetic mode in JPEG seems to be simply feeding bits from Huffman codes and such to Q-coder (patented by IBM and thus extremely popular in the wild) that squeezes a bit more entropy out of them. Then you create an advanced version (MQ-coder) and push it into JPEG-2000 until binary coding is popular in image and video coding.

So, CABAC is:

  1. Context-based — yes, static coding would be a tad more effective than Huffman coding applied to bits (hint: it gives no savings). The problem that it’s the first step of the classical scheme: modelling and providing probability to the entropy coder;
  2. Adaptive — see above;
  3. Binary — true (just remember it codes not bits but most and least probable symbols);
  4. Arithmetic — actually it uses range coding;
  5. Coding — nothing to argue with here.

In general, the naming is a lot like USSR which was hardly a union, probably not soviet (whatever that word means—literal meaning is “belonging to the councils”), republics were just provinces (or local despoties) but it was more or less socialistic (to the point its ideology can be called international socialism and it was founded by SDAPR(B) too).

And I’d like to point out that CABAC should refer to the whole process of binarisation+context selection plus coding the result, not just the exact implementation used in ITU H.264 and ITU H.EVC (even if it’s called CBAC in AVS and “we have completely different coding” in VPx). And if you want an example of context-based adaptive binary non-arithmetic coding look at ELS used in G2M2 (and if you drop binary coding then you have examples in every other advanced lossless audio codec).

Comments are closed.