The overall idea behind this codec is simple: a frame is split into cells of variable size (the patent says “a roughly regular grid of cells”) using a binary tree, and each cell can then be coded either in intra mode (differences to the previous line) or in inter mode (differences to some region in the previous frame). Coding is done by splitting a cell into 4×4, 4×8, 8×8 or 8×4 blocks and using one or two of 21 codebooks to code pairs of differences (with some tricks to compress small differences and zero runs even further).
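To make the intra/inter distinction concrete, here is a minimal sketch of how a block could be rebuilt from already-decoded differences. Everything in it is an assumption made for illustration: the function names, the `top_line` argument (the line just above the block), the `offset` pair standing in for "some region in the previous frame", and the absence of any clamping or codebook lookup.

```python
def reconstruct_intra(top_line, deltas):
    """Intra: every line is the previous line plus its differences."""
    rows = []
    prev = list(top_line)
    for row_deltas in deltas:
        cur = [p + d for p, d in zip(prev, row_deltas)]
        rows.append(cur)
        prev = cur  # the line just rebuilt predicts the next one
    return rows


def reconstruct_inter(ref_frame, x, y, offset, deltas):
    """Inter: every line is a line of the reference region plus differences."""
    off_x, off_y = offset  # hypothetical displacement into the previous frame
    rows = []
    for dy, row_deltas in enumerate(deltas):
        start = x + off_x
        src = ref_frame[y + off_y + dy][start:start + len(row_deltas)]
        rows.append([p + d for p, d in zip(src, row_deltas)])
    return rows
```

In the intra case an error in one line propagates down the block, since each line feeds the next; in the inter case every line depends only on the previous frame.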
The patent describes thirteen different modes; the decoders I know about support only some of them:
- mode 0—code 4×4 blocks using a single codebook;
- mode 1—code 4×4 blocks using two different codebooks for even and odd lines;
- mode 2—code 4×4 blocks using two different codebooks but the second one is used only for the second line (no known decoder supports that mode);
- mode 3—code 4×8 block using a single codebook by coding differences to the even lines and interpolating the odd ones;
- mode 4—the same as mode 3 but with two codebooks;
- mode 5—very similar to mode 3 but with a possibility to add a correction to the interpolated lines (since it involves writing single bits that no other part of the codec does, no known implementation supports it);
- mode 6—like mode 5 but with two codebooks (and equally unsupported by anything known);
- mode 7—code 4×4 blocks with bit flags for telling which dyad to code (no known decoder supports this);
- mode 8—the same as mode 7 but with two codebooks (of course it’s unsupported);
- mode 9—the same as mode 7 but with the second codebook specially for the second line (equally unsupported);
- mode 10—code 8×8 blocks using a single codebook, either by duplicating pixels on even lines and interpolating odd lines (in intra mode) or by scaling each delta to a 2×2 block (in inter mode);
- mode 11—code 4×8 blocks (inter only) using a corrector repeated for each odd line;
- mode 12—mode 11 with two codebooks (only the VfW version supports it).
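The list above can be condensed into a small lookup table, which is handy when a decoder needs to reject modes it cannot handle. This is only a sketch of my reading of the list: the `Mode` type, the `support` labels and the `is_supported` helper are made-up names, and modes without an explicit note are assumed to be handled by the known decoders.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    block: str      # block size as given in the list above
    codebooks: int  # one or two codebooks
    support: str    # "known", "vfw_only" or "none" (labels are my own)


MODES = {
    0: Mode("4x4", 1, "known"),
    1: Mode("4x4", 2, "known"),
    2: Mode("4x4", 2, "none"),
    3: Mode("4x8", 1, "known"),
    4: Mode("4x8", 2, "known"),
    5: Mode("4x8", 1, "none"),
    6: Mode("4x8", 2, "none"),
    7: Mode("4x4", 1, "none"),
    8: Mode("4x4", 2, "none"),
    9: Mode("4x4", 2, "none"),
    10: Mode("8x8", 1, "known"),
    11: Mode("4x8", 1, "known"),     # inter only
    12: Mode("4x8", 2, "vfw_only"),  # inter only, like mode 11
}


def is_supported(mode, vfw=False):
    """Tell whether a decoder of the given flavour would accept the mode."""
    support = MODES[mode].support
    return support == "known" or (vfw and support == "vfw_only")
```

For example, `is_supported(12)` is false for a QuickTime-like decoder while `is_supported(12, vfw=True)` is true.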
Considering the internal implementation details (e.g. whether arrays are used for opcode handling), I’d say that the QuickTime and XAnim versions of the decoder are based on the same porting kit code supplied by Intel, while the Video for Windows version uses a different codebase (it’s not just the presence of an encoder and mode 12 support, it’s also the way many tables are generated at runtime while they are static in the other decoders, the lack of opcode tables and other minor things).
But before we start to code cells we need to perform the initial frame splitting and the next post should be about that.