On Variable-Length Codes Names

After seeing the recent commit in Libav this rant simply wrote itself.

People, Solomon Wolf Golomb was a genius whose work influenced various areas of science (I’ve read about his work in Martin Gardner’s books plus some of his papers) but please stop attributing to him stuff he did not invent. I’m talking about universal variable-length codes for integers (or Xine for short).

He has introduced (in late sixties) a specific kind of Xine for optimal coding of integers with certain distributions (I’ve recommended to read the paper introducing them before, it’s awesome and I wish more papers were written like that one). Those codes have a parameter k that is distribution parameter and also it’s used to split code into two parts—an unary prefix coding N/k and log2k bits coding N%k (for some values it’s rounded down, for another it’s rounded up). Later Robert Rice had a similar idea and independently introduced codes that were Golomb codes with parameter 2^k (and thus they’re often called Rice codes and they’re used more because they are easier to manipulate on computer). And that’s all—there are no other Golomb codes.

Yet thanks to ITU H.264 standard (aka GNUMPEG-4 AVC) we have exp-Golomb codes and interleaved exp-Golomb codes. I don’t know who decided on the name but it’s misleading and wrong (but because it’s in the standard people insist on using it; that also shows how much people designing codecs know about general compression methods). Maybe if some other Xines were rediscovered they’d go under equally ridiculous names like geo(metric)-Golomb or norm(al)-Golomb or quad-Golomb or recursive Golomb codes (because people have never heard of Levenstein coding).

Again, back in seventies Peter Elias proposed a scheme for coding: let’s call unary code (i.e. the one that codes an integer as a series on one value terminated with the other value like 000001 or 1111110) alpha code and fixed bit representation of an integer beta code then we can arrive to gamma code that combines both.

So “exp-Golomb” code is really Elias gamma code? NOPE! Like with TV interlacing came first and actual Elias gamma code is what is incorrectly called “interleaved exp-Golomb” (i.e. first you have flag bit telling whether the code is over or there are more bits left, then data bit, then flag bit again, rinse, repeat). And progressive version is Elias gamma prime has unary prefix i.e. alpha code for the length of the second part concatenated with beta code for that part (I’ve rechecked the original paper freely available at sci-hub—because the only time I paid for IEEE papers access was when my scientific supervisor sent me with money to pay for IEEE membership renewal for our chair). Then you can construct delta code that codes integer value in three parts (actual bits, their length and length of the length part) and jump to omega codes that code mostly lengths of the following length part (very meta!).

So there’s another thing to dislike in H.264 standard beside all interlaced modes and scalable and multi-view coding, it’s forcing a wrong terminology on the world (feel free to correct me if there were earlier uses). The same way there are confusions between arithmetic and range coding, various binary coders are not free from it too. But that’s a rant for another day.

3 Responses to “On Variable-Length Codes Names”

  1. sasshka says:

    This is great post, thank you. VLC/golomb/H.264 relationship is clearer to me now

  2. […] Kostya's Multimedia Research « On Variable-Length Codes Names […]

  3. […] hand-held consoles. As for the second part—it uses Elias Gamma’ codes (like H.264 does—under different name) and suspiciously similar spatial prediction. Obviously it also differs a lot because it’s […]