On Variable-Length Codes Names « Kostya's Boring Codec World

On Variable-Length Codes Names

After seeing the recent commit in Libav this rant simply wrote itself.

People, Solomon Wolf Golomb was a genius whose work influenced various areas of science (I’ve read about his work in Martin Gardner’s books plus some of his papers) but please stop attributing to him stuff he did not invent. I’m talking about universal variable-length codes for integers (or Xine for short).

He has introduced (in late sixties) a specific kind of Xine for optimal coding of integers with certain distributions (I’ve recommended to read the paper introducing them before, it’s awesome and I wish more papers were written like that one). Those codes have a parameter k that is distribution parameter and also it’s used to split code into two parts—an unary prefix coding N/k and log₂k bits coding N%k (for some values it’s rounded down, for another it’s rounded up). Later Robert Rice had a similar idea and independently introduced codes that were Golomb codes with parameter 2^k (and thus they’re often called Rice codes and they’re used more because they are easier to manipulate on computer). And that’s all—there are no other Golomb codes.

Yet thanks to ITU H.264 standard (aka ~~GNU~~MPEG-4 AVC) we have exp-Golomb codes and interleaved exp-Golomb codes. I don’t know who decided on the name but it’s misleading and wrong (but because it’s in the standard people insist on using it; that also shows how much people designing codecs know about general compression methods). Maybe if some other Xines were rediscovered they’d go under equally ridiculous names like geo(metric)-Golomb or norm(al)-Golomb or quad-Golomb or recursive Golomb codes (because people have never heard of Levenstein coding).

Again, back in seventies Peter Elias proposed a scheme for coding: let’s call unary code (i.e. the one that codes an integer as a series on one value terminated with the other value like 000001 or 1111110) alpha code and fixed bit representation of an integer beta code then we can arrive to gamma code that combines both.

So “exp-Golomb” code is really Elias gamma code? NOPE! Like with TV interlacing came first and actual Elias gamma code is what is incorrectly called “interleaved exp-Golomb” (i.e. first you have flag bit telling whether the code is over or there are more bits left, then data bit, then flag bit again, rinse, repeat). And progressive version is Elias gamma prime has unary prefix i.e. alpha code for the length of the second part concatenated with beta code for that part (I’ve rechecked the original paper freely available at sci-hub—because the only time I paid for IEEE papers access was when my scientific supervisor sent me with money to pay for IEEE membership renewal for our chair). Then you can construct delta code that codes integer value in three parts (actual bits, their length and length of the length part) and jump to omega codes that code mostly lengths of the following length part (very meta!).

So there’s another thing to dislike in H.264 standard beside all interlaced modes and scalable and multi-view coding, it’s forcing a wrong terminology on the world (feel free to correct me if there were earlier uses). The same way there are confusions between arithmetic and range coding, various binary coders are not free from it too. But that’s a rant for another day.

This entry was posted on Tuesday, May 31st, 2016 at 2:28 pm and is filed under Useless Rants. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

3 Responses to “On Variable-Length Codes Names”

sasshka says:

May 31, 2016 at 3:01 pm

This is great post, thank you. VLC/golomb/H.264 relationship is clearer to me now
On H.264 Coding Schemes Names « Kostya's Wild Codec World says:

June 20, 2016 at 12:13 pm

[…] Kostya's Multimedia Research « On Variable-Length Codes Names […]
A Rant on Actimagine VX Codec « Kostya's Wild Codec World says:

July 2, 2016 at 7:13 am

[…] hand-held consoles. As for the second partâ€”it uses Elias Gamma’ codes (like H.264 doesâ€”under different name) and suspiciously similar spatial prediction. Obviously it also differs a lot because it’s […]