Indeo 3: codebooks

February 4th, 2023

As you probably remember, Indeo 3 has 21 codebooks. In theory you’d expect them to correspond to progressively coarser quantisers; in reality it’s not that easy. For starters, codebooks 8-15 trigger requantisation of the reference, i.e. in intra mode the top line used for prediction is replaced with coarser values. Yes, it really modifies previously decoded data. And for inter mode it does the same on the previous frame for the first line of the reference block. I’ve decided to enable codebooks 8-15 only for intra mode and not even attempt to use codebooks 16-20 at all. So, what can I achieve with those?

I’ve started experimenting with rate control so I encoded various kinds of samples (albeit small and short) and here are the results:

  • codebook set 0-7 and 8-15 give about the same frame sizes (i.e. it does not matter if you take e.g. codebook 2 or 10);
  • an average intra frame size decreases with codebook number but with inter frames some codebooks result in larger frames (sometimes codebook 2 resulted in larger P-frames than any other codebook except codebook 6; in another case codebook 5 gave the smallest frames);
  • not forcing a codebook noticeably improves compression of P-frames compared to always using codebook 0 and has almost no effect on I-frames;
  • I-frame to P-frame size ratio varies greatly depending on the content: for realistic content with a lot of changes it is about 1:1, for videos with low motion and few changes it can get to 1:3 or even more.

Maybe the compression ratio can be improved by fiddling with the (completely arbitrary) thresholds I use for some decisions (e.g. if the cell should be coded or marked as skipped). I’ve made them options so all zero people who want to play with it should be able to do that.

So far I think I’ll implement rate control in a simple manner: all frames will be treated as potentially of equal size, the codebook number will be adjusted depending on the expected and obtained frame sizes, and if a frame overshoots I’ll try to re-encode it with a neighbouring codebook (as this may change frame size drastically).
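
A minimal sketch of that plan, under stated assumptions: `encode_frame` is a hypothetical callback returning the coded frame for a given codebook, the codebook range 0-7, the single retry and the "half the budget" threshold are all placeholders rather than anything decided above.

```python
def encode_with_rate_control(encode_frame, frame, target_size, codebook, max_codebook=7):
    """Encode one frame, retrying once with a neighbouring codebook on overshoot.

    encode_frame(frame, codebook) -> bytes is a hypothetical callback.
    Returns (coded data, codebook to start the next frame with).
    """
    data = encode_frame(frame, codebook)
    if len(data) > target_size and codebook < max_codebook:
        # Overshoot: try the neighbouring codebook and keep it only if it
        # really helps, since a higher number does not guarantee a smaller frame.
        retry = encode_frame(frame, codebook + 1)
        if len(retry) < len(data):
            data, codebook = retry, codebook + 1
    elif len(data) < target_size // 2 and codebook > 0:
        # Far below the budget (arbitrary threshold): go finer for the next frame.
        codebook -= 1
    return data, codebook
```

Keeping the retry result only when it is actually smaller matters because, as noted in the list above, P-frame size is not monotonic in the codebook number.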

I’ll write about the results when I have them.

So, are video codecs really done?

February 3rd, 2023

Yesterday Derek’s talk at Demuxed reached me for about the fourth time and I was asked for my opinion on it as well. I can take the hint (eventually), so here’s what I think.

Unlike Derek I’m a major nobody with some interest in how codecs work, to the point that I’m not afraid to look at their binary specification and sometimes even implement a decoder. Anyway, I’ll try to give a short summary of the points he presents and what I think about them.

Spoils of war

January 30th, 2023

Three hundred forty-one days ago russia started the full-scale invasion of Ukraine. Recently I’ve stopped mentioning the war in my posts paradoxically for a good reason: initially it was unclear how it would go, now it’s apparent that russia is going to lose. Of course it still has some allies and potential left (just yesterday it shelled Kherson and launched a missile at my home city among other things) but it’s clear that it cannot win, especially when the other countries realized that and have started to help Ukraine. So today I want to rant about why there’s so little help and why it’s so late.

The verb “to spoil” has two major meanings: to go (or make something) bad, rotten; the second meaning is to rob or pillage. The nouns derived from it may also have two meanings, so while “spoils of war” usually means war trophies (or marauding at the battle line), you can also interpret it as things that became rotten because of war. Somehow both of these meanings apply to the current situation.

There were three major wars in the XX century: the Great War (1914-1918), the Second World War (1939-1945) and the Cold War (1945-1991). Some may argue that the last one was not a proper military conflict as there were mostly proxy conflicts like the Korean War or the Vietnam War but it involved a good deal of the world and the outcome was the same as with the other world wars—dissolution of the empires. WWI put an end to the Austro-Hungarian and russian empires, WWII was the de facto end of the British empire, the Italian empire and the Japanese Co-prosperity empire. The Cold War ended with the dissolution of the russian empire (again) and the Nominally Federal Republic of Yugoslavia. The current war (I called it WWIII for a reason) may end up with the russian empire dissolving for good—and if it takes the current Chinese empire along with it I shan’t be sad either.

Probably the only thing World War I taught is that some ways of fighting the war (like chemical weaponry) are atrocious and should be banned (a cynic in me says the other countries agreed mostly because it was too ineffective). The aftermath of World War II was that Germany should not become strong again while other countries should do whatever they like. Cold War had been conducted taking into account the fact that the major players could destroy each other so they should not get into direct conflict or give the other party the reason to nuke them (indirect influence or military aid is fine though). Another outcome was the reinforcement of Westphalian sovereignty (i.e. the country can do whatever it likes unless it invades another country—especially if it has nukes).

After 1991 a lot of countries relaxed and decided that everything will be fine and they should not worry about anything ever again. Unfortunately if you don’t keep freedoms in check and reinforce them from time to time you have a good chance of losing them. That’s what almost happened to Ukraine in the 2010s, that’s what happened to Hungary, that’s what happened to Turkey (again) and so on. In other countries politicians became spineless—everything is good so we don’t need even a competent leader, a mediocre one who does not screw up much would do as well.

As the result we have countries with authoritarian regimes that can do whatever they like to themselves and other countries too timid to interfere—so when a big bully comes nobody opposes him. Look what happened when in 2008 russia invaded Georgia—USA “rebooted” the diplomatic relations and EU investigative mission did its best to ignore russian actions that started the war. In 2014 the same sort of people urged Ukraine to find a peaceful solution with its attacker as well. So no wonder that in 2022 when the West knew that russia will attack soon all they offered was evacuation of the government and token weapons good only for guerilla warfare. It took long months of losses and suffering for Ukraine to prove to the world that there is no reason to fear russia or to listen to its words. They made puppet referenda to declare Ukrainian territories as their own (including those they didn’t control) so they could “protect the integrity of their territory with nuclear weapons”—then Ukraine kicked them away from half of Kherson region and nothing happened. They always threaten to destroy “control centres” in case something happens but all they can really do is launch massive attacks to hurt civilians.

Let’s look at some countries to see how they degraded in the last three decades.

First is the USA of course. It has never fought wars on its own turf after the Civil War but it sent troops to various conflicts rather regularly. 9/11 was the event that made them start the (seemingly permanent) War on Terror™. That’s why American forces are skilled and well-equipped but at the same time war looks like an abstract thing to most of the population, so a lot of politicians are eager to support russia and spread its agenda not for its money but because the current government is opposed to it.

The next will be Germany. Since a strong Germany is met with suspicion that it’ll start a new world war, generations of German politicians served the interests of any other country but their own. The later generation served russian interests—just look at Gazprom Schröder and Angela “Germany can’t get rid of dependency from russian gas” Merkel. With this option being no longer acceptable, the current chancellor seems to have switched to serving China instead (see the recent scandal with selling Hamburg port to them). Equally, German military forces are a laughing matter: remember how a good deal of the munitions from their reserves they donated to Ukraine turned out to be past their due date or defective? remember how they increased military spending last March to an unprecedented amount and failed to spend that money on anything? remember the recent performance of Puma IFVs? Sometimes I think the current German stance about giving Ukraine its military technologies is caused not by the notion “we can’t have Nazism so let’s enjoy russia practising it like a good student” but rather by the potential fear that their technologies may not be working at all. I’d also name the other factor responsible for the current situation: not performing a lustration and banning socialism after reuniting with GDR (Ukraine paid dearly for the same mistake).

And speaking about no lustration or banning socialism, we have Austria. After 1945 this homeland of putin’s spiritual father pretended to be a victim and stayed “neutral”. While Germany had trials for Nazis, Austria let its own live in peace for the rest of their lives. No wonder that high Austrian officials could be bought by russians (well, I hope they were bought and not simply shared the same ideology) to the point that Austrian state security worked directly in russian interests. At least they sometimes correct their mistakes.

Another “neutral” state would be Switzerland. On the one hand they try to avoid direct involvement in all conflicts, hence their law forbidding the transfer of their military equipment to the fighting parties. On the other hand they are not above profiting from trade with various parties (and thanks to the lax export control their military technologies end up in russia). I also joke that they’ll readily recall all their military stuff as soon as the country possessing it gets involved in a conflict. But a decade or two ago their banks lost the reputation of safe havens for various criminals (thanks to the pressure from USA and data leaks), so probably their neutrality will not remain the same for long either.

Continuing the streak of “neutral” countries, we have Israel. Its neutrality is based on the country being the Jewish state so they’d rather not get into conflict with any other country in fear of local Jews being persecuted (of course this does not apply to the nearby countries that try to destroy Israel already). But the current prime minister decided that he’d rather be friends with russia and thus does everything to prevent Ukrainian people even of Jewish origin from taking refuge in the country (despite this being one of their original goals and duties). Considering how long he has been in power it is no surprise. I’ve read that thanks to his actions Israel loses support of other countries like USA. After all, if they don’t respect their own people why should anybody else respect them?

And finally for something mixed, namely France. Back in the day Mark Twain wrote a chapter titled French and the Comanches that did not make it into his A Tramp Abroad book. There he (half-jokingly) argues that French as a nation stand below Comanches—because the latter have not committed as many atrocities and never were as inventive at them either. The same applies to modern russia for the same reason—you’d not expect underdeveloped nations to retrofit anti-air missiles for S-300 system to send at the ground targets (they don’t really care what it hits, be it on the enemy or their own territory—and those missiles can’t be precise on ground targets in principle so it’s purely terrorism and not a warfare). I find a lot of similarities between France and russia in terms of their imperial politics (the same attitude to foreign languages, the same dedicated role of the capital compared to all other cities, even African de facto colonies are the same!). But I’ll leave this rant to another time and now I’ll just say that maybe because of this class solidarity a lot of French companies feel well on russian market and have no intent of leaving it soon. And of course the current president managed to create a new verb—“to macron”, meaning expressing a deep concern without doing anything substantial (of course various bureaucratic organisations like United Nations have been practising it for ages but his behaviour was worth commending).

War is a terrible thing that I don’t wish on anybody besides those who actively call it onto others. But sometimes it’s the only harsh enough measure to make people (and countries) wake up from delusions and start doing something. But as you can see, some still learn nothing from it.

Indeo 3: cell coding

January 30th, 2023

So we partitioned out the frame and now have to code the cell data. How to pick the best parameters in this case?

The patent suggests calculating vertical and horizontal differences (i.e. differences between top-bottom and left-right neighbours) and selecting one of the modes depending on how large they are. Codebook selection is not covered at all. The reference encoder calculates those differences and uses them both to set the cell coding mode and to select the codebook. I.e. if both differences are large use mode 0 (fine-grained coding), if only one difference is large use mode 3 or 11, otherwise use mode 10. And a ratio of the differences is clipped, multiplied by a magic factor, then by a rate control factor and used as an index in a special magic table to select the codebook.
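
The mode decision described above could be sketched like this. The threshold value, the use of average absolute differences, and picking mode 3 for intra versus mode 11 for inter when only one difference is large are assumptions made for illustration; the magic-table codebook lookup is deliberately left out.

```python
def cell_differences(cell):
    """Average absolute vertical and horizontal neighbour differences of a cell.

    cell is a list of equally long rows of pixel values.
    """
    h, w = len(cell), len(cell[0])
    vert = sum(abs(cell[y][x] - cell[y - 1][x]) for y in range(1, h) for x in range(w))
    horiz = sum(abs(cell[y][x] - cell[y][x - 1]) for y in range(h) for x in range(1, w))
    return vert / max(1, (h - 1) * w), horiz / max(1, h * (w - 1))


def select_mode(vert_diff, horiz_diff, is_inter, threshold=16):
    """Pick a cell coding mode from the two differences (threshold is made up)."""
    v_large = vert_diff > threshold
    h_large = horiz_diff > threshold
    if v_large and h_large:
        return 0                        # fine-grained 4x4 coding
    if v_large or h_large:
        return 11 if is_inter else 3    # assumption: 11 for inter, 3 for intra
    return 10                           # smooth cell, 8x8 blocks
```

For example, a flat cell gives zero for both differences and ends up in mode 10, while a checkerboard pattern makes both large and selects mode 0.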

Since my goal is to learn something new instead of simply replicating something existing, I took a completely different approach (that should contain less magic). Mode selection is done by comparing differences and amending the result if I decide to use two codebooks. I used the fact that the first eight codebooks mostly have differences of the form kN+1 and the next eight codebooks have differences of the form kM. So for each codebook I simply count how many delta values are represented best by its formula and select the best-fitting one. I also calculate this separately for the even and odd lines (the histograms can be merged later to give total statistics) so I can select the appropriate codebook or codebook pair for the coding mode. Maybe I’ll have to adjust the scheme for the rate control but it’ll happen later. Side note: Indeo 3 specifies a per-frame set of 16 codebook pairs that all cells should use and a global codebook index offset so single-codebook modes may use the additional 5 codebooks; the set seems to be static and has a regular structure, and I’m not sure that the global codebook index offset is ever used.
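
A rough sketch of that counting scheme, under heavy assumptions: the step sizes in `STEPS` and the number of entries per codebook (`MAX_K`) are invented for the example and are not the real Indeo 3 tables; zero deltas are skipped on the assumption that every codebook can represent them.

```python
STEPS = [2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical N (codebooks 0-7) and M (8-15)
MAX_K = 4                         # hypothetical number of steps per codebook


def fit_error(delta, codebook):
    """Distance from |delta| to the nearest value k*step+offset the codebook has."""
    step = STEPS[codebook % 8]
    offset = 1 if codebook < 8 else 0   # kN+1 for 0-7, kM for 8-15
    return min(abs(abs(delta) - (k * step + offset)) for k in range(MAX_K + 1))


def histogram(deltas):
    """For each codebook count the deltas it represents best (ties count for all)."""
    votes = [0] * 16
    for d in deltas:
        if d == 0:
            continue
        errors = [fit_error(d, cb) for cb in range(16)]
        best = min(errors)
        for cb, err in enumerate(errors):
            if err == best:
                votes[cb] += 1
    return votes


def select_codebooks(lines):
    """Return best codebooks for even lines, odd lines and the whole cell."""
    even = histogram(d for row in lines[0::2] for d in row)
    odd = histogram(d for row in lines[1::2] for d in row)
    total = [a + b for a, b in zip(even, odd)]   # merged statistics

    def pick(votes):
        # max() returns the first maximum, so ties go to the lowest index.
        return max(range(16), key=lambda cb: votes[cb])

    return pick(even), pick(odd), pick(total)
```

The even/odd pair would serve the two-codebook modes and the merged result the single-codebook ones.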

That’s it. The rest of the things should be rather trivial: I’ve written how to perform motion search before, rate/quality control has never been great in the original codec (maybe I’ll report how I did it when I get to it), zero run compression is nothing special either. There’s not much to write until I fix some bugs, improve compression, introduce rate control and validate it against the reference decoder. And that will take a long time…

Indeo 3: splitting the frame

January 28th, 2023

As mentioned in the previous post, Indeo 3 splits the frame into cells using binary trees and they’re coded using one of several possible modes. In reality it’s more complex: there’s a primary tree that splits the frame into regions and tells how to code them (intra or inter) and those regions themselves can be split using another binary tree to tell which coding method to use (or to skip decoding it entirely). See, it had tree coding, prediction units and coding units two decades before H.265! And slices as well: it divides data into strips 160 pixels wide too.

Splitting the frame optimally is a practically impossible task (because of its combinatorial complexity). In reality though it’s much simpler: first we split the plane into 160-pixel wide (or 40-pixel wide for chroma) strips, then split them along the largest dimension until we get cells of maximum acceptable size (which seems to be 767 pixels but the encoder seems to handle up to 2048 pixels in a coded cell). Then it’s up to the secondary cell coding.
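
The primary splitting can be sketched as follows. Treating the 767-pixel limit as a cell area and rounding every cut to a multiple of 4 pixels are assumptions of this sketch, not something stated above.

```python
def split_plane(width, height, strip_width=160, max_area=767):
    """Split a plane into (x, y, w, h) cells no larger than max_area pixels."""
    cells = []
    for x0 in range(0, width, strip_width):
        # Each strip is split independently of the others.
        stack = [(x0, 0, min(strip_width, width - x0), height)]
        while stack:
            x, y, w, h = stack.pop()
            if w * h <= max_area or max(w, h) <= 4:
                cells.append((x, y, w, h))
            elif w >= h:
                # Split along the larger dimension; the cut is rounded up
                # to 4 pixels (alignment is an assumption of this sketch).
                cut = (w // 2 + 3) & ~3
                stack.append((x + cut, y, w - cut, h))
                stack.append((x, y, cut, h))
            else:
                cut = (h // 2 + 3) & ~3
                stack.append((x, y + cut, w, h - cut))
                stack.append((x, y, w, cut))
    return cells
```

For a chroma plane the same function would be called with `strip_width=40`.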

From what I could gather, the reference encoder also tries to split secondary cells if they’re above the size limit; it uses the same limit value as for the primary cells even though the two could be set separately.

Since my goal is to learn something new instead of re-creating something existing, I use a different approach: initial mode is selected by the relation between horizontal and vertical differences (if both are too high I try to split the cell and try again). Similarly for inter mode I first try to see whether the cell can be coded as inter (and if splitting it will make at least one of the sub-cells code as inter) and if not then I resort to intra coding.

There is probably a better way than brute force to find out the optimal splitting but for lack of it a simple heuristic should do.

Cell coding mode and codebook selection is a topic best left for the next time.

Indeo 3 overview

January 27th, 2023

The overall idea behind this codec is simple: a frame is split into cells of variable size (the patent says “a roughly regular grid of cells”) using a binary tree, each cell can then either be coded in intra mode (differences to the previous line) or inter mode (differences to some region in the previous frame). Coding is done by splitting the cell into 4×4, 4×8, 8×8 or 8×4 blocks and using one or two of 21 codebooks to code pairs of differences (with some tricks to compress small differences and zero runs even further).

The patent describes thirteen different modes, the decoders I know about support only some of those:

  • mode 0—code 4×4 blocks using a single codebook;
  • mode 1—code 4×4 blocks using two different codebooks for even and odd lines;
  • mode 2—code 4×4 blocks using two different codebooks but the second one is used only for the second line (no known decoder supports that mode);
  • mode 3—code 4×8 block using a single codebook by coding differences to the even lines and interpolating the odd ones;
  • mode 4—the same as mode 3 but with two codebooks;
  • mode 5—very similar to mode 3 but with a possibility to add a correction to the interpolated lines (since it involves writing single bits that no other part of the codec does, no known implementation supports it);
  • mode 6—like mode 5 but with two codebooks (and equally unsupported by anything known);
  • mode 7—code 4×4 blocks with bit flags for telling which dyad to code (no known decoder supports this);
  • mode 8—the same as mode 7 but with two codebooks (of course it’s unsupported);
  • mode 9—the same as mode 7 but with the second codebook specially for the second line (equally unsupported);
  • mode 10—code 8×8 block using a single codebook by either duplicating pixels on even lines and interpolating odd lines (for intra) or scaling each delta for 2×2 block (in inter mode);
  • mode 11—code 4×8 (inter only) block using corrector repeated for each odd line;
  • mode 12—mode 11 with two codebooks (only VfW version supports it).
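
For quick reference the list can be condensed into a lookup table. This is only my reading of the list above: the block size for modes 5-6 and 11-12 is inferred from the modes they are said to resemble, and modes without a remark are taken to be supported by the known decoders.

```python
from collections import namedtuple

# support: "yes" = no remark in the list, "none" = no known decoder,
# "vfw" = only the Video for Windows decoder
Mode = namedtuple("Mode", ["block", "codebooks", "support"])

MODES = {
    0: Mode("4x4", 1, "yes"),
    1: Mode("4x4", 2, "yes"),
    2: Mode("4x4", 2, "none"),
    3: Mode("4x8", 1, "yes"),
    4: Mode("4x8", 2, "yes"),
    5: Mode("4x8", 1, "none"),
    6: Mode("4x8", 2, "none"),
    7: Mode("4x4", 1, "none"),
    8: Mode("4x4", 2, "none"),
    9: Mode("4x4", 2, "none"),
    10: Mode("8x8", 1, "yes"),
    11: Mode("4x8", 1, "yes"),   # inter only
    12: Mode("4x8", 2, "vfw"),   # inter only, like mode 11
}


def usable_modes(vfw=False):
    """Modes an encoder may emit, optionally including the VfW-only one."""
    return sorted(m for m, info in MODES.items()
                  if info.support == "yes" or (vfw and info.support == "vfw"))
```

An encoder targeting all known decoders would restrict itself to `usable_modes()`, i.e. modes 0, 1, 3, 4, 10 and 11.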

Considering the internal implementation details (e.g. using arrays for opcode handling or not), I’d say that the QuickTime and XAnim versions of the decoder are based on the same porting kit code supplied by Intel while the Video for Windows version uses a different codebase (it’s not just an encoder being present and mode 12 support, it’s also the way many tables are generated at runtime while they are static in other decoders, not using the opcode tables and other minor things).

But before we start to code cells we need to perform the initial frame splitting and the next post should be about that.

Starting yet another useless encoder

January 26th, 2023

Even before I started to write my series of posts on FFhistory, I had another work in progress already which I’m now making public in order not to chicken out (as I did several times already). I’m talking about an Indeo 3 encoder.

Why Indeo 3 of all possible things? It’s both not your conventional DCT-based codec and it’s widespread enough to be of some limited use for me (being present in AVI, MOV and VMD containers, only Cinepak is more ubiquitous). I’m not as good as Mike Melanson but I’m willing to try my hoof at it.

The funny thing is, there’s an open-source decoder for it and even a decent description in US patent 5,386,232 from 1995 (so it’s expired already and anybody can write an encoder for it). The problem is that those two sources don’t match each other and somewhat disagree with the official binary specification (I’m pretty sure that both Indeo3 decoders were REd from the XAnim module). And Ghidra does not like the VfW binary (maybe it’ll like the version inside QT6 better) so I can’t easily refer to it either.

Anyway, I attempted and gave up writing an encoder for Indeo 3 several times because of its perceived combinatoric complexity. First you need to split frame recursively into blocks—how to select them? Then you need to select one of the coding modes (again, how?) and codebooks (same question). Trying to think of a reasonable way to implement it all made me shudder and give up until I finally read the format description and persisted enough to write at least something working (side note: I also have the same problem with TrueMotion 1 encoder which I also want to write one day, hopefully it’ll be easier now).

Also I tried to look into the encoder implementation and found it to be a bunch of magic numbers at work. I’m not joking, during initialisation it seems to set several dozen various integers and floats and use them for various coding decisions (at least what I could understand from it is that codebook selection is kinda tied to the internal quantiser parameter which is calculated depending on bitrate/quality—and various magic numbers).

So I want to document how this codec works, what differs in the different descriptions of it and how my encoder decides what to use in different situations. This should amount to another dozen of posts that nobody will read.

A quick glance at WA

January 21st, 2023

A certain Paul B. asked me to look at WA (aka WavArc) and it turned out to be rather interesting. For starters, it’s the only lossless audio archiver and not compressor I’m aware of (the difference is an ability to store multiple files in a single archive). Of course there are things like Rar or WinZip with special multimedia compression modes but they’re general-purpose archivers with content-specific compression methods and not audio-only archivers.

There are two versions of the executable: a 32-bit DOS one and a 16-bit DOS one (where all 32-bit integer and FPU operations are emulated). The latter turned out to be a showcase of what Ghidra lacks in supporting old DOS executables, so eventually I tried the 32-bit version. Even if I had to load it manually, luckily it turned out to have a PE-like header for the loader so it was no problem figuring out the segment mapping. After that it was a piece of cake.

Essentially there are three compression modes: store (mode 0), Shorten-like fixed prediction and fixed Rice codes (modes 1-4) and mode 5 with LPC prediction and residue coding using either fixed Rice codes or an arithmetic coder (with a fixed model transmitted before the residues).
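
To illustrate what Shorten-like fixed prediction with Rice codes means in general, here is a generic sketch. It uses the classic polynomial predictors of orders 0-3 and a plain unary-plus-remainder Rice code written out as a bit string; it is not WA’s actual bitstream, and nothing is claimed about how WA’s modes 1-4 map onto predictor orders or Rice parameters.

```python
def fixed_predict(history, order):
    """Polynomial predictor of order 0..3 from previously decoded samples."""
    if order == 0:
        return 0
    if order == 1:
        return history[-1]
    if order == 2:
        return 2 * history[-1] - history[-2]
    return 3 * history[-1] - 3 * history[-2] + history[-3]


def residuals(samples, order):
    """Prediction residuals, with the history before the first sample set to zero."""
    padded = [0, 0, 0] + list(samples)
    return [s - fixed_predict(padded[:i + 3], order) for i, s in enumerate(samples)]


def rice_encode(value, k):
    """Rice code of a signed value as a string of '0'/'1' characters."""
    # Fold the sign so small magnitudes of either sign get short codes.
    u = 2 * value if value >= 0 else -2 * value - 1
    quotient, remainder = u >> k, u & ((1 << k) - 1)
    bits = "1" * quotient + "0"          # unary quotient with a terminator
    if k:
        bits += format(remainder, "0{}b".format(k))
    return bits
```

On a linear ramp such as 10, 12, 14, 16, 18 the order-2 predictor leaves only zeros after the first two samples, which is exactly the kind of residue Rice codes handle cheaply.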

Overall, it’s a curious piece of software that was interesting to look at.

FFhistory: conclusion

January 20th, 2023

Now that I’ve finished remembering various developers it’s time to evaluate their impact and how things would be without some of them.

FFhistory: Paul B Mahol

January 19th, 2023

This guy appeared in 2011 from nowhere and in a sense he looks like an embodiment of the project. You can find in him the same productivity and unwillingness to meet other people as in Michael Niedermayer, the same talents to reverse engineer codecs as in many people mentioned in the post about them, the same diva behaviour as in Baptiste Coudurier, the same versatility as elenril, the same unwillingness to finish Bink2 decoder as Luca’s unwillingness to finish Opus decoder (I’m still waiting for both BTW) and the same abrasive personality as in many developers from MPlayer. In other words, a guy with strong positive and negative sides.