A failed attempt at writing a Duck TrueMotion S encoder

February 23rd, 2023

So, my attempt to write a semi-decent TrueMotion 1 encoder has failed (mostly because I’m too annoyed to continue it). Here I’ll describe how the codec works and what I’ve implemented.

In essence, Horizons Technology PVEZ is a rather simple delta-compression-based codec that codes RGB15 (or ARGB20) and RGB24 using prediction from three neighbours and Tunstall codes (I’m aware of only one other codec, CRI P256, that employs them). For convenience there are three possible fixed sets of deltas and three fixed codebooks as well. Videos from the (3DO version, IIRC) Star Control II: The Ur-Quan Masters used custom codebooks (data for cutscenes was stored in several files and one of them was the codebook specification), and later TM2X allowed per-frame custom codebooks and deltas, but nobody remembers it. The second revision of the codec (not to be confused with TrueMotion 2 though) introduced inter frames where some of the 2×4 blocks could be marked as skipped.

Initially I had no idea how to do it properly so I tried brute-forcing it with a search tree limited to a maximum of 256 nodes at each level, but as you’d expect it took about a minute to encode two frames in VGA resolution. Thus I decided to look at the codebook closer and eventually found out that it’s a prefix one (i.e. for each chain of codes its non-empty prefixes are in the codebook as well), so I can use a greedy approach: simply accumulate codes in a sequence and write the codebook entry ID when the sequence can’t be extended further (i.e. when adding the next code forms a sequence not in the codebook). Which leaves the question of deltas.
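The greedy matching can be sketched like this (the toy codebook and its entry IDs below are made up for illustration, not the real TrueMotion tables):

```rust
use std::collections::HashMap;

// Toy prefix-closed codebook: every non-empty prefix of an entry is also an
// entry. These sequences and IDs are invented for the example.
pub fn toy_codebook() -> HashMap<Vec<u8>, u16> {
    let mut cb = HashMap::new();
    cb.insert(vec![0u8],       0u16);
    cb.insert(vec![1u8],       1u16);
    cb.insert(vec![0u8, 1],    2u16);
    cb.insert(vec![0u8, 1, 1], 3u16);
    cb
}

// Accumulate codes while the sequence stays in the codebook; once the next
// code would form an unknown sequence, emit the ID of what we have so far
// and start over from that code.
pub fn greedy_encode(codes: &[u8], cb: &HashMap<Vec<u8>, u16>) -> Vec<u16> {
    let mut out = Vec::new();
    let mut cur: Vec<u8> = Vec::new();
    for &c in codes {
        cur.push(c);
        if !cb.contains_key(&cur) {
            cur.pop();
            out.push(cb[&cur]);
            cur = vec![c];
        }
    }
    if !cur.is_empty() {
        out.push(cb[&cur]);
    }
    out
}

fn main() {
    let cb = toy_codebook();
    println!("{:?}", greedy_encode(&[0, 1, 1, 0, 1, 0], &cb));
}
```

The prefix property is what makes the single pass valid: whatever sequence we have accumulated is guaranteed to have an ID, so no backtracking is needed.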

There are two kinds of deltas, both occurring in pairs: C-deltas that update the red and blue components (depending on coding parameters there may be 1–4 of them per 2×4 block of pixels) and Y-deltas that update all components of all pixels. The problem was keeping the deltas in order so they produce sane pixel values (i.e. without wraparounds), and that’s where I failed. I used the same approach as the decoders and grouped delta pairs into a single value. The trouble is that I could not keep the result from overflowing even when I tried all possible deltas, and I did not check the C-delta result on its own (as Y-deltas are added immediately after). I also made the mistake of keeping the pixel components stored separately (the deltas apparently exploit carries and borrows into the higher components). I suppose I’d have had better luck using a 32-bit pixel value (converting it to bytes only for checking the differences and such), individual deltas, and probably a trellis search over 4–8 deltas to make sure the result does not deviate much from the original pixel values, but I was annoyed enough at this point that I simply gave up. And that is before getting to the stage where I’d have to figure out how to select the delta value set (probably just calculate the deltas for the whole frame and see which set fits best), which codebook to pick, and how to control bitrate (by zeroing small deltas?).
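To illustrate the carry pitfall, here is a toy comparison of adding a delta to a whole packed RGB555 word versus per component. The real TrueMotion deltas are packed differently; this hypothetical example only demonstrates why whole-word and per-component arithmetic diverge:

```rust
// Adding a delta to the whole packed 16-bit word lets a carry from blue
// spill into green, which is the behaviour decoders rely on.
pub fn add_packed(pix: u16, delta: u16) -> u16 {
    pix.wrapping_add(delta)
}

// Per-component addition with masked 5-bit fields: no cross-component
// carries, hence results differ from the packed version.
pub fn add_per_component(pix: u16, dr: i16, dg: i16, db: i16) -> u16 {
    let r = (((pix >> 10) & 0x1F) as i16 + dr) & 0x1F;
    let g = (((pix >>  5) & 0x1F) as i16 + dg) & 0x1F;
    let b = ((pix         & 0x1F) as i16 + db) & 0x1F;
    ((r as u16) << 10) | ((g as u16) << 5) | (b as u16)
}

fn main() {
    let pix: u16 = (1 << 10) | (1 << 5) | 31; // r=1, g=1, b=31
    // adding 1 to blue: the packed version carries into green
    println!("{:#06x} vs {:#06x}",
             add_packed(pix, 1), add_per_component(pix, 0, 0, 1));
}
```

An encoder keeping the components separate (like mine did) predicts the second result while the decoder computes the first, which is exactly the mismatch described above.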

Oh well, I’m sure I’ll find something else to work on.

P.S. I’ve also tried to look at the reference encoder, but CODUCK.DLL is not merely a horrible pun but also obfuscated 16-bit code (you were supposed to pay for the encoder and use serial numbers and such, after all) that made the Ghidra decompiler commit suicide, so I gave up on it as well.

P.P.S. I might return to it one day but it seems unlikely as this codec is not that interesting or useful for me.

Indeo 3 encoder: done

February 16th, 2023

After fixing some bugs I think my encoder requires no further work (there are still things that could be improved but they are not necessary). The only annoying problem is that decoding some videos with the original VfW binary gives artefacts. Looks like this happens because of its peculiar way of generating and then handling codebooks: corrector pairs and quads are stored as a single 32-bit word and its top bit is also used to signal whether it’s a dyad or a quad. And it looks like for some values (I suspect large negative ones) the scheme does not work so well, so while e.g. the XAnim module does what is expected, here it mistakes a result for another corrector type and decodes the following bytestream in a wrong way. Of course I’m not going to deal with that annoyance and I doubt anybody will care.

Also I’ve pruned one stupid bug from my MS Video 1 encoder so it should be done as well. The third one, Cinepak, is in a laughable state (well, it encodes data but it does not do that correctly, let alone effectively); hopefully I’ll work on it later.

For now, as I see no interesting formats to support (suggestions are always welcome BTW), I’ll keep writing toy encoders that nobody will use (and hopefully return to improving Cinepak encoder eventually).

Indeo 3, the MP3 of video codecs

February 14th, 2023

I know that MPEG-4 ASP is a better-known candidate for this role, but Indeo 3 is a strong contender too.

Back in the day it was ubiquitous, patented, and as I re-implemented a decoder for it I discovered another fun similarity with MP3: checksums.

Each Indeo 3 frame has a checksum value embedded in it, calculated as the XOR of all pixel pairs. I intended to use it to validate the decoder output but after playing with it a bit I’ve given up. Some files agree on checksums; others disagree even though the output from the reference decoder is exactly the same; in yet other files the checksums are correct but byte-swapped; and one file has only zeroes for checksums. This is exactly like MP3, and just as there, Indeo 3 decoders ignore that field.
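If the checksum really is just a XOR over 16-bit pixel pairs, computing it is trivial. The little-endian pairing below is my assumption (which would also neatly explain the byte-swapped variants seen in some files):

```rust
// XOR all pixels together in pairs, treating consecutive bytes as
// little-endian 16-bit words; a trailing odd byte is ignored here.
pub fn indeo3_checksum(pixels: &[u8]) -> u16 {
    let mut sum = 0u16;
    for pair in pixels.chunks_exact(2) {
        sum ^= u16::from(pair[0]) | (u16::from(pair[1]) << 8);
    }
    sum
}

fn main() {
    println!("{:#06x}", indeo3_checksum(&[1, 2, 3, 4]));
}
```

Swapping the byte order in the `u16` construction yields the byte-swapped checksum variant, so a validating decoder would have to accept both.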

Also I’ve encountered other fun quirks. For example, one Indeo file is 160×120 but its frame header claims it’s 160×240 (yet you still have to decode it as 160×120). You’d think that’s the rule, but I know some VMD files from the Urban Runner game where the first or last frame really is double the size. Another file errors out on the first frame because of an inappropriate opcode (essentially “skip to line 2” right after “skip to line 2”) but it turns out that the VfW decoder does not check that case and simply uses that opcode as a codebook index.

At least my new decoder should be good enough to iron out the obvious bugs from the encoder and after that I shall forget about that codec again for a long time.

Revisiting MSVideo1 encoder

February 8th, 2023

Recently somebody asked me a question about my MS Video 1 encoder (not the one in NihAV though) and I’ve decided to see whether my current encoder can be improved, so I took Ghidra and went to read the binary specification.

Essentially it does what I expected: it understands quality only, from which it calculates thresholds for skip and fill blocks to be used immediately; clustering is done in the usual K-means way; and the only curious trick is that it uses luminance for that.

So I decided to use that idea to improve my own encoder. I ditched the generic median cut in favour of specially crafted clustering into two groups: I select the largest cluster axis (be it luma, red, green or blue), split the 4 or 16 pixels into two groups by whether they are above the average or not, and calculate the average of each group. This made encoding about two times faster. I’ve also fixed a bug with 8-colour blocks so now it encodes data properly (previously it would produce a distorted block). And of course I’ve finally made quality affect the encoding process (also by generating thresholds, but with a different formula: unlike the original, my encoder uses no floating-point maths anywhere).
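The two-group clustering step can be sketched as follows, with pixels as [r, g, b, luma] quads; the absolute-difference spread metric and plain integer averaging are my simplifications:

```rust
// Pick the component with the largest spread, split the pixels by whether
// they are above the average on that axis, and average each group.
pub fn split_two(pixels: &[[i32; 4]]) -> ([i32; 4], [i32; 4]) {
    let n = pixels.len() as i32;
    // per-component average
    let mut avg = [0i32; 4];
    for p in pixels {
        for c in 0..4 { avg[c] += p[c]; }
    }
    for c in 0..4 { avg[c] /= n; }
    // choose the axis with the largest sum of absolute deviations
    let mut axis = 0;
    let mut best_spread = -1;
    for c in 0..4 {
        let spread: i32 = pixels.iter().map(|p| (p[c] - avg[c]).abs()).sum();
        if spread > best_spread { best_spread = spread; axis = c; }
    }
    // split along the chosen axis and average each half
    let (mut lo, mut hi) = ([0i32; 4], [0i32; 4]);
    let (mut nlo, mut nhi) = (0i32, 0i32);
    for p in pixels {
        if p[axis] > avg[axis] {
            for c in 0..4 { hi[c] += p[c]; }
            nhi += 1;
        } else {
            for c in 0..4 { lo[c] += p[c]; }
            nlo += 1;
        }
    }
    if nlo > 0 { for c in 0..4 { lo[c] /= nlo; } }
    if nhi > 0 { for c in 0..4 { hi[c] /= nhi; } }
    (lo, hi)
}

fn main() {
    let pixels = [[0i32; 4], [0; 4], [8, 8, 8, 8], [8, 8, 8, 8]];
    println!("{:?}", split_two(&pixels));
}
```

One fixed split is of course cruder than full K-means, but for 4 or 16 pixels it is a reasonable speed/quality trade-off.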

Also I’ve added palette mode support. The idea is simple: internally I operate on pixel quads (red, green, blue, luma) so for palette mode I just need to replace an actual pixel value with the index of the most similar palette entry. For that task I reused one of the approaches from my palettiser (it should be faster than iterating over the whole palette every time). Of course the proper way would be to map the colour first so the proper distortion is calculated (because the first suitable colour may be far from perfect) but I decided not to pursue this further, even if it sometimes results in badly-coded features. It’s still not a serious encoder after all.

Now this member of the early-90s video codec ruling triumvirate should be good enough. The Cinepak encoder is still rather primitive so I’ll have to re-check it. The Indeo 3 encoder seems to produce buggy output on complex high-motion scenes (I suspect it’s related to the number of motion vectors exceeding the limit) but that will have to wait until I rewrite the decoder. And then hopefully more interesting experiments will happen.

Indeo 3 encoder: done

February 6th, 2023

I’ve done what I wanted with the encoder; it seems to work, so I declare it finished. It can encode videos that other decoders can decode, it has some adjustable options and even a semblance of rate control.

Of course I’ll return to it if I ever use it and find some bugs but for now I’ll move to other things. For instance, Indeo 3 decoder needs to be rewritten now that I understand the codec better. Also I have some ideas for improving MS Video 1 encoder. And there’s TrueMotion 1 that I wanted to take a stab at. And there are some non-encoder things as well.

There’s a lot of stuff to keep me occupied (provided that I actually get myself occupied with it in the first place).

Indeo 3: codebooks

February 4th, 2023

As you probably remember, Indeo 3 has 21 codebooks. In theory you’d expect them to correspond to progressively coarser quantisers; in reality it’s not that simple. For starters, codebooks 8-15 trigger requantisation of the reference, i.e. in intra mode the top line used for prediction is replaced with coarser values. Yes, it really modifies previously decoded data. And in inter mode it does the same on the previous frame for the first line of the reference block. I’ve decided to enable codebooks 8-15 only for intra mode and not even attempt to use codebooks 16-20 at all. So, what can I achieve with those?

I’ve started experimenting with rate control so I encoded various kinds of samples (albeit small and short) and here are the results:

  • codebook sets 0-7 and 8-15 give about the same frame sizes (i.e. it does not matter whether you take e.g. codebook 2 or 10);
  • the average intra frame size decreases with codebook number, but with inter frames some codebooks result in larger frames (sometimes codebook 2 resulted in larger P-frames than any other codebook except codebook 6; in another case codebook 5 gave the smallest frames);
  • not forcing a codebook noticeably improves compression of P-frames compared to always using codebook 0 and has almost no effect on I-frames;
  • the I-frame to P-frame size ratio varies greatly with the content: for realistic content with a lot of changes it is about 1:1, for videos with little motion and change it can get to 1:3 or even more.

Maybe the compression ratio can be improved by fiddling with the (completely arbitrary) thresholds I use for some decisions (e.g. whether the cell should be coded or marked as skipped). I’ve made them options so all zero people who want to play with it should be able to do that.

So far I think I’ll implement rate control in a simple manner: all frames will be treated as potentially equal in size, the codebook number will be adjusted depending on the expected and obtained frame sizes, and if a frame overshoots I’ll try to re-encode it with a neighbouring codebook (as this may change the frame size drastically).
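A minimal sketch of that adjustment, assuming codebooks 0-7 and a 10% tolerance (both numbers are arbitrary guesses of mine, not anything from the codec):

```rust
// Nudge the codebook index depending on how the last frame compared to the
// per-frame size budget: coarser on overshoot, finer on undershoot.
pub fn next_codebook(cur: u8, got_size: usize, want_size: usize) -> u8 {
    if got_size > want_size + want_size / 10 {
        (cur + 1).min(7) // overshoot: pick a coarser codebook
    } else if got_size + want_size / 10 < want_size {
        cur.saturating_sub(1) // undershoot: allow a finer one
    } else {
        cur // close enough, keep the current one
    }
}

fn main() {
    println!("{}", next_codebook(3, 2000, 1000));
}
```

Re-encoding an overshooting frame would then just call the encoder again with the returned index.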

I’ll write about the results when I have them.

So, are video codecs really done?

February 3rd, 2023

Yesterday Derek’s talk at Demuxed got to me for about the fourth time and I was asked for my opinion on it as well. I can take a hint (eventually), so here’s what I think.

Unlike Derek, I’m a major nobody with some interest in how codecs work, to the point that I’m not afraid to look at their binary specification and sometimes even implement a decoder. Anyway, I’ll try to give a short summary of the points he presents and what I think about them.

Spoils of war

January 30th, 2023

Three hundred forty-one days ago russia started the full-scale invasion of Ukraine. Recently I’ve stopped mentioning the war in my posts, paradoxically for a good reason: initially it was unclear how it would go, now it’s apparent that russia is going to lose. Of course it still has some allies and potential left (just yesterday it shelled Kherson and launched a missile at my home city, among other things) but it’s clear that it cannot win, especially now that the other countries have realized that and started to help Ukraine. So today I want to rant about why there’s so little help and why it’s so late.

The verb “to spoil” has two major meanings: to go (or make something) bad, rotten; the second is to rob or pillage. The nouns derived from it may also have two meanings, so while “spoils of war” usually means war trophies (or loot from the battle line), you can also interpret it as things that became rotten because of war. Somehow both of these meanings apply to the current situation.

There were three major wars in the XX century: the Great War (1914-1918), the Second World War (1939-1945) and the Cold War (1945-1991). Some may argue that the last one was not a proper military conflict as it consisted mostly of proxy conflicts like the Korean War or the Vietnam War, but it involved a good deal of the world and the outcome was the same as with the other world wars—the dissolution of empires. WWI put an end to the Austro-Hungarian and russian empires, WWII was the de facto end of the British empire, the Italian empire and the Japanese Co-prosperity empire. The Cold War ended with the dissolution of the russian empire (again) and the Nominally Federal Republic of Yugoslavia. The current war (I call it WWIII for a reason) may end with the russian empire dissolving for good—and if it takes the current Chinese empire along with it I shan’t be sad either.

Probably the only thing World War I taught is that some ways of fighting a war (like chemical weapons) are atrocious and should be banned (the cynic in me says the other countries agreed mostly because they were too ineffective). The lesson of World War II was that Germany should not become strong again while other countries may do whatever they like. The Cold War was conducted with the understanding that the major players could destroy each other, so they should not get into direct conflict or give the other party a reason to nuke them (indirect influence or military aid is fine though). Another outcome was the reinforcement of Westphalian sovereignty (i.e. a country can do whatever it likes as long as it does not invade another country—especially if it has nukes).

After 1991 a lot of countries relaxed and decided that everything would be fine and they should not worry about anything ever again. Unfortunately, if you don’t keep freedoms in check and reinforce them from time to time you have a good chance of losing them. That’s what almost happened to Ukraine in the 2010s, that’s what happened to Hungary, that’s what happened to Turkey (again) and so on. In other countries politicians became spineless—everything is good, so we don’t even need a competent leader; a mediocre one who does not screw up much will do.

As a result we have countries with authoritarian regimes that can do whatever they like to themselves and other countries too timid to interfere—so when a big bully comes, nobody opposes him. Look what happened when russia invaded Georgia in 2008—the USA “rebooted” diplomatic relations and the EU investigative mission did its best to ignore the russian actions that started the war. In 2014 the same sort of people urged Ukraine to find a peaceful solution with its attacker as well. So no wonder that in 2022, when the West knew that russia would attack soon, all it offered was evacuation of the government and token weapons good only for guerilla warfare. It took long months of losses and suffering for Ukraine to prove to the world that there is no reason to fear russia or to listen to its words. They held puppet referenda to declare Ukrainian territories their own (including those they didn’t control) so they could “protect the integrity of their territory with nuclear weapons”—then Ukraine kicked them out of half of the Kherson region and nothing happened. They always threaten to destroy “control centres” in case something happens, but all they can really do is launch massive attacks to hurt civilians.

Let’s look at some countries to see how they degraded in the last three decades.

First is the USA, of course. It has never fought wars on its own turf after the Civil War, but it sent troops to various conflicts rather regularly. 9/11 was the event that made it start the (seemingly permanent) War on Terror™. That’s why American forces are skilled and well-equipped, but at the same time war looks like an abstract thing to most of the population, so a lot of politicians are eager to support russia and spread its agenda, not for its money but because the current government is opposed to it.

Next comes Germany. Since a strong Germany is met with suspicion that it’ll start a new world war, generations of German politicians served the interests of any country but their own. Later generations served russian interests—just look at Gazprom Schröder and Angela “Germany can’t get rid of its dependency on russian gas” Merkel. With this option no longer acceptable, the current chancellor seems to have switched to serving China instead (see the recent scandal with selling the Hamburg port to them). Equally, the German military is a laughing matter: remember how a good deal of the munitions donated to Ukraine from their reserves turned out to be past their due date or defective? remember how they increased military spending last March to an unprecedented amount and failed to spend that money on anything? remember the recent performance of the Puma IFVs? Sometimes I think the current German stance on giving Ukraine its military technologies is caused not by the notion “we can’t have Nazism so let’s enjoy russia practising it like a good student” but rather by the fear that their technologies may not work at all. I’d also name the other factor responsible for the current situation: not performing a lustration and banning socialism after reuniting with the GDR (Ukraine paid dearly for the same mistake).

And speaking of no lustration or banning of socialism, we have Austria. After 1945 this homeland of putin’s spiritual father pretended to be a victim and stayed “neutral”. While Germany held trials for Nazis, Austria let its own live in peace for the rest of their lives. No wonder that high Austrian officials could be bought by russians (well, I hope they were bought and not simply sharing the same ideology) to the point that Austrian state security worked directly in russian interests. At least they sometimes correct their mistakes.

Another “neutral” state would be Switzerland. On the one hand they try to avoid direct involvement in all conflicts, hence their law forbidding the transfer of their military equipment to fighting parties. On the other hand they are not above profiting from trade with various parties (and thanks to lax export control their military technologies end up in russia). I also joke that they’ll readily recall all their military hardware as soon as the country possessing it gets involved in a conflict. But a decade or two ago their banks lost their reputation as safe havens for various criminals (thanks to pressure from the USA and data leaks), so probably their neutrality will not remain the same for long either.

Continuing the streak of “neutral” countries, we have Israel. Its neutrality is based on the country being the Jewish state, so it would rather not get into conflict with any other country for fear of local Jews being persecuted (of course this does not apply to the nearby countries that already try to destroy Israel). But the current prime minister decided that he’d rather be friends with russia and thus does everything to prevent Ukrainian people, even those of Jewish origin, from taking refuge in the country (despite this being one of its original goals and duties). Considering how long he has been in power it is no surprise. I’ve read that thanks to his actions Israel is losing the support of other countries like the USA. After all, if they don’t respect their own people why should anybody else respect them?

And finally, for something mixed, France. Back in the day Mark Twain wrote a chapter titled French and the Comanches that did not make it into his book A Tramp Abroad. There he (half-jokingly) argues that the French as a nation stand below the Comanches—because the latter have not committed as many atrocities and were never as inventive at them either. The same applies to modern russia for the same reason—you’d not expect an underdeveloped nation to retrofit anti-air missiles for the S-300 system to launch at ground targets (they don’t really care what they hit, be it on enemy or their own territory—and those missiles can’t be precise against ground targets in principle, so it’s pure terrorism and not warfare). I find a lot of similarities between France and russia in terms of their imperial politics (the same attitude to foreign languages, the same dedicated role of the capital compared to all other cities, even the African de facto colonies are the same!). But I’ll leave that rant for another time and just say that maybe because of this class solidarity a lot of French companies feel fine on the russian market and have no intention of leaving it soon. And of course the current president managed to create a new verb—“to macron”, meaning to express deep concern without doing anything substantial (of course various bureaucratic organisations like the United Nations have been practising it for ages but his behaviour was worth commemorating in a word).

War is a terrible thing that I don’t wish on anybody besides those who actively call it onto others. But sometimes it’s the only measure harsh enough to make people (and countries) wake up from delusions and start doing something. And as you can see, some still learn nothing from it.

Indeo 3: cell coding

January 30th, 2023

So we partitioned out the frame and now have to code the cell data. How to pick the best parameters in this case?

The patent suggests calculating vertical and horizontal differences (i.e. differences between top-bottom and left-right neighbours) and selecting one of the modes depending on how large they are. Codebook selection is not covered at all. The reference encoder calculates those differences and uses them both to set the cell coding mode and to select the codebook: if both differences are large use mode 0 (fine-grained coding), if only one is large use mode 3 or 11, otherwise use mode 10. And the ratio of the differences is clipped, multiplied by a magic factor, then by a rate control factor, and used as an index into a special magic table to select the codebook.

Since my goal is to learn something new instead of simply replicating something existing, I took a completely different approach (one that should contain less magic). Mode selection is done by comparing the differences, amended if I decide to use two codebooks. I used the fact that the first eight codebooks mostly have differences of the form kN+1 and the next eight have differences of the form kM. So I simply calculate for each codebook how many delta values are best represented by those formulas and select the best-fitting one. Also I calculate this separately for the even and odd lines (the histograms can be merged later to give overall statistics) so I can select the appropriate codebook or codebook pair for the coding mode. Maybe I’ll have to adjust the scheme for rate control but that will happen later. Side note: Indeo 3 specifies a per-frame set of 16 codebook pairs that all cells should use, plus a global codebook index offset so single-codebook modes may use five additional codebooks; the set seems to be static with a regular structure and I’m not sure the global codebook index offset is ever used.
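The counting idea can be sketched like this (the candidate step values in the example are placeholders, not the actual codebook tables):

```rust
// Count, for each candidate step size, how many observed deltas fit the
// k*N+1 form (plus_one = true, as in the first eight codebooks) or the
// k*M form (plus_one = false, the next eight), and keep the best fit.
pub fn best_step(deltas: &[i32], steps: &[i32], plus_one: bool) -> i32 {
    let mut best = steps[0];
    let mut best_hits = -1;
    for &s in steps {
        let hits = deltas.iter().filter(|&&d| {
            let d = d.abs();
            if plus_one { d % s == 1 % s } else { d % s == 0 }
        }).count() as i32;
        if hits > best_hits {
            best_hits = hits;
            best = s;
        }
    }
    best
}

fn main() {
    println!("{}", best_step(&[3, 6, 9, 12], &[2, 3, 4], false));
}
```

Running it once per parity (even and odd lines separately) and merging the two histograms afterwards gives the per-mode codebook pair selection described above.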

That’s it. The rest should be rather trivial: I’ve written about how to perform motion search before, rate/quality control has never been great in the original codec (maybe I’ll report on how I did it when I get to it), and zero-run compression is nothing special either. There’s not much to write about until I fix some bugs, improve compression, introduce rate control and validate it all against the reference decoder. And that will take a long time…

Indeo 3: splitting the frame

January 28th, 2023

As mentioned in the previous post, Indeo 3 splits the frame into cells using binary trees and codes them using one of several possible modes. In reality it’s more complex: there’s a primary tree that splits the frame into regions and tells how to code them (intra or inter), and those regions themselves can be split using another binary tree to tell which coding method to use (or to skip decoding entirely). See, it had tree coding, prediction units and coding units two decades before H.265! And slices as well: it divides the data into strips 160 pixels wide too.

Splitting the frame optimally is a practically impossible task (because of its combinatorial complexity). In practice it’s much simpler: first split the plane into 160-pixel-wide strips (40-pixel-wide for chroma), then split them along the largest dimension until we get cells of the maximum acceptable size (which seems to be 767 pixels, though the encoder seems to handle up to 2048 pixels in a coded cell). Then it’s up to the secondary cell coding.
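A sketch of that splitter, assuming we simply halve along the larger dimension until the cell fits the limit (the real encoder builds the binary tree as it goes; this only shows the size logic):

```rust
// Recursively cut a strip along its largest dimension until every cell is
// at or under the pixel-count limit, collecting the resulting cell sizes.
pub fn split_cell(w: usize, h: usize, limit: usize, out: &mut Vec<(usize, usize)>) {
    if w * h <= limit {
        out.push((w, h));
    } else if w >= h {
        split_cell(w / 2, h, limit, out);
        split_cell(w - w / 2, h, limit, out);
    } else {
        split_cell(w, h / 2, limit, out);
        split_cell(w, h - h / 2, limit, out);
    }
}

fn main() {
    let mut cells = Vec::new();
    split_cell(160, 120, 767, &mut cells);
    println!("{} cells", cells.len());
}
```

Every split produces two children covering the parent exactly, which maps directly onto the binary-tree representation the bitstream expects.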

From what I could gather from the encoder, it also tries to split secondary cells if they’re above the limit, which is the same value as for the primary cells even though it could be set separately.

Since my goal is to learn something new instead of re-creating something existing, I use a different approach: the initial mode is selected by the relation between horizontal and vertical differences (if both are too high I try to split the cell and try again). Similarly for inter mode: I first check whether the cell can be coded as inter (and whether splitting it will make at least one of the sub-cells code as inter), and if not I resort to intra coding.

There is probably a better way than brute force to find out the optimal splitting but for lack of it a simple heuristic should do.

Cell coding mode and codebook selection is a topic best left for the next time.