Archive for the ‘TrueMotion’ Category

A failed attempt at writing a Duck TrueMotion S encoder

Thursday, February 23rd, 2023

So, my attempt to write a semi-decent TrueMotion 1 encoder has failed (mostly because I’m too annoyed to continue it). Here I’ll describe how the codec works and what I’ve implemented.

In essence, Horizons Technology PVEZ is a rather simple delta-compression-based codec that codes RGB15 (or ARGB20) and RGB24 using prediction from three neighbours and Tunstall codes (I’m aware of only one other codec, CRI P256, that employs them). For convenience there are three possible fixed sets of deltas and three fixed codebooks as well. Videos from (the 3DO version, IIRC) Star Control II: The Ur-Quan Masters used custom codebooks (the data for cutscenes was stored in several files, one of them being the codebook specification), and later TM2X allowed per-frame custom codebooks and deltas, but nobody remembers it. The second revision of the codec (not to be confused with TrueMotion 2) introduced inter frames where some of the 2×4 blocks could be marked as skipped.

Initially I had no idea how to do it properly, so I tried brute-forcing it with a search tree limited to a maximum of 256 nodes at each level, but as you can expect it took about a minute to encode two frames at VGA resolution. Thus I decided to look closer at the codebook and eventually found out that it’s a prefix one (i.e. every non-empty prefix of a codebook entry is present in the codebook as well), so I can use a greedy approach: simply accumulate codes in a sequence and write the codebook entry ID when the sequence can’t be extended further (i.e. when adding the next code would form a sequence not in the codebook). Which leaves the question of deltas.

There are two kinds of deltas there, both occurring in pairs: C-deltas that update the red and blue components (depending on the coding parameters there may be 1-4 of them per 2×4 block of pixels) and Y-deltas that update all components of all pixels. The problem was to keep the deltas in order so they produce sane pixel values (i.e. without wraparounds), and that’s where I failed. I used the same approach as the decoders and grouped each delta pair into a single value. The problem is that I could not keep the resulting value from overflowing even when I tried all possible deltas, and I did not check the C-delta result (as Y-deltas are added immediately after it). I also made the mistake of storing the pixel components separately (the deltas apparently exploit carries and borrows propagating between components). I suppose I’d have better luck with a 32-bit pixel value (converting it to bytes for checking the differences and such) and with individual deltas, probably using a trellis search over 4-8 deltas to make sure the result does not deviate much from the original pixel values; but I was annoyed enough at this point, so I simply gave up. And that was before getting to the stage where I’d have to figure out how to select the delta value set (probably by calculating the deltas for the whole frame and seeing which set fits best), which codebook to pick, and how to control bitrate (by zeroing small deltas?).

Oh well, I’m sure I’ll find something else to work on.

P.S. I’ve also tried to look at the reference encoder, but CODUCK.DLL turned out to be not merely a horrible pun but also obfuscated (you were supposed to pay for the encoder and enter serial numbers and such, after all) 16-bit code that made the Ghidra decompiler commit suicide, so I gave up on it as well.

P.P.S. I might return to it one day but it seems unlikely as this codec is not that interesting or useful for me.

Visiting multimedia grave

Monday, October 31st, 2022

When people ask why I call the search division of Alphabet Inc Baidu, I answer that I do it partly out of spite, to muddle their search index, and mostly because they remind me of a Chinese totalitarian company. And recent news only reaffirms such views.

As you should remember, Baidu is famous for its graveyard of killed projects: it even has a separate alley for the messenger apps. And it looks like it is preparing a plot under a concrete duck for burying some multimedia formats (which makes it interesting to me).

The history of multimedia formats at Baidu essentially started with the purchase of On2 and the release of VP8 in the WebMKV format. Then VP8 was mostly buried once VP9 was created (a bit of it remains hidden inside the WebP format), and VP9’s turn is near since VP10 is here to succeed it (under the name of AV1).

In recent news, though, it turns out that Chrome is deprecating its support for JPEG XL, a format developed mostly at Baidu and the only one of the lot that is properly standardised. But as we all know, Chrome currently controls the Web, and removing support for a format means it will remain obscure. Kinda like the Soviet joke where a foreign tourist asks in a shop why there’s no caviar and hears that there’s no demand for it; and indeed, as he observed for a whole day, nobody asked for it (in case it’s not obvious, people in the USSR didn’t ask for caviar at the shops because they knew it would not be sold there; see also Baidu Stadia).

And as if that was not enough, people spotted that WebP2 has changed its status to experimental, meaning that it won’t be supported either.

So we have VP9 buried in favour of AV1, JPEG XL being buried in favour of AV1F, WebP2 being buried in favour of AV1F (which is AV1 still frames in MP4), and the original WebP likely to follow suit. Now consider that AV1 is recommended to be distributed inside MP4 instead of WebMKV and you’ll fear for the future of that container as well.

I guess all that is left for them to do is to adopt Baidu Lyra as a non-experimental codec in order to purge Vorbis and Opus (which were not created by them), and then bury it in favour of AV1-based audio compression. That would make a nice collective grave of formats killed by Baidu to make space for AV1.

So, do you know when AV2 should arrive?

vp6enc: slightly faster encoding mode

Saturday, March 12th, 2022

As I mentioned before, I wanted to try applying the macroblock selection approach from my VP7 encoder in the VP6 encoder. Well, it was easy to implement: instead of preparing all macroblock modes and then checking which one is the best, it now tries macroblock modes one by one (starting with inter mode) and stops when the result is good enough. In this mode encoding goes a couple of percent faster, and the resulting size at the same quantiser can differ somewhat in both directions. You can try it yourself by using the fast encoder option.
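
The early-exit idea can be sketched as follows (the mode names, the trial order and the cost numbers are invented placeholders, not the encoder’s real values):

```rust
// Early-termination macroblock mode selection: try modes in a fixed order
// (inter first) and stop as soon as the cost is below a threshold, instead
// of evaluating every mode and taking the minimum.
#[derive(Debug, PartialEq, Clone, Copy)]
enum MBMode { Inter, Intra16, Intra4 }

// Stand-in for a real rate/distortion estimate of coding a macroblock.
fn cost(mode: MBMode) -> u32 {
    match mode {
        MBMode::Inter   => 120,
        MBMode::Intra16 => 90,
        MBMode::Intra4  => 80,
    }
}

fn select_mode(threshold: u32) -> MBMode {
    let order = [MBMode::Inter, MBMode::Intra16, MBMode::Intra4];
    let mut best = order[0];
    let mut best_cost = u32::MAX;
    for &mode in &order {
        let c = cost(mode);
        if c < best_cost {
            best_cost = c;
            best = mode;
        }
        if best_cost <= threshold {
            break; // good enough, skip the remaining modes
        }
    }
    best
}

fn main() {
    // A lax threshold accepts the first (inter) mode immediately; a strict
    // one falls back to evaluating everything, so intra 4x4 wins.
    assert_eq!(select_mode(150), MBMode::Inter);
    assert_eq!(select_mode(50), MBMode::Intra4);
}
```

This is also why the output size can go either way: stopping early may pick a mode that a full search would have rejected.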

The encoder is still a failure though.

VP7 encoder release

Thursday, March 3rd, 2022

While a certain country cosplays the Third Reich and conducts talvisota simultaneously, and tries to bomb my home city to rubble, I still need some distraction…


Since I’m rather bored with the VP7 encoder, I’ve decided to release it and move on to something else. It should work about the same as the VP6 encoder (i.e. poorly, and nobody should care about it), but if you want to know what knobs you can turn, just invoke nihav-encoder --query-encoder-options vp7 (though I guess the only useful options are those setting bitrate/quantiser and keyframe interval).

Have fun!

Update from March 4: encoding with low quantisers should now work as well.

VP7 encoder: various bits

Sunday, February 27th, 2022

As the world tries to avert its eyes from an insane dictator re-enacting 1939 (it gets funnier since I observe it from Germany), I should also do something to take my mind off constantly worrying about my parents and other relatives in one of the Ukrainian cities under attack. Hence this significantly less unpleasant thing.

Now my encoder is conceptually done; all that is left to do is to fix a leftover bug or two, improve a thing or two, clean the code up, and integrate it nicely with the rest of the nihav-duck crate by splitting off the parts shared with the VP6 encoder. Meanwhile I can talk about some of the things implemented since the last time, and about what wasn’t.

Basic VP7 encoder: cutting corners

Thursday, February 17th, 2022

I’ve more or less completed a basic failure of a VP7 encoder. Now it can encode inter-frames using various sizes of motion compensation (and the resulting file can be decoded too!). There’s still a lot of work to be done (rate control, MB features and multiple-frame analysis), but there are some things I can talk about already.

As I wrote in the previous post, there are too many coding parameters to try, so if you want a reasonable encoding done in reasonable time you need to cut corners (or “employ heuristics”, if you want to sound more scientific) in various ways. So here I want to present what has been done in my encoder to make it run fast.

VP7 encoding: general principles

Sunday, January 30th, 2022

It is not that hard to write a simple encoder (as I’m going to demonstrate); the problem is to make it good (and that’s where I’ll fail). Until then I’m going to explain what I’m doing and how/why it should be done.

Starting work on VP7 encoder

Wednesday, January 26th, 2022

As I said in the previous post, currently I don’t have any features or decoders to add to NihAV (because Paul has not finished his work on the Bink2 decoder yet) besides some encoders that nobody will use.

Thus I decided to work on something more advanced than VP6, something that allows me to play with more advanced features (like EPZS motion estimation, per-macroblock quantiser selection and such). For that I needed to pick a codec, probably one based on H.264, and there was not that much to pick from:

  • ITU H.264—first and foremost, I don’t have a properly working decoder for it (yet?); second, the format is too complex, so just thinking about writing all those SPSes, PPSes and various lists discourages me from even considering writing an encoder for it;
  • RealVideo 3 or 4—tempting, but that means I’d also need to write a RealMedia muxer, and the format lacks dquant (in theory it’s supported, in practice it never happens). Maybe one day I’ll make my own NihAV-Really? encoder for RV3+Cooker but not today;
  • Sorenson SVQ3—same problems essentially;
  • VP8—Mike has done it over a decade ago;
  • VX—this is a custom game codec which is simplified (even quantiser is rather implicit).

The rough roadmap is the following:

  1. make intra-only encoder that encodes picture somehow;
  2. improve it to select the best whole macroblock prediction mode;
  3. add 4×4 prediction mode and make it select the best mode;
  4. add inter-frame support along with motion compensation;
  5. add EPZS-based motion estimation;
  6. introduce a rough motion search over a group of frames to determine a good golden frame candidate and the macroblocks that should be coded with higher quality;
  7. actually code those macroblocks with higher quality using MB features;
  8. use trellis-based quantiser search for improved coding of frames;
  9. speed it up by using various heuristics instead of brute force search for coding parameters.
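
The EPZS part of step 5 deserves a sketch (heavily simplified, with invented names and a toy cost function standing in for block SAD; real EPZS also uses adaptive early-exit thresholds): start from a small set of predicted vectors and refine only around the best one instead of searching the whole window.

```rust
// Simplified EPZS-style motion search: evaluate a few candidate predictors
// (neighbour/co-located/zero vectors), then refine around the winner with a
// small plus-shaped pattern until no step improves the cost.
type MV = (i32, i32);

// Toy cost: distance to an assumed "true" motion of (3, -2). A real encoder
// would compute SAD between the current block and the shifted reference.
fn cost(mv: MV) -> u32 {
    (mv.0 - 3).unsigned_abs() + (mv.1 + 2).unsigned_abs()
}

fn epzs_search(predictors: &[MV]) -> MV {
    // Stage 1: pick the best predictor as the starting point.
    let mut best = *predictors.iter().min_by_key(|&&mv| cost(mv)).unwrap();
    // Stage 2: greedy refinement around it.
    loop {
        let mut improved = false;
        for &(dx, dy) in &[(1, 0), (-1, 0), (0, 1), (0, -1)] {
            let cand = (best.0 + dx, best.1 + dy);
            if cost(cand) < cost(best) {
                best = cand;
                improved = true;
            }
        }
        if !improved {
            return best;
        }
    }
}

fn main() {
    // Zero vector plus pretend "neighbour" and "co-located" predictors.
    let preds = [(0, 0), (2, -1), (8, 8)];
    assert_eq!(epzs_search(&preds), (3, -2));
}
```

The point of the predictor stage is that motion fields are smooth, so one of the candidates usually lands close enough that only a few refinement steps are needed.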

This should take some time…

VP8: dubious decisions, deficiencies and outright idiocy

Friday, October 15th, 2021

I’ve finally finished the VP8 decoder for NihAV (done mainly by hacking the already existing VP7 decoder) and I have some unpleasant words to say about VP8. If you want to read praise for the first modern open-source patent-free video codec (and essentially the second one, after VP3/Theora), then go and read any piece of news from 2011. Here I present my experience implementing the format and what I found not so good, or outright bad, about the “specification” and the format itself.


VP8: specification analysis

Friday, October 8th, 2021

In a recent post titled Is VP8 a Duck codec? the majority (both commenters) decided it’s a Duck codec after all, so I’ll have to implement a decoder for it in NihAV. Back in the day Jason from x264 looked at it from his perspective and found it inferior in most parts to H.264 (and rightfully so). That post remained the most popular one on his blog ever since Steve Jobs once replied with a link to it. But since those days too many things have changed: there’s no Jobs, there’s no Jason, his blog is deleted, and all you can find is an archived copy. And now it’s my turn to look at VP8 and see how it fares against the other codecs I know.

And of course I start with its specification.