Archive for the ‘TrueMotion’ Category

TM1 encoder: probably done

Wednesday, July 19th, 2023

After some trials I decided to release what I've done and probably never return to it again.

Currently my encoder can encode the 15-bit TrueMotion 1 format with different block sizes. It's probably not very adjustable but there's not that much to adjust really. I'll talk about why I gave up on 24-bit mode (again!) below; for the other options here's a condensed version: they do not matter. I've tried encoding files with an alternative delta set and it resulted in significantly worse picture quality (but at least the encoded frames were usually larger as well); as I mentioned in the previous post, only the first codebook makes sense for 15-bit data (the other two codebooks waste space on coding delta value 7, which is not used in 15-bit mode). Inter mode uses a simple skip block since I didn't bother to come up with a proper threshold, but it works well enough anyway. In theory I could calculate gradients to determine what sub-block sizes to use for each frame (as I did in the Indeo 3 encoder) but again, I decided not to bother.
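For illustration, the skip decision I mean can be as trivial as an exact comparison of a block against the previous frame. Here is a minimal sketch of that idea (the exact-match criterion and function shape are my assumptions, not the actual encoder code):

```rust
/// Decide whether a bw x bh block of RGB555 pixels can be coded as skipped,
/// i.e. it is bit-exact to the block at the same position in the previous frame.
/// `stride` is the frame width in pixels.
fn can_skip_block(cur: &[u16], prev: &[u16], x: usize, y: usize,
                  bw: usize, bh: usize, stride: usize) -> bool {
    for row in 0..bh {
        let off = (y + row) * stride + x;
        if cur[off..off + bw] != prev[off..off + bw] {
            return false;
        }
    }
    true
}
```

A fancier encoder would compare against a distortion threshold instead of requiring an exact match, which is exactly the threshold I did not bother to pick.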

Now, here are the reasons why 24-bit mode is much harder. For 15-bit mode you can calculate deltas for each (decorrelated) component independently rather easily—and the coding method allows selecting deltas in a fine-grained way too. In 24-bit mode you have a chroma delta pair that updates the red and blue components and a luma delta pair that updates the red component with one value and the green and blue components together with another value. In theory decorrelating just the green and blue components should help, but there we hit another issue: the set of possible deltas is hardly good enough to represent the different delta values occurring during the prediction stage. Essentially you can't process each component independently; you should rather apply the deltas as 32-bit values to the 32-bit pixel value, then unpack it and check that the individual components are not too far from the desired ones. It is not that hard to implement, but it essentially means writing a second TrueMotion 1 encoder that processes 24-bit data in an entirely different way. Add to that its limited use and the fact that it halves the horizontal resolution—the coduck (that's their very original name for it) always processes blocks of two 32-bit words, but now those hold two 24-bit pixels instead of four 15-bit ones. In any case, even if I see how it should be solved, I'm not going to actually do it.
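If anybody wonders what the "apply packed, then unpack and check" approach would look like, here is a rough sketch (the 0x00RRGGBB layout and the error check are assumptions for illustration, not the actual format definition):

```rust
/// Apply an already-packed 24-bit delta to a packed 0x00RRGGBB pixel with
/// wrapping arithmetic (so carries may propagate between components), then
/// report how far each component ended up from the desired pixel.
fn apply_packed_delta24(pix: u32, delta: u32, target: u32) -> (u32, [u32; 3]) {
    let new_pix = pix.wrapping_add(delta) & 0x00FF_FFFF;
    let mut err = [0u32; 3];
    for (i, &shift) in [16u32, 8, 0].iter().enumerate() {
        let got  = (new_pix >> shift) & 0xFF;
        let want = (target  >> shift) & 0xFF;
        err[i] = got.abs_diff(want);
    }
    (new_pix, err)
}
```

An encoder built this way would then pick the delta pair with the smallest error vector instead of choosing each component's delta independently.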

I need to find myself a better task to undertake.

Restarting the work on TM1 encoder

Saturday, July 15th, 2023

Back in February I wrote about my failed attempt at writing a TrueMotion 1 encoder. And since I was bored and really had nothing better to do, I tried my hoof at it again.

Last time it was 24-bit encoding; this time I tried 15-bit encoding instead and got some results. I guess the moral of the story is that you should not overthink it and should use the simplest approach to coding.

A failed attempt at writing a Duck TrueMotion S encoder

Thursday, February 23rd, 2023

So, my attempt to write a semi-decent TrueMotion 1 encoder has failed (mostly because I’m too annoyed to continue it). Here I’ll describe how the codec works and what I’ve implemented.

In essence, Horizons Technology PVEZ is a rather simple delta-compression-based codec that codes RGB15 (or ARGB20) and RGB24 using prediction from three neighbours and Tunstall codes (I'm aware of only one other codec, CRI P256, that employs them). For convenience there are three fixed sets of deltas and three fixed codebooks as well. Videos from Star Control II: The Ur-Quan Masters (the 3DO version IIRC) used custom codebooks (the data for cutscenes was stored in several files and one of them was the codebook specification), and later TM2X allowed per-frame custom codebooks and deltas, but nobody remembers it. The second revision of the codec (not to be confused with TrueMotion 2 though) introduced inter frames where some of the 2×4 blocks could be marked as skipped.

Initially I had no idea how to do it properly, so I tried brute-forcing it by creating a search tree limited to a maximum of 256 nodes at each level, but as you can expect it took about a minute to encode two frames in VGA resolution. Thus I decided to look closer at the codebook and eventually found out that it's a prefix one (i.e. for each chain of codes its non-empty prefixes are present in the codebook as well), so I can use a greedy approach by simply accumulating codes in a sequence and writing the codebook entry ID when the sequence can't be extended further (or when adding the next code forms a sequence not in the codebook). Which leaves the question of deltas.
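A minimal sketch of that greedy matching, assuming the codebook is given as a map from code sequences to entry indices and that every single code is itself an entry (which the prefix property implies for a complete codebook); the names and data structures here are mine, not the actual encoder's:

```rust
use std::collections::HashMap;

/// Greedily split a stream of codes into codebook entries.
/// Relies on the codebook being prefix-closed: every non-empty prefix of an
/// entry (including single codes) is also an entry, so whatever has been
/// accumulated so far can always be flushed as a valid entry.
fn greedy_tunstall(codes: &[u8], codebook: &HashMap<Vec<u8>, u16>) -> Vec<u16> {
    let mut out = Vec::new();
    let mut seq: Vec<u8> = Vec::new();
    for &code in codes {
        seq.push(code);
        if !codebook.contains_key(&seq) {
            // extending the sequence left the codebook: flush what we had
            seq.pop();
            out.push(codebook[&seq]);
            seq = vec![code];
        }
    }
    if !seq.is_empty() {
        out.push(codebook[&seq]);
    }
    out
}
```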

There are two kinds of deltas there, both occurring in pairs: C-deltas that update the red and blue components (depending on the coding parameters there may be 1–4 deltas per 2×4 block of pixels) and Y-deltas that update all pixel components (and for all pixels as well). The problem here was to keep deltas in order so they produce sane pixel values (i.e. without wraparounds), and that's where I failed. I used the same approach as the decoders and grouped delta pairs into a single value. The problem is that I could not keep the resulting value from overflowing even when I tried all possible deltas and did not check the C-delta result (as Y-deltas are added immediately after it). I also made the mistake of using a pixel value with its components stored separately (the deltas apparently exploit carries and borrows propagating into the higher components). I suppose I'd have better luck if I used a 32-bit pixel value (converting it to bytes for checking the differences and such) and individual deltas, probably with a trellis search over 4–8 deltas to make sure the result does not deviate much from the original pixel values—but I was annoyed enough at this point so I simply gave up. And that is before getting to the stage where I have to figure out how to select the delta set (probably just calculate the deltas for the whole frame and see which set fits best), which codebook to pick, and how to control bitrate (by zeroing small deltas?).
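For what it's worth, the "keep everything packed" idea could look something like this; the plain RGB555 layout of the pixel pair and the sanity check are assumptions for illustration only, not the decoder's exact internal representation:

```rust
/// Apply a grouped delta to a 32-bit word holding two RGB555 pixels using
/// wrapping arithmetic (so carries and borrows propagate between components),
/// then unpack the components and reject the result if any of them drifted
/// more than `max_err` away from the desired values.
fn try_delta_pair(word: u32, delta: u32, desired: [[u8; 3]; 2], max_err: u8) -> Option<u32> {
    let new_word = word.wrapping_add(delta);
    for half in 0..2 {
        let pix = (new_word >> (16 * half)) & 0x7FFF;
        let comps = [
            ((pix >> 10) & 0x1F) as u8, // red
            ((pix >>  5) & 0x1F) as u8, // green
            ( pix        & 0x1F) as u8, // blue
        ];
        for (got, want) in comps.iter().zip(desired[half].iter()) {
            if got.abs_diff(*want) > max_err {
                return None; // this delta wrapped around or drifted too far
            }
        }
    }
    Some(new_word)
}
```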

Oh well, I'm sure I'll find something else to work on.

P.S. I've also tried to look at the reference encoder, but CODUCK.DLL turned out to be not merely a horrible pun but also obfuscated (you were supposed to pay for the encoder and use serial numbers and such, after all) 16-bit code that made the Ghidra decompiler commit suicide, so I gave up on it as well.

P.P.S. I might return to it one day but it seems unlikely as this codec is not that interesting or useful for me.

Visiting multimedia grave

Monday, October 31st, 2022

When people ask why I call the search division of Alphabet Inc Baidu, I answer that I do it partly out of spite (to muddle their search index) and mostly because they remind me of a Chinese totalitarian company. And the recent news only reaffirms such views.

As you should remember, Baidu is famous for its graveyard of killed projects—it even has a separate alley for messenger apps. And it looks like it is preparing a plot under a concrete duck for burying some multimedia formats (which makes it interesting to me).

The history of multimedia formats at Baidu essentially started with the purchase of On2 and the release of VP8 in the WebMKV format. Then VP8 was mostly buried once VP9 was created (some of it remains hidden inside the WebP format), and VP9's turn is near since VP10 is here to succeed it (under the name of AV1).

In recent news, though, it turns out that Chrome is deprecating its support for JPEG XL, a format developed mostly at Baidu and the only one properly standardised. But as we all know, Chrome currently controls the Web, and removing support for the format means it will remain obscure. Kinda like in the Soviet joke where a foreign tourist asks in a shop why there's no caviar and is told that there's no demand for it—and indeed, as he observed for a whole day, nobody asked for it (in case it's not obvious: people in the USSR didn't ask for caviar at the shops because they knew it would not be sold there; see also Baidu Stadia).

And as if that was not enough, people spotted that WebP2 has changed its status to experimental, meaning that it won't be supported either.

So we have VP9 buried in favour of AV1, JPEG XL being buried in favour of AV1F, WebP2 being buried in favour of AV1F (which is AV1 still frames in MP4), and the original WebP likely to follow suit. Now consider that AV1 is recommended to be distributed inside MP4 instead of WebMKV and you'll fear for the future of that container as well.

I guess all that is left for them to do is to adopt Baidu Lyra as a non-experimental codec in order to purge Vorbis and Opus (which were not created by them), and then bury it in favour of AV1-based audio compression. That would make a nice collective grave of formats killed by Baidu to make space for AV1.

So, do you know when AV2 should arrive?

vp6enc: slightly faster encoding mode

Saturday, March 12th, 2022

As I mentioned before, I wanted to try applying the macroblock selection approach from the VP7 encoder in the VP6 encoder. Well, it was easy to implement: instead of preparing all macroblock modes and then checking which one is the best, it now tries the macroblock modes one by one (starting with inter mode) and stops when the result is good enough. In this mode encoding seems to go a couple of percent faster and the resulting size at the same quantiser can differ somewhat in both directions. You can try it yourself by using the fast encoder option.
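In other words, mode selection now terminates early; here is a sketch of the idea with made-up mode names and a made-up cost function (not the actual nihav-duck code):

```rust
#[derive(Clone, Copy)]
enum MBMode { Inter, InterFourMV, Intra }

/// Pick a macroblock mode with early termination: modes are tried in the
/// order that usually wins (inter first) and the search stops as soon as the
/// cost drops below a "good enough" threshold, instead of always scoring
/// every mode and taking the minimum.
fn select_mb_mode(cost: impl Fn(MBMode) -> u32, good_enough: u32) -> MBMode {
    let candidates = [MBMode::Inter, MBMode::InterFourMV, MBMode::Intra];
    let mut best = candidates[0];
    let mut best_cost = u32::MAX;
    for &mode in candidates.iter() {
        let c = cost(mode);
        if c < best_cost {
            best_cost = c;
            best = mode;
        }
        if best_cost <= good_enough {
            break; // fast path: skip the remaining modes
        }
    }
    best
}
```

This also explains why the resulting size can drift in either direction: a mode that merely clears the threshold may be slightly better or worse than the true minimum.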

The encoder is still a failure though.

VP7 encoder release

Thursday, March 3rd, 2022

While a certain country cosplays the Third Reich and conducts talvisota simultaneously—and tries to bomb my home city to rubble—I still need some distraction…

(image borrowed from vp7.de)

Since I'm rather bored with the VP7 encoder, I've decided to release it and move on to something else. It should work about the same as the VP6 encoder (i.e. poorly, and nobody should care about it), but if you want to know what knobs you can turn, just invoke nihav-encoder --query-encoder-options vp7 (I guess the only useful options are the ones setting bitrate/quantiser and keyframe interval).

Have fun!

Update from March 4: encoding with low quantisers should now work as well.

VP7 encoder: various bits

Sunday, February 27th, 2022

As the world tries to avert attention from an insane dictator re-enacting 1939 (it gets funnier since I observe it from Germany), I should also do something to take my mind off constant worrying about my parents and other relatives in one of the Ukrainian cities under attack. Hence this significantly less unpleasant thing.

Now my encoder is conceptually done; all that is left to do is to fix a leftover bug or two, improve a thing or two, clean the code up and integrate it nicely with the rest of the nihav-duck crate by splitting off the parts it shares with the VP6 encoder. Meanwhile I can talk about some things implemented since the last time and what wasn't.

Basic VP7 encoder: cutting corners

Thursday, February 17th, 2022

I've more or less completed a basic failure of a VP7 encoder. Now it can encode inter frames using various motion compensation block sizes (and the resulting file can be decoded too!). There's still a lot of work to be done (rate control, MB features and multiple-frame analysis) but there are some things that I can talk about as well.

As I wrote in the previous post, there are too many coding parameters to try, so if you want a reasonable encoding done in reasonable time you need to cut corners (or "employ heuristics" if you want to sound more scientific) in various ways. So here I want to present what has been done in my encoder to make it run fast.

VP7 encoding: general principles

Sunday, January 30th, 2022

It is not that hard to write a simple encoder (as I'm going to demonstrate); the problem is making it good (and that's where I'll fail). In the meantime I'm going to explain what I'm doing and how/why it should be done.

Starting work on VP7 encoder

Wednesday, January 26th, 2022

As I said in the previous post, currently I don't have any features or decoders to add to NihAV (because Paul has not finished his work on the Bink2 decoder yet) besides some encoders that nobody will use.

Thus I decided to work on something more advanced than VP6, something that would allow me to play with more advanced features (like EPZS motion estimation, per-macroblock quantiser selection and such). For that I needed to pick a codec, probably one based on H.264, and there was not that much to pick from:

  • ITU H.264—first and foremost, I don't have a properly working decoder for it (yet?); second, the format is too complex, so just thinking about writing all those SPSes, PPSes and various lists discourages me from even considering writing an encoder for it;
  • RealVideo 3 or 4—tempting, but that means I'd also need to write a RealMedia muxer, and the format lacks dquant (in theory it's supported, in practice it never happened). Maybe one day I'll make my own NihAV-Really? encoder for RV3+Cooker, but not today;
  • Sorenson SVQ3—same problems essentially;
  • VP8—Mike has done it over a decade ago;
  • VX—this is a custom game codec which is too simplified (even the quantiser is rather implicit).

That leaves VP7. The rough roadmap is the following:

  1. make intra-only encoder that encodes picture somehow;
  2. improve it to select the best whole-macroblock prediction mode (see the sketch after this list);
  3. add 4×4 prediction mode and make it select the best mode;
  4. add inter-frame support along with motion compensation;
  5. add EPZS-based motion estimation;
  6. introduce rough motion search for group of frames to determine good golden frame candidate and the macroblocks that should be coded with higher quality;
  7. actually code those macroblocks with higher quality using MB features;
  8. use trellis-based quantiser search for improved coding of frames;
  9. speed it up by using various heuristics instead of brute force search for coding parameters.
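For step 2, a first approximation can be as crude as reconstructing every whole-macroblock prediction and keeping the one with the lowest distortion; here is a sketch of that idea with a plain SAD cost (the function names and metric are mine and not tied to the actual VP7 mode set):

```rust
/// Sum of absolute differences between a 16x16 source macroblock and a
/// candidate prediction, both given as luma planes with their own strides.
fn sad16x16(src: &[u8], sstride: usize, pred: &[u8], pstride: usize) -> u32 {
    let mut sum = 0u32;
    for y in 0..16 {
        let s = &src[y * sstride..y * sstride + 16];
        let p = &pred[y * pstride..y * pstride + 16];
        sum += s.iter().zip(p.iter())
                .map(|(&a, &b)| u32::from(a.abs_diff(b)))
                .sum::<u32>();
    }
    sum
}

/// Pick the prediction mode with the lowest SAD; `preds` pairs each mode
/// index with its reconstructed 16x16 prediction block (stride 16).
fn pick_best_mode(src: &[u8], sstride: usize, preds: &[(usize, [u8; 256])]) -> usize {
    preds.iter()
         .min_by_key(|(_, pred)| sad16x16(src, sstride, pred, 16))
         .map(|(mode, _)| *mode)
         .expect("at least one prediction mode")
}
```

Later steps would replace the raw distortion with a rate-distortion cost, but as a starting point this is enough to get a decodable bitstream.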

This should take some time…