Looking at RoQ

October 27th, 2021

Recently The (Multimedia) Mike contacted me and asked if I could take a look at what’s wrong with Clandestiny videos. I did.

First of all, it’s worth mentioning that the format obviously originates from Trilobyte: it had been used in its games years before Quake III was released, it has more features there, and the decoder in the open-sourced Q3 engine still calls it trFMV.

Then I should mention that RoQ support in Libav is not great (and in FFmpeg it’s exactly the same): the demuxer lacks support for some packet types (like 0x1030, used to signal that it’s a good time to prefetch data), there’s no support for JPEG frames, and because of all that it goes crazy on files extracted from the Clandestiny demo.

Thus I decided to quickly hack up my own decoder for it, based on the original description by Dr. Tim Ferguson (yet another forgotten researcher who REd several VQ-based video formats), and played with it to see what was going wrong.

And it seems the problem was mostly in motion compensation. Under some conditions you should double the motion vectors (I think it’s when the first chunk size is zero instead of minus one); also some files have alpha information in the codebook (detected by the video properties chunk argument being set to one), as is apparent from the ScummVM code. In either case it’s just minor details that make things complicated (and I was lucky not to encounter files in interlaced mode).
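Since quirks like this are easy to get wrong, here is a minimal sketch of the doubling logic described above. The function name and the exact trigger condition are my assumptions for illustration, not code from any actual decoder:

```python
# Hypothetical helper illustrating the MV quirk: when the first chunk
# size field is 0 (instead of the usual -1), motion vectors get doubled.
def scale_motion_vector(mx, my, first_chunk_size):
    scale = 2 if first_chunk_size == 0 else 1
    return mx * scale, my * scale
```

The point is only that the scaling happens once, at vector decoding time, so the rest of the motion compensation code stays unchanged.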

It was a nice distraction but I guess it’s time to do something else.

NihAV: now with lossless audio encoder

October 26th, 2021

Since I wanted to try my hoof at various encoding concepts, it’s no wonder that after a lossy audio encoder (IMA ADPCM with trellis encoding), a lossless video encoder (ZMBV, using my own deflate implementation for compression), and lossy video encoders (Cinepak, for playing with vector quantisation, and VP6, for playing with many other concepts) it was time for a lossless audio encoder.

To remind you, there are essentially two types of lossless audio compressors: fast asymmetric ones (based on LPC filters) and slow symmetric ones (based on adaptive filters, usually long LMS ones). The theory behind them is rather simple and described below.
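To give a flavour of the symmetric family: below is a toy integer sign-sign LMS filter, with the filter order and step size picked arbitrarily (real codecs use much longer filters, often cascaded). Because the decoder runs exactly the same state updates as the encoder, the residuals map back to the input losslessly:

```python
def lms_filter(data, decode=False, order=4, mu=1):
    """Toy sign-sign LMS: with decode=False maps samples to residuals,
    with decode=True maps residuals back to the original samples."""
    weights = [0] * order
    history = [0] * order
    out = []
    for x in data:
        # integer prediction from the last `order` reconstructed samples
        pred = sum(w * h for w, h in zip(weights, history)) >> 8
        if decode:
            sample, err = pred + x, x
        else:
            sample, err = x, x - pred
        out.append(sample if decode else err)
        # sign-sign update: nudge each weight by +-mu towards a better fit;
        # both sides perform the identical update, hence "symmetric"
        for i in range(order):
            if err and history[i]:
                weights[i] += mu if (err > 0) == (history[i] > 0) else -mu
        history = [sample] + history[:-1]
    return out
```

The residuals are smaller in magnitude than the input for correlated signals, which is what makes the subsequent entropy coding effective.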
Read the rest of this entry »

VP8: dubious decisions, deficiencies and outright idiocy

October 15th, 2021

I’ve finally finished the VP8 decoder for NihAV (done mainly by hacking the already existing VP7 decoder) and I have some unpleasant words to say about VP8. If you want to read praises to the first modern open-source patent-free video codec (and essentially the second such codec after VP3/Theora) then go and read any piece of news from 2011. Here I present my experience implementing the format and what I found not so good or outright bad about the “specification” and the format itself.

Read the rest of this entry »

VP8: specification analysis

October 8th, 2021

In a recent post titled Is VP8 a Duck codec? the majority (both commenters) decided it’s a Duck codec after all, so I’ll have to implement a decoder for it in NihAV. Back in the day Jason from x264 looked at it from his perspective and found it inferior in most parts to H.264 (and rightfully so). That post has been the most popular on multimedia.cx ever since Steve Jobs once replied with a link to it. But since those days too many things have changed: there’s no Jobs, there’s no Jason, his blog is deleted and all you can find is an archived copy. And now it’s my turn to look at VP8 and see how it fares against other codecs I know.

And of course I start with its specification.
Read the rest of this entry »

VP6 encoding guide

October 6th, 2021

As I wanted to do before, I’ve written a short guide on how to encode VP6 to FLV. You can find it here, at the NihAV site.

You should be able to encode raw video into VP6 in AVI, or (with a slightly customised build) into VP6 in EA format (if you want to test whether the encoder is good enough for modding purposes; but I guess even Peter Ross won’t care about that). As usual, it’s not guaranteed to work, but it seems to work for me.

And that should be it. I might do a VP7 encoder later (much later!) just for lulz, but so far I can see way more interesting things to do (more formats to decode, a lossless audio encoder and such).

A look at another game codec

October 4th, 2021

This morning The Multimedia Mike told me he’d found yet another undiscovered game video codec, used in some games by Imagination Pilots (probably in no way related to the fact that the codec has FOURCCs IPMA and IP20). Surprisingly, there’s a fandom that has REd most of the game formats and made a reasonable assumption that the codec would use LZW the same way the image resources do.

And it turns out they were right. Frames in both versions of the codec are raw image buffers compressed with LZW. Pixel value 0 is used for transparency there, and considering that those animations may use a transparent background as well, there’s no difference between frame types at all.

So the people who REd the rest of the formats just missed the final step: if they had a frame extractor they could simply dump the frame data, try to decompress it and see the result. You don’t need to look inside the binary specification for that (I did, and it was not that useful even if I could recognise the LZW decompression functions there).
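For the curious, the step they missed amounts to textbook LZW. The sketch below is illustrative only: it passes codes around as plain integers, while the real IPMA/IP20 streams of course pack them into bits and have their own dictionary details.

```python
# Toy LZW pair demonstrating the scheme; not the actual game bitstream.
def lzw_compress(data):
    dictionary = {bytes([i]): i for i in range(256)}
    w, out = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
        else:
            out.append(dictionary[w])
            dictionary[wc] = len(dictionary)
            w = bytes([byte])
    if w:
        out.append(dictionary[w])
    return out

def lzw_decompress(codes):
    dictionary = {i: bytes([i]) for i in range(256)}
    w = dictionary[codes[0]]
    out = bytearray(w)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                       # the classic KwKwK corner case
            entry = w + w[:1]
        out.extend(entry)
        dictionary[len(dictionary)] = w + entry[:1]
        w = entry
    return bytes(out)
```

Run the decompressor on a raw frame buffer, treat byte 0 as transparent, and you have the whole codec.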

I want to dedicate the rest of the post to ranting about Ghidra failing to decompile it properly.
Read the rest of this entry »

VP6 encoder design

October 2nd, 2021

This is the penultimate post in the series (there shall be another post on how to use the encoder, but if there’s no interest I can simply skip it, making this the last post in the series). As promised before, here I’ll present the layout and the details of my encoder.
Read the rest of this entry »

Is VP8 a Duck codec?

October 1st, 2021

There’s a blog out there with posts dedicated to the history of On2 (née Duck). And one particular post (archived version) brought up an unsettling thought that refuses to leave me: does VP8 belong to the Duck codecs or to the Baidu (yes, I’ll keep calling this company by value) ones?

Arguments for Duck theory:

  1. it was released in 2008, before acquisition (which happened in 2010);
  2. it can be seen as an improvement of VP7, which is definitely a Duck codec;
  3. its documentation is as lacking as for the previous codecs.

Arguments for Baidu theory:

  1. it became famous after the company was bought and the codec was open-sourced;
  2. as a follow-up from the previous item, there is an open-source library for decoding and encoding it (I think the previous source dump had an encoder just for TMRT and maybe it was an oversight);
  3. it has its own ecosystem (all previous codecs were stored in AVI, this one uses WebMKV);
  4. I don’t have to implement it in NihAV (because I wanted nihav_duck crate to contain decoders for all Duck formats and if VP8 is not really a Duck codec I don’t have to do anything).

So, what do you think?

VP6 — rate control and rate-distortion optimisation

September 30th, 2021

First of all, I want to warn you that the “optimisation” part of RDO comes from mathematics, where it means selecting the element that best satisfies certain criteria. Normally we talk about optimisation as a way to make code run faster, but the term has a more general meaning, and this is one of those cases.

Anyway, while there is a lot of theory behind it, the concepts are quite simple (see this description from a RAD guy for a short, concise explanation). To put it in an oversimplified way: rate control is the part of an encoder that makes it output a stream with certain parameters (i.e. a certain average bitrate, a limited maximum frame size and such), and RDO is a way to shape the encoded stream by deciding how much you want to trade bits for quality in each particular case.

For example, if you want to decide which kind of macroblock to encode (intra or one of several kinds of inter), you calculate how much the coded blocks differ from the original one (that’s distortion) and add the cost of coding those blocks (aka rate) multiplied by lambda (a weight parameter that tells how much to prefer rate over distortion or vice versa). So you want to increase bitrate? Decrease lambda so fidelity matters more. You want to decrease frame size? Increase lambda so bits are more important. From a mathematical point of view the problem is solved; from an implementation point of view, that’s where the actual problems start.
Read the rest of this entry »

VP6 encoder: done!

September 29th, 2021

Today I’ve finished work on my VP6 encoder for NihAV and it seems to work as expected (which means poorly, but what else to expect from a failure). Unfortunately, even if the encoder is complete from my point of view, there are still some things to do: write a couple of posts on rate control/RDO and the overall design of my encoder, and make it more useful for people brave enough to use it for e.g. Red Alert game series modding. That means adding some input format support useful for the encoder (I’ve hacked in Y4M input support, but if there’s a request for a lossless codec in AVI, I can implement that too) and writing a page describing how to use nihav-encoder to encode content in VP6 format (AVI only; maybe I’ll add FLV later as a joke, but FLV decoding support should come first).

And now I’d like to talk about what features my encoder has and why it lacks in some areas.

First, what it has:

  • all macroblock types are supported (including 4MV and those referencing golden frame);
  • custom models updated per frame;
  • Huffman encoding mode;
  • proper quarterpel motion estimation;
  • extremely simple golden frame selection;
  • sophisticated macroblock type selection process;
  • rudimentary rate control.

In other words, it can encode a stream using all but a couple of the format’s features, and with varying quality as well.

And what it doesn’t have:

  • interlacing! It should not be that hard to add but my principles say no to supporting it at all (except in some decoders where it can’t be avoided);
  • alpha support—it’s rather easy to add but there’s little use for it;
  • custom scan order—it’s not likely to give a significant gain while it’s quite hairy to implement properly (it’s not that complex per se but it’ll need a lot of debugging to get it right because of its internal representation);
  • advanced profile features like bicubic interpolation filters and selecting parameters for them (again, too much work, too little fun);
  • context-dependent macroblock size approximations (i.e. calculate expected size using the information about already selected preceding macroblocks instead of fixed guesstimates);
  • better macroblock and frame size approximations in general (more about it in the upcoming post);
  • better golden frame selection (I don’t even know what would be a good condition for that);
  • dynamic intra frame selection (i.e. code a frame as an I-frame where it’s appropriate instead of every N-th frame);
  • proper rate control (this should be discussed in the upcoming post).

This is an example of a progressive approach to development (in the same sense as progressive JPEG coding): first you implement a rough approximation of what you want to have and then keep expanding and improving various features until some arbitrary limit is reached. A lot of the features I’ve not implemented properly would take a lot of time (and sometimes significant domain-specific knowledge) to implement well, so I simply stopped where things either worked well enough or it would not be fun to continue.


So, with the next couple of posts on the still uncovered details (RDO+rate control and overall design) the journey should be complete. Remember, it’s the best open-source VP6 encoder (for lack of competition), and since I’ve managed to make something resembling an encoder, maybe you can write something even better?