VP6 encoder design

October 2nd, 2021

This is the penultimate post in the series (there shall be another post on how to use the encoder, but if there’s no interest I can simply skip it, making this the last post in the series). As promised before, here I’ll present the layout and the details of my encoder.

Is VP8 a Duck codec?

October 1st, 2021

There’s a blog out there with posts dedicated to the history of On2 (née Duck). And one particular post (archived version) brought an unsettling thought that refuses to leave me. Does VP8 belong to Duck or Baidu (yes, I’ll keep calling this company by value) codecs?

Arguments for Duck theory:

  1. it was released in 2008, before the acquisition (which happened in 2010);
  2. it can be seen as an improvement of VP7, which is definitely a Duck codec;
  3. its documentation is as lacking as for the previous codecs.

Arguments for Baidu theory:

  1. it became famous after the company was bought and the codec was open-sourced;
  2. as a follow-up from the previous item, there is an open-source library for decoding and encoding it (I think the previous source dump had an encoder just for TMRT and maybe it was an oversight);
  3. it has its own ecosystem (all previous codecs were stored in AVI, this one uses WebMKV);
  4. I don’t have to implement it in NihAV (because I wanted the nihav_duck crate to contain decoders for all Duck formats and if VP8 is not really a Duck codec I don’t have to do anything).

So, what do you think?

VP6 — rate control and rate-distortion optimisation

September 30th, 2021

First of all, I want to warn you that the “optimisation” part of RDO comes from mathematics, where it means selecting the element that best satisfies certain criteria. Normally we talk about optimisation as a way to make code run faster, but the term has a more general meaning and this is one such case.

Anyway, while there is a lot of theory behind it, the concepts are quite simple (see this description from a RAD guy for a short, concise explanation). To oversimplify, rate control is the part of an encoder that makes it output a stream with certain parameters (i.e. a certain average bitrate, a limited maximum frame size and such) and RDO is a way to adjust the encoded stream by deciding how much you want to trade bits for quality in each particular case.

For example, if you want to decide which kind of macroblock to encode (intra or one of several kinds of inter), for each candidate you calculate how much the coded blocks differ from the original ones (that’s distortion) and add the cost of coding those blocks (aka rate) multiplied by lambda (our weight parameter that tells how much to prefer rate over distortion or vice versa), then you pick the candidate with the smallest sum. So you want to increase bitrate? Decrease lambda so fidelity matters more. You want to decrease frame size? Increase lambda so bits are more important. From the mathematical point of view the problem is solved; from the implementation point of view that’s where the actual problems start.
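To make the formula concrete, here is a minimal sketch of that decision in Rust. It is not code from NihAV: the `Candidate` structure, the function names and the choice of a floating-point lambda are all assumptions made for illustration, and real distortion and bit estimates would come from the encoder itself.

```rust
/// Hypothetical description of one way to code a macroblock.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Candidate {
    /// Label for the macroblock type (illustrative only).
    name: &'static str,
    /// How much the coded blocks differ from the original (e.g. sum of squared differences).
    distortion: u32,
    /// Estimated number of bits needed to code the blocks (the rate).
    bits: u32,
}

/// Rate-distortion cost: distortion plus rate weighted by lambda.
fn rd_cost(cand: &Candidate, lambda: f32) -> f32 {
    cand.distortion as f32 + lambda * cand.bits as f32
}

/// Returns the candidate with the smallest cost, or `None` for an empty list.
fn pick_best(cands: &[Candidate], lambda: f32) -> Option<Candidate> {
    cands
        .iter()
        .copied()
        .min_by(|a, b| rd_cost(a, lambda).total_cmp(&rd_cost(b, lambda)))
}
```

With made-up candidates of (distortion 100, 400 bits) for intra, (300, 120) for inter and (150, 260) for inter with 4MV, a lambda of 0.1 selects intra, 1.0 selects 4MV and 4.0 selects plain inter: the larger lambda gets, the more the cheap-to-code option wins.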

VP6 encoder: done!

September 29th, 2021

Today I’ve finished work on my VP6 encoder for NihAV and it seems to work as expected (which means poorly but what else to expect from a failure). Unfortunately, even if the encoder is complete from my point of view, there are still some things to do: write a couple of posts on rate control/RDO and the overall design of my encoder, and make it more useful for the people brave enough to use it in e.g. Red Alert game series modding. That means adding some input format support useful for the encoder (I’ve hacked Y4M input support but if there’s a request for a lossless codec in AVI, I can implement that too) and writing a page describing how to use nihav-encoder to encode content in VP6 format (AVI only, maybe I’ll add FLV later as a joke but FLV decoding support should come first).

And now I’d like to talk about what features my encoder has and why it lacks in some areas.

First, what it has:

  • all macroblock types are supported (including 4MV and those referencing golden frame);
  • custom models updated per frame;
  • Huffman encoding mode;
  • proper quarterpel motion estimation;
  • extremely simple golden frame selection;
  • sophisticated macroblock type selection process;
  • rudimentary rate control.

In other words, it can encode a stream using all but a couple of the format’s features, and with varying quality as well.

And what it doesn’t have:

  • interlacing! It should not be that hard to add but my principles say no to supporting it at all (except in some decoders where it can’t be avoided);
  • alpha support—it’s rather easy to add but there’s little use for it;
  • custom scan order—it’s not likely to give a significant gain while it’s quite hairy to implement properly (it’s not that complex per se but it’ll need a lot of debugging to get it right because of its internal representation);
  • advanced profile features like bicubic interpolation filters and selecting parameters for them (again, too much work, too little fun);
  • context-dependent macroblock size approximations (i.e. calculate expected size using the information about already selected preceding macroblocks instead of fixed guesstimates);
  • better macroblock and frame size approximations in general (more about it in the upcoming post);
  • better golden frame selection (I don’t even know what would be a good condition for that);
  • dynamic intra frame selection (i.e. code a frame as I-frame where it’s appropriate instead of each N-th frame);
  • proper rate control (this should be discussed in the upcoming post).

This is an example of a progressive approach to development (in the same sense as progressive JPEG coding): first you implement a rough approximation of what you want to have and keep on expanding and improving various features until some arbitrary limit is reached. A lot of the features that I’ve not implemented properly need a lot of time (and sometimes significant domain-specific knowledge) for a proper implementation, so I simply stopped where it was either working well enough or it would not be fun to continue.


So, with the next couple of posts on the details not yet covered (RDO+rate control and overall design) the journey should be complete. Remember, it’s the best open-source VP6 encoder (for the lack of competition) and since I’ve managed to make something resembling an encoder, maybe you can write something even better?

How to perform fast motion search

September 25th, 2021

To answer the obvious question with the obvious answer, brute-force searching for a decent motion vector takes an insanely long time. For example, the VP6 motion search area can be up to 63×63 pixels and checking all possible positions there requires a lot of tries. And if you remember that VP6 has quarterpel motion compensation precision, you should multiply that number by 16 possible sub-pixel positions. Obviously, various tricks are employed in order to reduce the number of tries.
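To put a number on it: if every integer offset in a 63×63 area counts as a candidate, that is 63 × 63 × 16 = 63504 positions per macroblock. The sketch below shows what the brute-force approach looks like for integer positions only. It is an illustration rather than NihAV code: the function names are made up, and the use of the sum of absolute differences (SAD) as the metric is an assumption.

```rust
/// Number of positions a brute-force search has to try (illustrative helper).
fn candidate_count(area_side: u32, subpel_positions: u32) -> u32 {
    area_side * area_side * subpel_positions
}

/// Sum of absolute differences between the `size`x`size` block of `cur` at (cx, cy)
/// and the block of `reference` at (rx, ry). Both planes use `stride` bytes per row.
fn sad(cur: &[u8], reference: &[u8], stride: usize,
       cx: usize, cy: usize, rx: usize, ry: usize, size: usize) -> u32 {
    let mut sum = 0;
    for y in 0..size {
        for x in 0..size {
            let a = cur[(cy + y) * stride + cx + x];
            let b = reference[(ry + y) * stride + rx + x];
            sum += u32::from(a.abs_diff(b));
        }
    }
    sum
}

/// Exhaustive integer-pel search in the range [-range, range] around the block.
/// Returns (dx, dy, sad) of the best match; the zero vector wins ties against
/// anything tried later because only strictly better candidates replace it.
fn full_search(cur: &[u8], reference: &[u8], width: usize, height: usize,
               bx: usize, by: usize, size: usize, range: i32) -> (i32, i32, u32) {
    let mut best = (0, 0, sad(cur, reference, width, bx, by, bx, by, size));
    for dy in -range..=range {
        for dx in -range..=range {
            let rx = bx as i32 + dx;
            let ry = by as i32 + dy;
            // Skip candidates that would read outside the reference plane.
            if rx < 0 || ry < 0
                || rx as usize + size > width
                || ry as usize + size > height {
                continue;
            }
            let cost = sad(cur, reference, width, bx, by, rx as usize, ry as usize, size);
            if cost < best.2 {
                best = (dx, dy, cost);
            }
        }
    }
    best
}
```

Every candidate costs a full block comparison, so under these assumptions a 16×16 macroblock needs 63504 × 256, or roughly 16 million, pixel differences, and that is before counting the interpolation needed for the sub-pixel positions.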

While the fast motion search methods I describe here are not that complex by themselves, it was rather hard to locate books where such details of developing video encoders are presented. At last I’ve found two or three books with chapters dedicated to motion compensation, plus the papers referenced there. The results of this mini-research are given below.

VP6 — interframe encoder done, what’s next?

September 23rd, 2021

I’ve finally finished implementing the rest of the features required for interframes: motion estimation, previous or golden frame selection (along with the golden frame itself) and four motion vectors per macroblock are all supported now. How I implemented fast motion search deserves a separate post that I hope to write at the weekend; the rest of the things should be in this post.

VP6 — simple interframe encoder done

September 18th, 2021

As I said in the previous post detailing the roadmap, there’s a lot to do for an interframe encoder. Now I have the basics implemented but there’s a lot more to do.

VP6 — interframe encoder roadmap

September 11th, 2021

Before I start working on it, I’d like to summarise the things that should be done for interframe encoding.

VP6 — simple intraframe encoder, part 2

September 10th, 2021

At last I have a working intraframe VP6 encoder. And the encoded data is decoded fine by the reference decoder as well as by open-source ones. So here I’ll describe what I had to do in order to achieve that result.

VP6 — simple intraframe encoder, part 1

September 5th, 2021

I admit that I haven’t spent much time on writing the encoder but I still have some progress to report.