Today I’ve finished work on my VP6 encoder for NihAV and it seems to work as expected (which means poorly but what else to expect from a failure). Unfortunately even if the encoder is complete from my point of view, there are still some things to do: write a couple of posts on rate control/RDO and the overall design of my encoder, and make it more useful for people brave enough to use it in e.g. Red Alert game series modding. That means adding some input format support useful for the encoder (I’ve hacked in Y4M input support but if there’s a request for a lossless codec in AVI, I can implement that too) and writing a page describing how to use nihav-encoder to encode content in VP6 format (AVI only, maybe I’ll add FLV later as a joke but FLV decoding support should come first).
And now I’d like to talk about what features my encoder has and why it is lacking in some areas.
First, what it has:
- all macroblock types are supported (including 4MV and those referencing golden frame);
- custom models updated per frame;
- Huffman encoding mode;
- proper quarterpel motion estimation;
- extremely simple golden frame selection;
- sophisticated macroblock type selection process;
- rudimentary rate control.
In other words, it can encode a stream that uses all but a couple of the format’s features, and with varying quality as well.
And what it doesn’t have:
- interlacing! It should not be that hard to add but my principles say no to supporting it at all (except in some decoders where it can’t be avoided);
- alpha support: it’s rather easy to add but there’s little use for it;
- custom scan order—it’s not likely to give a significant gain while it’s quite hairy to implement properly (it’s not that complex per se but it’ll need a lot of debugging to get it right because of its internal representation);
- advanced profile features like bicubic interpolation filters and selecting parameters for them (again, too much work, too little fun);
- context-dependent macroblock size approximations (i.e. calculating the expected size using the information about already selected preceding macroblocks instead of fixed guesstimates);
- better macroblock and frame size approximations in general (more about it in the upcoming post);
- better golden frame selection (I don’t even know what would be a good condition for that);
- dynamic intra frame selection (i.e. code a frame as I-frame where it’s appropriate instead of each N-th frame);
- proper rate control (this should be discussed in the upcoming post).
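To make the "dynamic intra frame selection" item a bit more concrete, here is a minimal sketch contrasting the fixed "each N-th frame" policy with one possible dynamic policy. This is not code from NihAV: all names (`FrameKind`, `fixed_interval`, `mean_abs_diff`, `dynamic`) are made up for illustration, and using the mean absolute luma difference against the previous frame with a fixed threshold is just an assumed, very crude scene change detector, not a claim about what a good criterion would be.

```rust
/// Decision for how to code one frame (hypothetical type, not from NihAV).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FrameKind {
    Intra,
    Inter,
}

/// Fixed policy: every `interval`-th frame is coded as intra.
/// An interval of zero is treated as "intra only".
pub fn fixed_interval(frame_no: usize, interval: usize) -> FrameKind {
    if interval == 0 || frame_no % interval == 0 {
        FrameKind::Intra
    } else {
        FrameKind::Inter
    }
}

/// Mean absolute difference between two luma planes of the same size.
pub fn mean_abs_diff(cur: &[u8], prev: &[u8]) -> u32 {
    assert_eq!(cur.len(), prev.len());
    if cur.is_empty() {
        return 0;
    }
    let sum: u64 = cur
        .iter()
        .zip(prev.iter())
        .map(|(&a, &b)| u64::from(a.abs_diff(b)))
        .sum();
    (sum / cur.len() as u64) as u32
}

/// Assumed dynamic policy: force an intra frame when the picture changed
/// a lot (likely scene cut) or when too many inter frames have passed.
pub fn dynamic(
    frames_since_intra: usize,
    max_interval: usize,
    mad: u32,
    threshold: u32,
) -> FrameKind {
    if frames_since_intra >= max_interval || mad > threshold {
        FrameKind::Intra
    } else {
        FrameKind::Inter
    }
}
```

In such a scheme the encoder would compute `mean_abs_diff` for each incoming frame and feed it to `dynamic` together with the count of frames since the last intra frame; picking a sensible `threshold` is exactly the kind of tuning that takes the time and domain knowledge mentioned below.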
The whole thing is an example of the progressive approach to development (in the same sense as progressive JPEG coding): first you implement a rough approximation of what you want to have and then keep on expanding and improving various features until some arbitrary limit is reached. A lot of the features that I’ve not implemented properly need a lot of time (and sometimes significant domain-specific knowledge) for a proper implementation, so I simply stopped where it was either working well enough or it would not be fun to continue.
So, with the next couple of posts on the details not yet covered (RDO+rate control and overall design) the journey should be complete. Remember, it’s the best open-source VP6 encoder (for the lack of competition) and since I’ve managed to make something resembling an encoder, maybe you can write something even better?