At last I have a working intraframe VP6 encoder. And the encoded data is decoded fine by the reference decoder as well as by open-source ones. So here I’ll describe what I had to do in order to achieve that result.
First, I had to fix some bugs, mostly related to overflow corner-case handling and to multistream interpretation in the case of the simple profile. This made my encoder produce a conformant but not optimal stream.
Then I moved on to optimal model generation. It is done by trying to encode the block data in the same way as you will do it later, but instead of encoding bools you simply count them (and skip all the operations that would write data with a fixed probability). After that you convert those counts into probabilities, transmit the models and encode the rest of the data using them.
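Since NihAV is written in Rust, here is a minimal sketch of that counting pass, with a hypothetical `BoolCounter` standing in for the real bool encoder (the names and details are my assumptions, not the actual NihAV code):

```rust
/// Hypothetical per-context counter for the dry-run pass: wherever the
/// real pass would call `encode_bool(bit, prob)`, the counting pass calls
/// `count(bit)` instead (writes with fixed probabilities are simply skipped).
#[derive(Default, Clone, Copy)]
struct BoolCounter {
    zeroes: u32,
    ones:   u32,
}

impl BoolCounter {
    fn count(&mut self, bit: bool) {
        if bit { self.ones += 1; } else { self.zeroes += 1; }
    }

    /// Convert the counts into the probability of zero (1..=255),
    /// the representation the VP6 bool coder works with.
    fn to_prob(self) -> u8 {
        let total = u64::from(self.zeroes) + u64::from(self.ones);
        if total == 0 {
            return 128; // nothing was coded in this context, keep it neutral
        }
        let prob = (u64::from(self.zeroes) * 256 + total / 2) / total;
        prob.clamp(1, 255) as u8
    }
}
```

The first pass runs the normal block-coding logic with such counters in place of the bool encoder; the second pass then encodes for real using the derived models.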
My test sample encoded using the default models takes 239846 bytes; the same file encoded with models adapted for each frame takes 282308 bytes. Quite a surprise, isn’t it? Of course it happens because the overhead of storing the model probabilities is larger than the savings from using them. The solution is obvious: don’t store the probabilities that do not lead to savings (in VP5 and later each probability is updated individually, except for the calculated ones of course). Simply compare the amount of bits it takes to encode the data using the old and the new probability, add the 7 bits needed to code the new probability itself, and that’s it. Calculating the amount of bits is trivial too:

bits = -log2(zero_prob / 256) * num_zeroes - log2(1 - zero_prob / 256) * num_ones

(I simply took a pre-calculated table for all 256 possible probabilities with 1/8th of a bit precision that I used in my earlier experiments; the amount of encoded bits is stored in the counters used to determine the probability.) With that the output file shrank to 218830 bytes, a noticeable gain over the original 239846 bytes.
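In code the decision may look like this sketch, assuming a precomputed table `bits[p] ≈ -8 * log2(p / 256)` in 1/8-bit units as described above (the function names are hypothetical):

```rust
/// Cost of coding the new probability itself: 7 bits, in 1/8-bit units.
const PROB_UPD_COST: u64 = 7 * 8;

/// Cost of coding the observed zeroes and ones with probability-of-zero
/// `prob` (1..=255), using the precomputed 1/8-bit cost table:
/// `bits[p]` per zero, `bits[256 - p]` per one.
fn bool_cost(zeroes: u32, ones: u32, prob: u8, bits: &[u32; 256]) -> u64 {
    u64::from(zeroes) * u64::from(bits[usize::from(prob)])
        + u64::from(ones) * u64::from(bits[256 - usize::from(prob)])
}

/// Returns `Some(new_prob)` only when sending the update actually saves bits.
fn maybe_update(zeroes: u32, ones: u32, old_prob: u8, new_prob: u8,
                bits: &[u32; 256]) -> Option<u8> {
    let keep = bool_cost(zeroes, ones, old_prob, bits);
    let send = bool_cost(zeroes, ones, new_prob, bits) + PROB_UPD_COST;
    if send < keep { Some(new_prob) } else { None }
}
```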
And finally I made it reconstruct the encoded frame. Currently it is useless but it will be used for encoding inter frames. This step is trivial so I don’t have to describe it.
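Still, for reference, a minimal sketch of what reconstructing an intra block amounts to (a hypothetical helper, not the actual NihAV code; the VP6 inverse transform itself is out of scope here, so it is passed in as a closure):

```rust
/// Hypothetical reconstruction of one 8x8 intra block: dequantise the
/// coefficients, run the inverse transform and clamp to the pixel range.
fn reconstruct_block(
    coeffs: &[i16; 64],
    qmat:   &[i16; 64],
    idct:   impl Fn(&[i32; 64]) -> [i32; 64],
    dst:    &mut [u8],
    stride: usize,
) {
    let mut deq = [0i32; 64];
    for i in 0..64 {
        deq[i] = i32::from(coeffs[i]) * i32::from(qmat[i]);
    }
    let pix = idct(&deq);
    for (y, row) in pix.chunks(8).enumerate() {
        for (x, &p) in row.iter().enumerate() {
            dst[y * stride + x] = p.clamp(0, 255) as u8; // clamp to 0..255
        }
    }
}
```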
What’s still left to do? Quantiser selection (I intend to deal with it later, during the experiments with RDO and rate control) and custom scans (which I probably shan’t touch at all). Maybe also Huffman coding for the coefficient data, but that’s rather easy and not particularly interesting.
Anyway, any idiot can write an intra-only encoder; the main challenge lies in efficient inter-frame encoding, and I’m moving on to that.
Will you implement famous B frames?
If the authors didn’t bother with that then why should I?
I’ll try to do something about golden frame selection though.