As I mentioned in the introductory post, there are nine block coding modes and my encoder tries them all to see which is good (or good enough) to be used. What I have not mentioned is that some of those blocks have sixteen variations (quantisers for DCT-based blocks and scan patterns for RLE blocks), which makes the search even longer.
First of all, I use a rather obvious approach to trying blocks: order them in a sequence, motion block types first, then the simple ones (fill, two-colour pattern, RLE), with intra DCT and raw being the last. If the block metric is low enough then the block is good enough and the rest of the modes need not be tried.
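The ordered search with an early exit can be sketched roughly like this (a minimal Python sketch; the mode names, the `try_mode` callback and the threshold are all hypothetical, not the encoder's actual interface):

```python
# Illustrative mode order: cheap motion modes first, raw last.
MODE_ORDER = ["motion_skip", "motion", "fill", "pattern", "rle", "intra_dct", "raw"]

def pick_block_mode(block, good_enough, try_mode):
    """Try coding modes in order; stop as soon as one scores below the
    'good enough' threshold so the costlier modes are never evaluated."""
    best_mode, best_cost = None, float("inf")
    for mode in MODE_ORDER:
        cost = try_mode(mode, block)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if best_cost <= good_enough:
            break  # good enough -- skip the remaining modes
    return best_mode, best_cost
```

The ordering matters: since the cheap-to-code modes come first, an early exit tends to pick them whenever they are acceptable.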
Second, there are two encoding modes: quality-based and bitrate-based. For the bitrate-based mode I simply manipulate the lambda ratio between block distortion and bits and that’s it. For the quality mode I actually collected statistics on different files at different quality settings to see which block types are used more or less. For example, on the highest quality setting intra blocks are not used at all, while on low quality settings you don’t see lossless residue or raw blocks.
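For the bitrate-based mode, the cost being compared between block modes is the usual Lagrangian rate-distortion sum (a textbook formulation, not the encoder's exact code):

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost: raising lambda makes bits more
    expensive (favouring cheap blocks), lowering it favours quality."""
    return distortion + lam * bits
```

Steering the bitrate then amounts to adjusting `lam` until the output lands near the target size.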
So I simply used the option to disable different block coding modes (introduced to make debugging other coding modes easier) and modified the list depending on quality setting.
Then I went even further and observed the statistics of the DCT block quantisers used depending on the quality settings. As one could reasonably expect, low quality settings resulted in quantisers 12–15 (and rarely 11) while high quality settings used quantisers 0–10 the most. So limiting the quantisers depending on quality was the next logical step.
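Such a limit might look like the following sketch; the quality scale and the exact cut-off points are my illustrative guesses, only the two end ranges come from the statistics above:

```python
def quantiser_range(quality):
    """Map a quality setting (hypothetical 1-100 scale) to the DCT
    quantiser indices worth trying (0 = finest, 15 = coarsest)."""
    if quality >= 80:
        return range(0, 11)   # high quality: quantisers 0-10 dominate
    if quality >= 40:
        return range(6, 14)   # middle ground (made-up span)
    return range(11, 16)      # low quality: 12-15, rarely 11
```

Even a rough split like this cuts the per-block search from sixteen quantisers to about five or ten.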
And here are some other tricks:
- on lower quality levels I use a non-zero threshold for RLE blocks so that e.g. the sequence 4, 3, 5, 4 will be treated as a run of fours;
- in the previous version of the encoder I used Block Truncation Coding (probably the only practical application of it, even if a bit unwise); now I’m simply calculating the averages of the values above/below the block mean value;
- in the rate-distortion metric I scale the bits value; doing otherwise often leads to the essentially free skip block being selected almost every time, and blocky moving pictures are better than a slightly less blocky still one.
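The first two tricks can be illustrated with a small Python sketch (hypothetical function names; the real encoder obviously works on pixel blocks, not Python lists):

```python
def rle_runs(values, threshold):
    """Lossy run detection: a value within +/-threshold of the current
    run's first value is treated as equal to it and extends the run."""
    runs = []  # list of [value, length] pairs
    for v in values:
        if runs and abs(v - runs[-1][0]) <= threshold:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def above_below_means(block):
    """The BTC replacement: average the values above and below (or equal
    to) the block mean to get the two representative levels."""
    mean = sum(block) / len(block)
    above = [v for v in block if v > mean]
    below = [v for v in block if v <= mean]
    avg = lambda xs: sum(xs) / len(xs) if xs else mean
    return avg(below), avg(above)
```

With `threshold=1`, `rle_runs([4, 3, 5, 4], 1)` indeed collapses the sequence into a single run of fours, matching the example above.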
Of course it is easy to come up with more but this has been enough for me.
I can confirm that the “official” Bink (1) encoder is very brute-force too and pretty much just tries everything, which is why it’s quite slow. 🙂
Bink 2’s structure is inherently less “combinatorial” so the encoder is a lot closer to the more typical guided search (with a decision tree for mode decision, plus heuristics, all standard stuff).
It’s been a very long time since I played with the Bink Video encoder so I cannot tell how I felt about the encoding speed (my main machine back in the day was a laptop with PII-266 inside, almost every encoder felt slow there).
As for Bink 2, I expressed my feelings about it ten years ago: In ten years every codec becomes Op^H^HJPEG. At least when you stop caring about fine details you can greatly reduce the search space (and it’s somewhat ironic that while Bink Video 2 can be compared to H.263 in terms of employed features, Bink Video 1 can be compared to H.265 with Screen Coding extension even).