Since I don’t have anything to do with NihAV at the moment (besides two major tasks that always make me think about doing anything else but them), I decided to look at what tricks the original Cinepak encoder had.
Apparently it has essentially three settings: the interval between key frames (with maximum and minimum values), temporal/spatial quality (for deciding which kinds of coding should be used) and the neighbour radius (probably for merging close-enough values before the actual codebook is calculated).
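In other words, the interface boils down to something like this (a rough sketch; the field names are mine, only the meaning follows the description above):

```c
/* A sketch of the knobs the original encoder seems to expose;
 * the names are made up, not taken from the actual code. */
typedef struct CinepakEncParams {
    int key_int_min;       /* minimum interval between key frames */
    int key_int_max;       /* maximum interval between key frames */
    int temporal_quality;  /* converted into the skip-block threshold */
    int spatial_quality;   /* converted into the V1-vs-V4 threshold */
    int neighbour_radius;  /* merge vectors closer than this before
                            * the codebook is calculated */
} CinepakEncParams;
```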
Skip blocks are decided by the sum of squared differences being smaller than a threshold (calculated from the temporal quality); V1/V4 coding is decided by calculating the sum of 2×2 sub-block variances and comparing it against a threshold (calculated from the spatial quality).
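Roughly like this (a simplified sketch working on 4×4 luma blocks only; the real encoder operates on full YUV vectors, and the thresholds are assumed to be derived from the quality settings):

```c
#include <stdint.h>

/* Sum of squared differences against the previous frame -> skip decision. */
static int block_is_skip(const uint8_t *cur, const uint8_t *prev,
                         int stride, int skip_threshold)
{
    int ssd = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++) {
            int d = cur[y * stride + x] - prev[y * stride + x];
            ssd += d * d;
        }
    return ssd < skip_threshold;
}

/* Scaled variance of one 2x2 sub-block: n*sum(x^2) - (sum(x))^2,
 * i.e. 16 times the variance for four samples. */
static int subblock_variance(const uint8_t *p, int stride)
{
    int sum   = p[0] + p[1] + p[stride] + p[stride + 1];
    int sqsum = p[0] * p[0] + p[1] * p[1] +
                p[stride] * p[stride] + p[stride + 1] * p[stride + 1];
    return 4 * sqsum - sum * sum;
}

/* Sum of the four 2x2 sub-block variances -> V1 vs V4 decision. */
static int block_is_v1(const uint8_t *cur, int stride, int v1_threshold)
{
    int var = subblock_variance(cur,                  stride)
            + subblock_variance(cur + 2,              stride)
            + subblock_variance(cur + 2 * stride,     stride)
            + subblock_variance(cur + 2 * stride + 2, stride);
    return var < v1_threshold;
}
```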
Codebook creation is done by grouping all blocks into five bins (by the logarithm of the variance) and trying to calculate a smaller codebook for each bin independently (so together they’ll make up the full 256-entry codebook).
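Something along these lines, I guess (the bin boundaries and the way the 256 entries are split between the bins are my guesses, only the general scheme follows the description):

```c
#include <math.h>

#define NUM_BINS 5

/* Pick a bin from the block variance; the divisor is made up. */
static int variance_bin(double variance)
{
    int bin = (int)(log2(1.0 + variance) / 3.0);
    return bin >= NUM_BINS ? NUM_BINS - 1 : bin;
}

/* Split the 256 codebook entries between bins, here simply in
 * proportion to how many blocks landed in each bin. */
static void assign_codebook_sizes(const int bin_count[NUM_BINS],
                                  int bin_cb_size[NUM_BINS], int total_blocks)
{
    int assigned = 0;
    for (int i = 0; i < NUM_BINS; i++) {
        bin_cb_size[i] = total_blocks ? 256 * bin_count[i] / total_blocks : 0;
        assigned += bin_cb_size[i];
    }
    /* hand any rounding leftovers to the last non-empty bin */
    for (int i = NUM_BINS - 1; i >= 0 && assigned < 256; i--)
        if (bin_count[i]) {
            bin_cb_size[i] += 256 - assigned;
            break;
        }
}
```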
Overall, even if I’m not going to copy that approach, it was still interesting to look at.
Merging close vectors before codebook computation sounds like a useful feature for speeding up cinepakenc.c. The ELBG implementation might need to be adjusted so that such merged vectors are not “discriminated against”.
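Something like this is what I’d imagine (a naive quadratic sketch; the six-component layout matches 2×2 YUV vectors, everything else is made up), with each surviving vector carrying a weight so the adjusted ELBG can account for how many originals it replaces:

```c
#include <stdint.h>
#include <string.h>

#define VEC_DIM 6   /* 4 Y + 1 U + 1 V per 2x2 vector */

typedef struct {
    uint8_t v[VEC_DIM];
    int     weight;   /* how many original vectors were merged into this one */
} MergedVec;

static int sqdist(const uint8_t *a, const uint8_t *b)
{
    int d = 0;
    for (int i = 0; i < VEC_DIM; i++)
        d += (a[i] - b[i]) * (a[i] - b[i]);
    return d;
}

/* Returns the number of surviving vectors (quadratic, just to show the idea). */
static int merge_vectors(const uint8_t *src, int count, int radius_sq,
                         MergedVec *dst)
{
    int kept = 0;
    for (int i = 0; i < count; i++) {
        const uint8_t *vec = src + i * VEC_DIM;
        int j;
        for (j = 0; j < kept; j++)
            if (sqdist(vec, dst[j].v) <= radius_sq) {
                dst[j].weight++;          /* close enough: fold into survivor */
                break;
            }
        if (j == kept) {                  /* nothing close: keep as new entry */
            memcpy(dst[kept].v, vec, VEC_DIM);
            dst[kept].weight = 1;
            kept++;
        }
    }
    return kept;
}
```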
Using quality instead of lambda is also an idea that has struck me, and if it is used for skip block selection then codebook computation becomes even faster, since those vectors never enter ELBG at all.
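I.e. the training set would be built only from the blocks that survive the skip test, something like this (names and layout are mine):

```c
#include <stdint.h>
#include <string.h>

#define VEC_DIM 6   /* 4 Y + 1 U + 1 V per 2x2 vector */

/* Sketch: given per-block SSD against the previous frame and the vectors
 * extracted from each block, copy into the ELBG training set only the
 * vectors of blocks that fail the skip test. */
static int collect_training_vectors(const int *block_ssd, int num_blocks,
                                    const uint8_t *block_vecs, int vecs_per_block,
                                    int skip_threshold, uint8_t *training)
{
    int n = 0;
    for (int b = 0; b < num_blocks; b++) {
        if (block_ssd[b] < skip_threshold)
            continue;                     /* skipped block: contributes nothing */
        memcpy(training + (size_t)n * VEC_DIM,
               block_vecs + (size_t)b * vecs_per_block * VEC_DIM,
               (size_t)vecs_per_block * VEC_DIM);
        n += vecs_per_block;
    }
    return n;                             /* number of vectors handed to ELBG */
}
```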
I suspect codebook computation becomes much faster when you partition it as well, as the complexity is far from linear or even n·log n (each training pass costs roughly one distance computation per vector per codebook entry, so five independent bins of ~51 entries over disjoint vector subsets cut that factor by about five). Actually I might want to try it myself (but later).
And while we’re talking about skipping entries for the codebook creation: low variance means a block is better coded as V1 anyway, which also makes the V4 search smaller. I suspect you can convert lambda into skip and V1/V4 thresholds, even if not in the same way the original does (coding costs are fixed after all).
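For example, with the usual D + lambda·R score and fixed inter-frame block costs, the thresholds fall out directly (the bit counts below are my rough approximations of the Cinepak bitstream, not taken from the original encoder):

```c
typedef struct {
    int skip_threshold;   /* skip wins if its SSD is below this             */
    int v1_threshold;     /* V1 wins over V4 if the extra SSD is below this */
} Thresholds;

static Thresholds thresholds_from_lambda(int lambda)
{
    Thresholds t;
    /* Skip vs coded: a skipped block costs ~1 bit, a V1 block ~2 + 8 bits,
     * so skip saves roughly 9 bits; treating the coded error as zero turns
     * the comparison into a fixed SSD threshold. */
    t.skip_threshold = lambda * 9;
    /* V1 vs V4: V1 saves ~24 index bits, so it wins while its extra error
     * stays below lambda * 24. */
    t.v1_threshold   = lambda * 24;
    return t;
}
```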
I suspect the error can be estimated using variance (between vectors) and codebook size, which would allow deciding immediately what mode to use. Sometimes the scores for skip blocks are so low that V1 and V4 can both be ruled out (even assuming v1_error = v4_error = 0). With good estimates for the errors, yet more skip blocks can be decided on immediately.
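As a sketch (the error model here is the generic high-rate quantisation rule of thumb, distortion ~ variance · K^(-2/dim), which is my assumption and not anything from the original encoder):

```c
#include <math.h>

#define VEC_DIM 6

/* Rough coding error estimate from vector variance and codebook size. */
static double estimate_coding_error(double variance, int codebook_size)
{
    return variance * pow((double)codebook_size, -2.0 / VEC_DIM);
}

/* Early skip decision: with est_coded_error = 0 this is the pessimistic
 * "v1_error = v4_error = 0" case; a better estimate rules out even more
 * blocks without any codebook search. */
static int skip_wins(double skip_ssd, double est_coded_error,
                     double lambda, double skip_bits, double coded_bits)
{
    return skip_ssd + lambda * skip_bits <=
           est_coded_error + lambda * coded_bits;
}
```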
I suspect that converting lambda to a target variance and then computing codebook sizes from some power of the ratio between total variance and target variance would allow for much quicker encoding. Gathering statistics on this would be a good first step.
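Inverting the same rule of thumb gives the codebook size directly from the variance ratio (the lambda-to-target-distortion mapping and the exponent are assumptions of mine, to be checked against real statistics):

```c
#include <math.h>

/* Sketch: K ~ (variance / target)^(dim/2), clamped to Cinepak's 1..256. */
static int codebook_size_from_lambda(double mean_variance, double lambda)
{
    double target = lambda > 0.0 ? lambda : 1.0;  /* assumed per-vector target distortion */
    double k = pow(mean_variance / target, 3.0);  /* dim/2 = 3 for 6-component vectors */
    if (k < 1.0)   k = 1.0;
    if (k > 256.0) k = 256.0;
    return (int)k;
}
```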
Well, selecting the best mode for all blocks (considering how they can affect the codebooks as well) is an NP-hard task, so you just trade accuracy for speed by using heuristics.
Anyway, the encoder is old and probably of historical interest only. If you manage to get some inspiration from it, that’s good enough.