Archive for May, 2019

Bink-b: Encoder

Friday, May 31st, 2019

Recently I’ve been contacted by some guy working on a mod campaign for Heroes of Might and Magic III. The question was about the encoder for videos there. And since the original one is not likely to exist, I just wrote a simple one that takes a PGMYUV image sequence and encodes it. Here’s the gzipped source.

It took a couple of evenings to do that mostly because I still have weak symptoms of creeping perfectionism (thankfully it’s treated with my laziness). BIKb does not have Huffman-coded bundles, so the simplest straightforward encoding would be: write block type bundle (13-bit size and 4-bit elements), write empty other bundles, write several bundles containing pixels and you’re done. There’s a proper approach: write a full-featured encoder that takes input in several formats and that encodes using all possible features selecting the best quality for the target bitrate. There’s a hacky approach—translate later versions of Bink into BIKb (and then you remember that it has different motion compensation scheme so this approach won’t work). I’ve chosen something simple yet with some effectiveness: write an encoder that employs only vector quantisation and motion compensation for non-overlapped blocks plus add a quality setting so users can play with output size/quality if they really need it.
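The "simplest straightforward encoding" idea can be illustrated with a small bit writer. This is only a sketch: the 13-bit size and 4-bit elements come from the description above, while the LSB-first bit order, the byte-wise flushing and the names `BitWriter` and `write_bundle` are assumptions made for the example and not taken from the actual encoder.

```rust
/// Hypothetical LSB-first bit writer (the bit order is an assumption).
struct BitWriter {
    bytes: Vec<u8>,
    acc: u64,   // bits not yet flushed, lowest bit is the oldest
    nbits: u32, // number of valid bits in `acc` (always < 8 between calls)
}

impl BitWriter {
    fn new() -> Self {
        Self { bytes: Vec::new(), acc: 0, nbits: 0 }
    }

    /// Appends the lowest `bits` bits of `val` (at most 32 bits per call).
    fn write(&mut self, val: u32, bits: u32) {
        assert!(bits <= 32 && (bits == 32 || val >> bits == 0));
        self.acc |= u64::from(val) << self.nbits;
        self.nbits += bits;
        while self.nbits >= 8 {
            self.bytes.push(self.acc as u8);
            self.acc >>= 8;
            self.nbits -= 8;
        }
    }

    /// Flushes the remaining bits, padding the last byte with zeroes.
    fn finish(mut self) -> Vec<u8> {
        if self.nbits > 0 {
            self.bytes.push(self.acc as u8);
        }
        self.bytes
    }
}

/// Writes a bundle as a 13-bit element count followed by fixed-width elements.
fn write_bundle(bw: &mut BitWriter, elems: &[u8], elem_bits: u32) {
    assert!(elems.len() < (1 << 13));
    bw.write(elems.len() as u32, 13);
    for &e in elems {
        bw.write(u32::from(e), elem_bits);
    }
}
```

With this, an empty bundle is just thirteen zero bits, and a block type bundle is `write_bundle(&mut bw, &types, 4)`.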

So how does the block encoding work? Block truncation coding, the fast and good way to quantise a block into two colours (many video codecs back in the day used it and only some dared to use vector quantisation for more than two different values per block). Essentially you just calculate the average pixel value and select two values depending on how many pixels in the block are larger than the average and by how much they deviate. And here’s where the quality parameter comes into play: depending on it the encoder sets the threshold above which a block is coded as is (aka full mode) instead of as two colours and the pattern in which they occur (of course if it’s a solid-colour fill it’s always coded as such). As I said, it’s simple but quite effective. Motion compensation is currently lossless, i.e. the encoder will try to find only a block that matches exactly (again, it can be improved but that would only lead to longer implementation times and even longer debugging times). This makes me appreciate the work on Smacker and Bink 1 video codecs and encoders for them even more.
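The decision described above can be sketched roughly like this. It is not the code from the encoder: the 8x8 block size, the squared-error metric and the names `BlockCode` and `encode_block` are assumptions, and the two colours are taken as the means of the "above average" and "not above average" groups (the variant usually called absolute moment BTC) instead of the classic formula based on the deviation.

```rust
/// Hypothetical result of coding one 8x8 block of a single plane.
#[derive(Debug, PartialEq)]
enum BlockCode {
    /// All pixels have the same value.
    Fill(u8),
    /// Two colours plus a pattern: bit `i` set means pixel `i` uses `hi`.
    TwoColour { lo: u8, hi: u8, mask: u64 },
    /// Block stored as is ("full mode").
    Raw([u8; 64]),
}

/// `max_err` plays the role of the quality-dependent threshold.
fn encode_block(pix: &[u8; 64], max_err: u32) -> BlockCode {
    if pix.iter().all(|&p| p == pix[0]) {
        return BlockCode::Fill(pix[0]);
    }
    let sum: u32 = pix.iter().map(|&p| u32::from(p)).sum();
    let avg = sum / 64;

    // Split pixels into two groups around the average and build the pattern.
    // For a non-solid block both groups are guaranteed to be non-empty.
    let (mut lo_sum, mut lo_cnt, mut hi_sum, mut hi_cnt) = (0u32, 0u32, 0u32, 0u32);
    let mut mask = 0u64;
    for (i, &p) in pix.iter().enumerate() {
        if u32::from(p) > avg {
            hi_sum += u32::from(p);
            hi_cnt += 1;
            mask |= 1u64 << i;
        } else {
            lo_sum += u32::from(p);
            lo_cnt += 1;
        }
    }
    let lo = (lo_sum / lo_cnt) as u8;
    let hi = (hi_sum / hi_cnt) as u8;

    // Sum of squared differences between the source and its two-colour version.
    let err: u32 = pix
        .iter()
        .enumerate()
        .map(|(i, &p)| {
            let q = if ((mask >> i) & 1) != 0 { hi } else { lo };
            let d = i32::from(p) - i32::from(q);
            (d * d) as u32
        })
        .sum();

    if err > max_err {
        BlockCode::Raw(*pix)
    } else {
        BlockCode::TwoColour { lo, hi, mask }
    }
}
```

A low `max_err` pushes more blocks into full mode (bigger output, better quality), a high one turns almost everything into two-colour patterns.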

Overall, it was a nice diversion from implementing Duck decoders for NihAV but I should probably return to it. The sooner it’s done the sooner I can move to something more exciting like finally experimenting with vector quantisation, or trying to write a player, or something else entirely. I avoid making plans but there are many possibilities at hoof so I just need to pick one.

VP3-VP6: the Golden (Frame) Age of Duck Codecs

Friday, May 24th, 2019

Dedicated to Peter Ross, who wrote an opensource VP4 decoder (that is not committed to CEmpeg yet at the time of writing).

The codecs from VP3 to VP6 form a single codec family that is united not merely by the design but even by the header: every frame in this codec (sub)family has the same header format. And the leaked VP6 format specification still calls the version field there Vp3VersionNo (versions 0-2 are used by VP3, 3 is used by VP4, 5 is for VP5 and 6-8 are for VP6). VP7 changed both the coding principles (to mimic H.264) and the header format. And you can call it the golden age for Duck because it spans from gaining popularity with VP3 donated to the open-source community (and xiphed to Theora, which was the only patent-free(ish) opensource video codec with decent performance back then) to its greatest success found in VP6, employed both in games and in Flash video (remember when BaidUTube still used .flv with VP6 and N*llyMos*r ADPCM or Speex?). And today, having gathered enough material, I want to give an overview of these codecs. Oh, and NihAV can decode VP30 and VP31 now.
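The version mapping mentioned above fits into a few lines. The enum and function names are made up for illustration; only the version ranges come from the text, and anything not listed there (like version 4) is treated as unknown.

```rust
/// Hypothetical codec tag derived from the shared header's version field.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DuckCodec {
    VP3,
    VP4,
    VP5,
    VP6,
}

/// Maps the Vp3VersionNo value to a codec using the ranges quoted in the text.
fn codec_from_version(version: u8) -> Option<DuckCodec> {
    match version {
        0..=2 => Some(DuckCodec::VP3),
        3 => Some(DuckCodec::VP4),
        5 => Some(DuckCodec::VP5),
        6..=8 => Some(DuckCodec::VP6),
        _ => None, // not mentioned in the text, so not guessed here
    }
}
```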

NihAV: rust-clippy experience

Saturday, May 18th, 2019

As I’ve mentioned in the previous post, I’ve finally tried rust-clippy to see what issues and suggestions it will have on my code. The results are not disappointing if you take the tool name seriously.

NihAV: after clean-up

Friday, May 17th, 2019

Since the clean-up work on NihAV is done and I progress with Truemotion VP3 decoder, it’s a good time to talk about what I’ve actually done—there’s even more material to write waiting in the queue.

The intent was to make all frame-related stuff thread-safe and improve efficiency a bit. In order to do the former I had to replace most of the references from Rc<RefCell<T>> to Arc<T> and while doing it I introduced aliases like type NAFrameRef = Arc<NAFrame> and .into_ref() methods to convert object into ref-counted version. This helped when I tried switching from one implementation of reference counter to another and will make it easy to switch again if I ever need that (hopefully not). Now about improved efficiency and how it’s related to the ref-counting.
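A minimal sketch of the alias-plus-conversion pattern, assuming a heavily simplified `NAFrame` (the single `pts` field and the helper function are invented for the example, the real structure is of course larger):

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for the real frame structure; the field is hypothetical.
#[derive(Debug)]
struct NAFrame {
    pts: u64,
}

/// Alias hiding the actual reference-counting implementation.
type NAFrameRef = Arc<NAFrame>;

impl NAFrame {
    /// Converts the object into its ref-counted version.
    fn into_ref(self) -> NAFrameRef {
        Arc::new(self)
    }
}

/// Arc<T> can cross thread boundaries, which Rc<RefCell<T>> cannot.
fn pts_on_other_thread(frame: NAFrameRef) -> u64 {
    thread::spawn(move || frame.pts).join().unwrap()
}
```

Since the rest of the code only sees `NAFrameRef` and `.into_ref()`, swapping the reference counter for another implementation means touching just the alias and the method body.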

There’s a straightforward way of dealing with frames: you allocate the picture, fill it, dispose of it, allocate a new one, etc etc. And there’s a more effective way: you allocate several pictures at once, select an unused one, fill it, and return it to the pool when it’s not needed any more. That is where reference counting comes into play and where Rust default structures don’t help. The frame pool owns the reference and the decoder gets a second copy. And Rust’s Arc only hands out mutable access when there is a single owner: Arc::get_mut() returns None for a shared object and Arc::make_mut() simply clones it, so you end up working with a copy (which defeats the purpose). So I had to NIH my own NABufferRef<T> which keeps reference counts and still allows shared access even for writing (currently it does that in all cases but if I need to add some guards the API won’t have to be changed for that). The implementation is very simple: the structure contains a raw pointer to a structure that contains the actual object and an AtomicUsize counter. The whole implementation is ~2.2kB relying just on the std crate.
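Here is a rough sketch of both halves of that story: the copy you get from a shared Arc, and a bare-bones counter-plus-raw-pointer handle. `SharedBuf` and its methods are hypothetical names, this is not the actual NABufferRef<T> code, and the Send/Sync implementations are left out. The write accessor is marked `unsafe` in this sketch because nothing stops two handles from writing at the same time.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};
use std::sync::Arc;

/// Writing through a shared Arc ends up in a private copy.
fn arc_write_goes_to_a_copy() -> (u8, u8) {
    let mut a = Arc::new([0u8; 4]);
    let b = Arc::clone(&a);
    Arc::make_mut(&mut a)[0] = 1; // clones the array because `b` shares it
    (a[0], b[0])
}

struct Inner<T> {
    data: T,
    refs: AtomicUsize,
}

/// Hypothetical ref-counted handle that allows writes while shared.
pub struct SharedBuf<T> {
    ptr: *mut Inner<T>,
}

impl<T> SharedBuf<T> {
    pub fn new(data: T) -> Self {
        let inner = Box::new(Inner { data, refs: AtomicUsize::new(1) });
        Self { ptr: Box::into_raw(inner) }
    }

    pub fn ref_count(&self) -> usize {
        unsafe { (*self.ptr).refs.load(Ordering::Acquire) }
    }

    pub fn get(&self) -> &T {
        unsafe { &(*self.ptr).data }
    }

    /// # Safety
    /// The caller must make sure no other handle reads or writes the data
    /// while the returned reference is alive.
    pub unsafe fn get_mut(&mut self) -> &mut T {
        unsafe { &mut (*self.ptr).data }
    }
}

impl<T> Clone for SharedBuf<T> {
    fn clone(&self) -> Self {
        unsafe { (*self.ptr).refs.fetch_add(1, Ordering::Relaxed) };
        Self { ptr: self.ptr }
    }
}

impl<T> Drop for SharedBuf<T> {
    fn drop(&mut self) {
        unsafe {
            // The last handle to go away frees the allocation.
            if (*self.ptr).refs.fetch_sub(1, Ordering::Release) == 1 {
                fence(Ordering::Acquire);
                drop(Box::from_raw(self.ptr));
            }
        }
    }
}
```

With this, a write made through one handle is visible through the other one, which is exactly what a pool and a decoder sharing a picture need.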

And finally I’ve made a picture pool. The difference between picture and frame is all the additional metadata a picture should not care about (like timestamps, stream information and such). Because of the design decisions I have three different picture formats (implemented for 8-, 16- and 32-bit element sizes, Rust does not like aliasing after all), which means I need to provide the decoder with all three picture pools because we can’t say in advance which one the codec will use (if at all, since the option to allocate new non-pooled pictures is still there). Also I want to keep those pools external in case the code around them wants to keep more pictures in them (e.g. 2-3 pictures required by the decoder and 25 pictures pre-buffered for the display). This resulted in a structure called NADecoderSupport that contains picture pools and may have something else added later. Of course people might argue that it’s much better to have AVCodecContext with a myriad of fields you can set directly or via utility functions but I’d rather not have one single structure. Though it might be a good place to put various decoder options (so that the decoder can ignore them at its leisure).
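The pool idea can be sketched with standard types. Everything here is an assumption for illustration: `PicturePool` and `DecoderSupport` are invented names (the real structure is NADecoderSupport and uses its own buffer type), pictures are plain vectors behind `Arc<Mutex<..>>`, and "free" simply means that the pool holds the only reference. The check-then-clone in `get_free` is not atomic, so the sketch assumes the pool itself is used from one thread.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical pooled picture: plain sample data behind a shared handle.
type PicRef<T> = Arc<Mutex<Vec<T>>>;

struct PicturePool<T> {
    pics: Vec<PicRef<T>>,
}

impl<T: Clone + Default> PicturePool<T> {
    /// Allocates `count` pictures of `size` elements each up front.
    fn new(count: usize, size: usize) -> Self {
        let pics = (0..count)
            .map(|_| Arc::new(Mutex::new(vec![T::default(); size])))
            .collect();
        Self { pics }
    }

    /// Returns a picture nobody else holds, or None if all are in use.
    fn get_free(&self) -> Option<PicRef<T>> {
        self.pics
            .iter()
            .find(|&p| Arc::strong_count(p) == 1)
            .cloned()
    }
}

/// One pool per element size, handed to the decoder from the outside.
struct DecoderSupport {
    pool_u8: PicturePool<u8>,
    pool_u16: PicturePool<u16>,
    pool_u32: PicturePool<u32>,
}
```

A picture goes back to the pool automatically when the last outside handle is dropped, and the caller can fall back to a fresh allocation whenever `get_free` returns None.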

Since I said I did it to increase efficiency I should probably give some numbers too: RealVideo 3/4/6 decoders now use buffer pool (for three frames obviously) and reallocate it on format change. Decoding time got reduced by 4-5% from using the pool. Currently I don’t care about speed much but I may convert more decoders to it if the need arises.

In conclusion I want to say that even though I did not enjoy doing that work much, it was needed and gave me some experience plus some improvements in code and design. So it was not a wasted effort.

P.S. I also installed rust-clippy since it’s in stable now and tried to fix errors and warnings it reported. But that is a story for another post.

Zähringerstädte

Monday, May 6th, 2019

Today I want to talk about a local dynasty that was rather short-lived but left quite an impressive legacy.