I’ve mentioned previously that I played with my H.264 decoder, trying to make it multi-threaded. Now I went a bit further and plugged it into my video player. Instead of hopelessly lagging on 720p video it can now play it in real time just fine. So after improving my player even further (and enabling assembly optimisations when the Rust compiler is good enough for that) I can use it to play most of the videos I care about without resorting to external decoders or players. And in theory using it more will lead to fixing and polishing it more, thus forming a stable loop.
Anyway, the code is not public yet as I hacked this new decoder together in a separate crate and I still need to merge it back and clean it up a bit, but I’d like to describe the interfaces and my reasons behind them.
So, the multi-threaded decoder has a separate interface (for obvious reasons). I thought about writing a wrapper for single-threaded decoders to behave like multi-threaded ones but decided against it (at least for now). NADecoderMT has the following methods:
init(): initialises the decoder. One of the parameters is the number of threads to use. IMO it’s the caller that decides how many threads it can spare, as the decoder does not know what will be done in parallel (maybe another multi-threaded decoder or two are running);
can_take_input(): queries if the decoder is ready to queue the next frame for decoding. Of course you can call queue_pkt() and check if it accepted the input, but that may not always be desired (e.g. if we need to retrieve an input packet and then hold it, waiting until the decoder is ready to accept it);
queue_pkt(): tries to queue the next frame for decoding;
has_output(): checks if the decoder has produced some frames for output. Since get_frame() waits for a frame to be decoded, this function is necessary unless you want to block the thread calling the decoder;
get_frame(): waits until at least one frame is decoded and returns it (or a special error if there are no frames to be decoded);
flush(): stops decoding all frames and clears the state (e.g. after a seek).
Another peculiarity of this decoder interface is that it operates on pairs of a frame and its sequential number. The reason is simple: you get decoded frames out of order so you need to distinguish them somehow (and in case of a decoding error we need to know which frame caused it).
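To make the shape of this interface more concrete, here is a minimal sketch of what such a trait and a caller-side loop could look like. It is not the actual NihAV code (which is not public yet): the trait name DecoderMT, the Packet and Frame types, the DecodeError enum, the exact signatures and the toy decoder (it merely reverses the packet bytes, one worker thread per packet) are all assumptions made for illustration. Only the method names and their described behaviour come from the text above.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical stand-ins for real packet and frame types.
type Packet = Vec<u8>;
type Frame = Vec<u8>;

#[derive(Debug, PartialEq)]
enum DecodeError {
    NoFrame,     // the "special error": nothing is queued, nothing to wait for
    InvalidData, // an ordinary decoding failure
}

// Every result travels together with the ID it was queued under.
type Output = (Result<Frame, DecodeError>, u32);

trait DecoderMT {
    fn init(&mut self, nthreads: usize);
    fn can_take_input(&mut self) -> bool;
    fn queue_pkt(&mut self, pkt: Packet, id: u32) -> bool;
    fn has_output(&mut self) -> bool;
    fn get_frame(&mut self) -> Output;
    fn flush(&mut self);
}

// Toy implementation (assumption): one short-lived thread per queued packet.
struct ToyDecoder {
    nthreads: usize,
    in_flight: usize, // queued packets whose result has not been received yet
    tx: Sender<Output>,
    rx: Receiver<Output>,
    ready: VecDeque<Output>,
}

impl ToyDecoder {
    fn new() -> Self {
        let (tx, rx) = channel();
        ToyDecoder { nthreads: 1, in_flight: 0, tx, rx, ready: VecDeque::new() }
    }
}

impl DecoderMT for ToyDecoder {
    fn init(&mut self, nthreads: usize) { self.nthreads = nthreads.max(1); }
    fn can_take_input(&mut self) -> bool { self.in_flight < self.nthreads }
    fn queue_pkt(&mut self, pkt: Packet, id: u32) -> bool {
        if !self.can_take_input() { return false; }
        let tx = self.tx.clone();
        self.in_flight += 1;
        thread::spawn(move || {
            // "Decoding" just reverses the bytes; an empty packet is an error.
            let res = if pkt.is_empty() {
                Err(DecodeError::InvalidData)
            } else {
                Ok(pkt.into_iter().rev().collect())
            };
            let _ = tx.send((res, id)); // the receiver may be gone after flush()
        });
        true
    }
    fn has_output(&mut self) -> bool {
        // Collect finished results without blocking.
        while let Ok(out) = self.rx.try_recv() {
            self.in_flight -= 1;
            self.ready.push_back(out);
        }
        !self.ready.is_empty()
    }
    fn get_frame(&mut self) -> Output {
        if let Some(out) = self.ready.pop_front() { return out; }
        if self.in_flight == 0 { return (Err(DecodeError::NoFrame), 0); }
        self.in_flight -= 1;
        self.rx.recv().expect("the decoder keeps one sender alive")
    }
    fn flush(&mut self) {
        // A fresh channel means results of abandoned work are silently dropped.
        *self = ToyDecoder { nthreads: self.nthreads, ..ToyDecoder::new() };
    }
}

// Caller-side loop: feed packets, collect (result, ID) pairs as they arrive.
fn decode_all(dec: &mut dyn DecoderMT, packets: Vec<Packet>) -> Vec<Output> {
    let mut out = Vec::new();
    for (id, pkt) in packets.into_iter().enumerate() {
        // Block on output only when the decoder cannot accept more input.
        while !dec.can_take_input() { out.push(dec.get_frame()); }
        dec.queue_pkt(pkt, id as u32);
        // Pick up whatever has finished in the meantime without blocking.
        while dec.has_output() { out.push(dec.get_frame()); }
    }
    loop {
        match dec.get_frame() {
            (Err(DecodeError::NoFrame), _) => return out,
            res => out.push(res),
        }
    }
}
```

A caller would create the decoder, call init() with however many threads it can spare and then pass it to decode_all(). The results come back in completion order, not in queue order, so the ID is the only way to tell which packet a frame (or an error) belongs to.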
This also leads to a special frame reorder mechanism for such codecs. I’ve created MTFrameReorderer that requires you to “register” a frame for decoding (providing you with an ID that is fed to the decoder along with the frame data) and to “unregister” the frame on error (that’s one of the places where the returned frame ID comes in handy). Unfortunately it’s not possible to create a generic reorderer that would a) work in a completely codec-agnostic way, b) not require a whole file (or an indefinitely long sequence of frames) to be buffered before output and c) produce a monotonically increasing sequence of frames. Considering how H.264 has no real concept of frames and can build a pyramid of referenced frames adding layer by layer (and mind you, some frames may have an error during decoding and thus not be present in the output), I simply gave up and made a heuristic that checks if we have enough initial frames decoded and outputs some of them if it’s possible. At least it seems to work rather fine on the conformance suite (except for a couple of specially crafted files but oh well).
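The register/unregister idea can be illustrated with a small sketch. This is not the real MTFrameReorderer: the Reorderer type, its method names, the pts field and the fixed-depth rule (release the earliest buffered frame once `depth` frames are waiting, or once nothing is pending any more) are assumptions picked to keep the example short. The actual heuristic is only described loosely above.

```rust
use std::collections::BTreeSet;

// Hypothetical decoded frame: only the presentation timestamp matters here.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    pts: u64,
}

struct Reorderer {
    next_id: u32,
    pending: BTreeSet<u32>, // registered IDs that have not come back yet
    ready: Vec<Frame>,      // decoded frames, kept sorted by pts
    depth: usize,           // how many frames to hold back (assumed heuristic)
}

impl Reorderer {
    fn new(depth: usize) -> Self {
        Reorderer { next_id: 0, pending: BTreeSet::new(), ready: Vec::new(), depth }
    }

    // Hand out an ID that is queued to the decoder along with the packet.
    fn register_frame(&mut self) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id);
        id
    }

    // Decoding failed: stop waiting for this ID so it cannot stall the output.
    fn unregister_frame(&mut self, id: u32) {
        self.pending.remove(&id);
    }

    // Decoding succeeded: store the frame at its place in display order.
    fn add_frame(&mut self, frame: Frame, id: u32) {
        self.pending.remove(&id);
        let pos = self.ready.partition_point(|f| f.pts <= frame.pts);
        self.ready.insert(pos, frame);
    }

    // Release the earliest frame if enough frames are buffered or if no
    // registered frame is still being decoded.
    fn get_frame(&mut self) -> Option<Frame> {
        if self.ready.is_empty() {
            return None;
        }
        if self.ready.len() >= self.depth || self.pending.is_empty() {
            Some(self.ready.remove(0))
        } else {
            None
        }
    }
}
```

With a depth of 3 and four registered frames, where the third one fails and gets unregistered, the remaining frames come out sorted by timestamp once the last one arrives. Without the unregister call the reorderer would keep waiting for a frame that never comes. A rule this simple cannot guarantee monotonically increasing output in general, which is exactly the limitation described above.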
Maybe in the future I’ll try more multi-threaded decoders but for now even one decoder is enough, especially such a practical one. Still, I need to find something more interesting to do.