Another reason for NihAV

So instead of doing something productive like adding missing functionality bits and writing documentation I wasted my time on adding some QuickTime decoders. And while wasting time on adding SVQ1, SVQ3, QDMC and QDM2 decoders it became apparent why NihAV is a good thing to exist.

Implementing two of them was not a very big deal but implementing SVQ3 and QDM2 decoders took more than a week each because there are only two specifications available for them and both are equally hard to comprehend: the first one is the official binary specification, the second one is source code in libavcodec which is derived from the former.

The problem arises when somebody wants to understand how it works and/or reimplement the code and both SVQ3 and QDM2 decoder demonstrate two different aspects of that problem.

SVQ3 decoder is based on some draft of H.264 (or ex-MPEG/AVC if you’re from Piedmont) with certain extensions mostly related to motion compensation. Documentation for it was scarce and because of optimisations and integration with common H.264 decoder bits it’s hard to understand some of the things. One of those is intra prediction with two modes having SVQ3-specific hacks hidden in libavcodec/h264pred.c (those are 16×16 plane prediction mode giving transposed result and 4×4 diagonal down prediction being simplified and not relating on pixels not immediately top/left from the block) and another one is block coefficients decoding function. It took me quite a while to realize that it actually decodes three different kinds of blocks: single 4×4 block with zigzag scan, 4×4 block divided into two parts with interlaced scan, and 2×2 block. I’ve documented most of that in The Wiki (before that nobody has touched that page for almost ten years; sometimes I feel like I’m the only person contributing there).

QDM2 is horrible in different way. It is slightly improved translation of the original binary specification with hardly any idea how it works (there are still names like local_int_8 in the code). Don’t get me wrong, back in 2003-2005 when reverse engineering was done the only tools you had were debugger, disassembler (you’re lucky if it’s not the one provided by debugger) and no decompilers at all (IIRC rec appeared much later and was of limited usefulness, especially on multi-megabyte QT monolith—and that’s assuming you’re not doing that on Mac with even less tools available). I did some of such work back then as well so I understand how hard it is and how you’re happy that it works somehow and you can ship it and forget about it.

Another thing is that now it’s clear that QDMC and QDM2 are predecessors of DT$ LBR (aka Express) and use the same principles (QDMC simply coded noise and tones, QDM2 is almost like LBR but without some features like LPC or multichannel audio and with different chunk structure), but back in the day there was no documentation on LBR (or LBR itself for that matter).

But the main problem is that nobody has tried to understand the code since. It became a so-called category killer i.e. its existence prevents others from doing something similar. At least until some idiot tried to do another implementation in NihAV.

And here we have the reason for NihAV to exist: it advances the understanding of codecs for me (and I document results in The Wiki) resulting in different implementations that are (hopefully) easier to understand and sometimes even fix long-standing bugs. I hope this shall convince you that sometimes it’s good to have reimplementation of the decoder even if an existing implementation is good enough (as far as I remember the only time a decoder was rewritten in FFmpeg was when a reverse-engineered Indeo 3 decoder that crashed on damaged content almost every time was replaced with a reverse-engineered Indeo 3 decoder where a guy had the idea how it works).

But back to QDM2: while my decoder is not finished yet and I probably won’t bother with inter-frames in it (I’ve never seen any samples with those), it still decodes sweeps much better. That’s mostly because of the various bugs I’ve uncovered (also while discovering that Ghidra effectively does not allow to edit about a megabyte large decoder context). Since I have no incentive to produce a patch and people who created the decoder are long gone from the project, here are some spotted bugs: wrong coarse quantiser band selection (resulting in noise generated in wrong frequency range), reading bits past the chunk end (because is some cases checks are missing), ignoring group 4 tones because of the wrong conditions, some initial variables are set in the wrong way too. Nevertheless it mostly works and it was very useful for mapping the functions in the binary specification (fun fact: QDM2 decoder is located in QuickTime.qts while QDMC is located in QuickTimeInternetExtras.qtx).

2 Responses to “Another reason for NihAV”

  1. Thanks for your continued contributions to the Wiki. As long as you keep contributing, I’ll keep it up and running. 🙂

  2. Kostya says:

    The world would be much worse without you running it. From my side I can only remain thankful for it existing and add another piece of information time from time (like I’ve added QDM2 description today).

Leave a Reply