NihAV — Progress Report

Obviously it moves very slowly: I spend most of my time on work, sleep, cooking and travelling around. Plus it was too hot to think or do anything productive.

Anyway, I’ve completed the IMC/IAC decoder for NihAV. In case you’ve forgotten or didn’t care to find out at all, the names stand for Intel Music Coder and Indeo Audio software, with IAC being a slightly upgraded version of IMC that allows stereo and has tables calculated for every supported sample rate instead of a single set precalculated for 22kHz. And despite what you might think, it is a rather complex audio codec that took the route of D*lby AC-3, G.722.1/RealAudio Cooker and CELT: parametric bit allocation codecs. It’s the kind of audio codec that I dislike less than speech codecs but more than the rest, because such codecs have a large and complex function that calculates how many bits/values should be spent on each individual coefficient or subband. In the IMC/IAC case it gets even worse since the codec uses floating point numbers, so the results are somewhat unstable between implementations and platforms (a bit more on that later). Oh, and this codec has I- and P-frames, since some blocks are coded as independent and others are coded using information from the previous block.

Rust does not have much to do with C so you cannot simply copy-paste code and expect it to work, and it’s against the principles of the project anyway. Side note: the only annoying Rust feature so far is array initialisation. I’d like to be able to fill an array in a loop before using it without initialising the array contents to some default value (which I can’t do for some types) or resorting to mem::uninitialized() and ptr::write(). Anyway, I had to implement my own version of the code, so it’s structured a bit differently, has different names, uses a bitstream reader in MSB16LE mode instead of block swapping, and decodes most files I could test without errors, unlike libavcodec. So it’s NIH all the way!
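To illustrate what an MSB16LE bitstream reader means in practice, here is a minimal sketch. It is not the NihAV reader: the type name `BitReader16LE` and its methods are hypothetical. The only assumption taken from the text is the mode itself: the input is consumed as 16-bit little-endian words and bits are taken from each word starting with the most significant one, which gives the same result as byte-swapping the whole block and then using an ordinary MSB-first reader.

```rust
/// Hypothetical minimal bit reader (not the NihAV one): the input is a
/// sequence of 16-bit little-endian words, bits are taken MSB first.
struct BitReader16LE<'a> {
    src: &'a [u8],
    pos: usize, // index of the next unread byte
    cache: u32, // unread bits live in the low `bits` bits
    bits: u8,   // number of valid bits in `cache`
}

impl<'a> BitReader16LE<'a> {
    fn new(src: &'a [u8]) -> Self {
        BitReader16LE { src, pos: 0, cache: 0, bits: 0 }
    }

    /// Append the next 16-bit little-endian word to the cache.
    fn refill(&mut self) -> Option<()> {
        if self.pos + 2 > self.src.len() {
            return None;
        }
        let word = u16::from_le_bytes([self.src[self.pos], self.src[self.pos + 1]]);
        self.pos += 2;
        self.cache = (self.cache << 16) | u32::from(word);
        self.bits += 16;
        Some(())
    }

    /// Read `n` bits (0..=16); returns None when the input runs out.
    fn read(&mut self, n: u8) -> Option<u32> {
        if n > 16 {
            return None;
        }
        if n == 0 {
            return Some(0);
        }
        if self.bits < n {
            self.refill()?;
        }
        self.bits -= n;
        // Stale bits above the valid ones are removed by the mask.
        Some((self.cache >> self.bits) & ((1u32 << n) - 1))
    }
}
```

With the bytes `34 12 CD AB` the reader sees the words 0x1234 and 0xABCD, so reading 4, 8, 8 and 12 bits yields 0x1, 0x23, 0x4A and 0xBCD.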

I wasted time mostly on validating my code against the binary specifications, so this version actually decodes most files as intended while libavcodec fails to do that. To describe the problem briefly, it all comes from the same place. The codec first produces bit allocation for all bits still available, then determines how to read flags for skipping coefficients in some bands, reads those flags and adjusts bit allocation for the number of bits freed by this operation. The problem is that bit allocation may go wrong, and as a result the skip flags take more bits than the coefficients that would be coded otherwise; the decoder would then fail to adjust bit allocation for that case (it’s not supposed to do that in the specification) and will read more bits than the block contains. Of the thirty-something IMC and IAC in AVI samples only one fails now for me, because in bit allocation the wrong band gets selected for coefficient length decreasing. And the reason is a difference in the fourth or fifth digit after the decimal point in one array of constants, which makes the wrong value the minimum (and thus selected for coefficient length decreasing). Since it takes several minutes with gdb+mplayer2 to get information at this point (at about the 10-second position in 14-second audio) I decided not to dig further.
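The sensitivity of the minimum search can be shown with a toy example. This is only a sketch with made-up numbers, not values or code from IMC/IAC: the function `index_of_min` and both tables are hypothetical. It shows how a change in the fifth digit after the decimal point in one constant moves the minimum to a different index.

```rust
/// Index of the smallest value; ties resolve to the first occurrence.
/// Hypothetical helper for illustration, not taken from any decoder.
fn index_of_min(vals: &[f32]) -> Option<usize> {
    let mut best = None;
    let mut best_val = f32::INFINITY;
    for (i, &v) in vals.iter().enumerate() {
        if best.is_none() || v < best_val {
            best = Some(i);
            best_val = v;
        }
    }
    best
}

/// Two made-up tables that differ only in the fifth decimal of entry 0.
const TABLE_A: [f32; 3] = [0.70712, 0.70711, 0.9];
const TABLE_B: [f32; 3] = [0.70710, 0.70711, 0.9];
```

`index_of_min(&TABLE_A)` picks entry 1 while `index_of_min(&TABLE_B)` picks entry 0, so a decoder using slightly different constants would modify a different band from that point on.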

Also I had to write other pieces of code, like a split-radix FFT, a byte writer and a WAV dumper that accepts audio packets and writes them with the provided ByteWriter.
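As a rough idea of what the header-writing part of a WAV dumper involves, here is a minimal sketch. It does not use the NihAV ByteWriter: the standard `std::io::Write` trait stands in for it, the function name `write_wav_header` is hypothetical, and plain integer PCM output is an assumption.

```rust
use std::io::{self, Write};

/// Write a canonical 44-byte PCM WAV header (hypothetical helper).
/// `data_len` is the size of the sample data in bytes.
fn write_wav_header<W: Write>(
    w: &mut W,
    channels: u16,
    sample_rate: u32,
    bits_per_sample: u16,
    data_len: u32,
) -> io::Result<()> {
    let block_align = channels * (bits_per_sample / 8); // bytes per sample frame
    let byte_rate = sample_rate * u32::from(block_align);

    w.write_all(b"RIFF")?;
    w.write_all(&(36 + data_len).to_le_bytes())?; // file size minus the first 8 bytes
    w.write_all(b"WAVE")?;
    w.write_all(b"fmt ")?;
    w.write_all(&16u32.to_le_bytes())?; // size of the fmt chunk body
    w.write_all(&1u16.to_le_bytes())?; // format tag 1 = integer PCM
    w.write_all(&channels.to_le_bytes())?;
    w.write_all(&sample_rate.to_le_bytes())?;
    w.write_all(&byte_rate.to_le_bytes())?;
    w.write_all(&block_align.to_le_bytes())?;
    w.write_all(&bits_per_sample.to_le_bytes())?;
    w.write_all(b"data")?;
    w.write_all(&data_len.to_le_bytes())?;
    Ok(())
}
```

A real dumper usually does not know `data_len` in advance, so it would write a placeholder first and patch the two size fields after the last packet; that part is left out here.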

P.S. Nanobenchmarks ahoy: decoding the longest IMC stream that I had (a bit more than two minutes) takes 0.124s with avconv and 0.09s with nihav-tool. Actual decoding functions take about the same time, though the Rust implementation is still faster by a couple of percent and my FFT implementation is slower (but on the other hoof it’s called for every frame since it decodes that file without errors).

P.P.S. So next is Indeo 4/5 with all the wonderful features like scalable decoding, B-frames and transparency (that reminds me that Libav and ScummVM had a competition who would be the last to implement proper transparency support for Indeo 4, now they both might win). And then I’d probably go back to implementing the features I wanted: being able to tell the demuxer to discard certain streams instead of demuxing them, better stream reporting from the demuxer, seeking and decoder reset, frame reordering functionality, maybe WAV support too. And then maybe back to decoders. I want to have several codec families fully implemented, like RAD (Smacker, Bink and Bink2), Duck/On2 (TM1, TM-RT, TM2, TM2X, TM VP3, VP4, VP5, AVC, VP6 and VP7) and RealMedia (again). But I’m not in a hurry.

P.P.P.S. I’m not going to publish all source code but bits of it may be either posted when relevant or leaked to rust-av; its developer(s) has(have) shown some interest, so enquire there.

9 Responses to “NihAV — Progress Report”

  1. Luca Barbato says:

    The rust-av developers are keen to port/integrate all of this.

    My plan to write an avi demuxer while learning nom got killed by some emergency at work plus yesterday’s way too humid day.

  2. Paul says:

    Show the decoder source code!

  3. Kostya says:

    Just wait when it treacles down to rust-av, there’s nothing interesting in the code to show it.

  4. Paul says:

    But it doesn’t have the bugs that the lavc implementation has.

  5. Kostya says:

    Is that a bad thing?

    If somebody wants to get rid of bugs in libavcodec they can do the same as I did – compare against the reference decoder. And, obviously, if I wanted to fix libavcodec/imc.c I’d have sent a patch.

  6. Paul says:

    But you already did the hard work…

  7. Kostya says:

    Not hard, just boring. And don’t mind if anybody else does it.

    Here’s GDB script I used for mplayer2 + binary loader for (imc requires a different driver) – It should dump a lot of information for the played IAC file if you dare to run it. I made my decoder dump similar information and compared it and that’s how I found most errors like e.g. the amount of free bits passed to bit allocation was different, the skip flags sometimes differed etc etc. Happy hacking!

    P.S. Giving my new source code would be counterproductive. People need to learn to do such stuff themselves.

  8. Paul says:

    Nobody else does it, so tell me lines in lavc that are buggy.

  9. Kostya says:

    Sigh, I actually had to rewrite some things after the specification so I don’t know what errors are there.
    Here are two that I still remember:
    In imc_refine_bit_allocation() there is if (((band_tab[i+1] - band_tab[i]) * 1.5 > sumLenArr[i]) && ... — it seems the reference uses (int)((band_tab[i+1] - band_tab[i]) * 1.5);
    and in imc_decode_block() the call to bit_allocation() passes the wrong argument sometimes. The reference calculates it in a completely different way so there’s no direct correspondence to what libavcodec does. It assumes the code has a fixed header with different size depending on stream_format_code (like 25 or 43) and then adds the number of bits read starting for levels and some internal flag values.
    There might be more.
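The first of the two differences above is easy to reproduce in isolation. The following is a sketch only: the function names are hypothetical, the band width and sum are made-up inputs, and it merely contrasts a floating-point comparison with one where the product is truncated to an integer first, as the reference is said to do.

```rust
/// Comparison done on the floating-point product, as in the quoted
/// libavcodec condition. Hypothetical stand-alone version.
fn exceeds_float(band_width: i32, sum_len: i32) -> bool {
    (band_width as f32) * 1.5 > sum_len as f32
}

/// Same comparison with the product truncated to an integer first,
/// i.e. the (int)(... * 1.5) form. Hypothetical stand-alone version.
fn exceeds_truncated(band_width: i32, sum_len: i32) -> bool {
    ((band_width as f32) * 1.5) as i32 > sum_len
}
```

For a band that is 3 coefficients wide with a length sum of 4, the first form compares 4.5 > 4 and is true, while the second compares 4 > 4 and is false. The two forms only disagree for odd widths where the sum equals the truncated product, which would explain why such a bug shows up in some files and not in others.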