Archive for the ‘NihAV’ Category

na_eofdec initial release

Monday, May 11th, 2026

Since I got lucky with some formats during the weekend, I now have enough of them to release na_eofdec. This is a tool similar to na_game_tool but oriented at generic exotic and obscure formats (or Amiga ones, put them into whatever category you like). So if you’re familiar with that one (why?!) you should have no troubles with the new one either (or at least they should be the same troubles).

The motivation behind it is about the same as with the other tool: decode whatever formats I find interesting enough to implement decoders for but not interesting enough to have them supported in the main NihAV base. Also it serves as a playground for various other things (like MOV muxer in this case, which served as the base for more versatile muxer in NihAV).

Anyway, it is released in hope (but no expectations and definitely no guarantees) that it will be useful for some purpose for others. The release is available at its own sub-page at nihav.org (and there’s a link to it in the appropriate section of this blog too).

NihAV: QT support enhancements

Friday, May 8th, 2026

When I have enough inspiration, I improve NihAV. When I don’t (which is the more common state for me), I RE codecs or write blog posts, so here’s one.

First of all, I’ve started adding non-raw encoders for some common QT formats. It’s not that there are no open-source encoders for them, but I do them mostly to find out how it is done and maybe learn something new in the process. For instance, RLE encoding combines skips, runs and pixel copies; this raises the question of optimal encoding, as sometimes it may be cheaper to encode a whole area as new pixels instead of a mix of copy+skip+copy. So I’ve implemented a greedy approach (i.e. code the longest skip or run and fall back to encoding raw if those two fail) as well as a slow but optimal one. It’s a variation of trellis coding: for each position calculate the encoding cost with each mode (skip/run/raw) to all possible next positions and, if it’s lower than the cost already recorded there, remember that mode; at the end simply trace back the decisions that gave the least total cost and encode them in the right order.
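
The optimal search described above can be sketched as a small dynamic programme. This is only an illustration under a made-up cost model (a skip costs one byte, a run two, a raw copy one byte plus its pixels) and a hypothetical `MAX_LEN` limit; the real QT RLE opcodes and costs differ, and `Op` and `optimal_ops` are names invented for this sketch.

```rust
/// One coded element of a line; all lengths are in pixels.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Skip(usize),
    Run(usize),
    Raw(usize),
}

/// Hypothetical limit on the length of a single operation.
const MAX_LEN: usize = 127;

/// Returns the cheapest total cost and the operations that achieve it.
fn optimal_ops(prev: &[u8], cur: &[u8]) -> (usize, Vec<Op>) {
    assert_eq!(prev.len(), cur.len());
    let n = cur.len();
    // cost[i] is the cheapest known way to code the first i pixels,
    // best[i] is the last operation of that cheapest way.
    let mut cost = vec![usize::MAX; n + 1];
    let mut best: Vec<Option<Op>> = vec![None; n + 1];
    cost[0] = 0;
    for pos in 0..n {
        // Always finite: any position can be reached with raw copies.
        let base = cost[pos];
        let (mut can_skip, mut can_run) = (true, true);
        for len in 1..=MAX_LEN.min(n - pos) {
            let idx = pos + len - 1;
            can_skip &= cur[idx] == prev[idx];
            can_run &= cur[idx] == cur[pos];
            // Assumed costs in bytes: skip = 1, run = 2, raw = 1 + len.
            let candidates = [
                (can_skip, Op::Skip(len), 1),
                (can_run, Op::Run(len), 2),
                (true, Op::Raw(len), 1 + len),
            ];
            for &(allowed, op, op_cost) in candidates.iter() {
                if allowed && base + op_cost < cost[pos + len] {
                    cost[pos + len] = base + op_cost;
                    best[pos + len] = Some(op);
                }
            }
        }
    }
    // Trace the decisions back from the end, then put them in coding order.
    let mut ops = Vec::new();
    let mut pos = n;
    while pos > 0 {
        let op = best[pos].expect("every position is reachable");
        let len = match op {
            Op::Skip(l) | Op::Run(l) | Op::Raw(l) => l,
        };
        ops.push(op);
        pos -= len;
    }
    ops.reverse();
    (cost[n], ops)
}
```

With this cost model a line like `1 2 3 3 4 5` comes out as a single raw copy, since splitting it into raw+run+raw would cost one byte more.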

Then I also added an RPZA encoder. This is essentially the first texture codec, from before there were GPUs with their need for texture compression; its main compression mode is encoding a 4×4 block with four colours where two colours are linearly interpolated from two explicitly transmitted colours. There is no apparent way to do it fast, so I ended up with an extremely simplified scheme: first I calculate the difference between minimum and maximum for each component and pick the component with the largest difference (or code the block as single-colour if it’s small enough) to decide what values to pick, then I calculate the explicit colours from an average of input pixels close to the minimum and maximum ends of that range. I also have a refinement step that runs a vector quantisation loop to adjust the ends but it’s rarely needed in practice.
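
A rough sketch of that endpoint selection, working on 8-bit RGB for readability. It makes several assumptions of its own: "close to the end" is taken as the nearest quarter of the range, the threshold is a free parameter, and the in-between colours use plain thirds instead of the exact RPZA weights. `BlockChoice`, `pick_ends` and `four_colours` are hypothetical names.

```rust
type Rgb = [u8; 3];

#[derive(Debug, PartialEq)]
enum BlockChoice {
    Single(Rgb),
    TwoEnds { lo: Rgb, hi: Rgb },
}

/// Component-wise average of the given pixels.
fn average<'a, I: Iterator<Item = &'a Rgb>>(pixels: I) -> Rgb {
    let (mut sum, mut count) = ([0u32; 3], 0u32);
    for p in pixels {
        for c in 0..3 {
            sum[c] += u32::from(p[c]);
        }
        count += 1;
    }
    let count = count.max(1);
    [(sum[0] / count) as u8, (sum[1] / count) as u8, (sum[2] / count) as u8]
}

fn pick_ends(block: &[Rgb; 16], flat_threshold: u8) -> BlockChoice {
    // Find the component with the largest spread of values.
    let (mut comp, mut lo_v, mut hi_v) = (0, 0u8, 0u8);
    for c in 0..3 {
        let min = block.iter().map(|p| p[c]).min().unwrap();
        let max = block.iter().map(|p| p[c]).max().unwrap();
        if c == 0 || max - min > hi_v - lo_v {
            comp = c;
            lo_v = min;
            hi_v = max;
        }
    }
    let range = hi_v - lo_v;
    if range <= flat_threshold {
        return BlockChoice::Single(average(block.iter()));
    }
    // Assumption: "close to an end" means within a quarter of the range.
    let margin = range / 4;
    let lo = average(block.iter().filter(|p| p[comp] <= lo_v + margin));
    let hi = average(block.iter().filter(|p| p[comp] >= hi_v - margin));
    BlockChoice::TwoEnds { lo, hi }
}

/// The two explicit colours plus two interpolated ones (plain thirds here).
fn four_colours(lo: Rgb, hi: Rgb) -> [Rgb; 4] {
    let mix = |a: u8, b: u8| ((2 * u16::from(a) + u16::from(b)) / 3) as u8;
    let (mut near_lo, mut near_hi) = ([0u8; 3], [0u8; 3]);
    for c in 0..3 {
        near_lo[c] = mix(lo[c], hi[c]);
        near_hi[c] = mix(hi[c], lo[c]);
    }
    [lo, near_lo, near_hi, hi]
}
```

Each pixel would then get the index of the closest of the four colours; the vector quantisation refinement mentioned above is not shown.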

There are still more encoders to implement (SMC, SVQ1, IMA ADPCM and MACE) but none of them is interesting beside SVQ1, so probably I’ll write about it when/if I ever get to implementing an encoder for it (it does not matter if The Multimedia Mike has done that over fifteen years ago—NIH is there in the project name for a reason).

Now, surprisingly enough I’ve improved decoding support as well. The original QuickTime had SIVQ codec which is a straightforward 256-entry codebook for 2×2 RGB24 tiles followed by codebook indices. I had read its binary specification some time ago and recently I was able to locate (probably the only existing) sample for it, which is a good reason to write a decoder for it. It was well-spent five minutes of my time. Maybe in the future I’ll also do something about Pixar codecs (Ghidra works better with raw m68k version of the decoder than with 16-bit Windows 3.x version of the same).
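
To show how little there is to such a decoder, here is a sketch of the general idea only. The byte layout is assumed rather than taken from the specification: 256 codebook entries of 12 bytes each (four RGB24 pixels, top row first), then one index per 2×2 tile in raster order. `decode_tiles` is a hypothetical name.

```rust
/// Expands codebook indices into a packed RGB24 frame.
/// Returns None if the dimensions are odd or the data is too short.
fn decode_tiles(data: &[u8], width: usize, height: usize) -> Option<Vec<u8>> {
    const TILE_BYTES: usize = 2 * 2 * 3;
    const CODEBOOK_BYTES: usize = 256 * TILE_BYTES;
    if width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    let codebook = data.get(..CODEBOOK_BYTES)?;
    let indices = data.get(CODEBOOK_BYTES..)?;
    let (tiles_w, tiles_h) = (width / 2, height / 2);
    if indices.len() < tiles_w * tiles_h {
        return None;
    }
    let stride = width * 3;
    let mut frame = vec![0u8; stride * height];
    for (n, &idx) in indices[..tiles_w * tiles_h].iter().enumerate() {
        let tile = &codebook[usize::from(idx) * TILE_BYTES..][..TILE_BYTES];
        let (tx, ty) = (n % tiles_w, n / tiles_w);
        // Each tile row is two RGB24 pixels, i.e. six bytes.
        for row in 0..2 {
            let dst = (ty * 2 + row) * stride + tx * 6;
            frame[dst..dst + 6].copy_from_slice(&tile[row * 6..row * 6 + 6]);
        }
    }
    Some(frame)
}
```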

And finally I’ve improved the support for multi-descriptor MOV files. I mentioned it some time ago and I got bitten by it again recently. For example, alice_lo_m.mov from samples.mplayerhq.hu got just first frame decoded for me and many QuickTime 1.5 sample videos (with its developers) gave an error on the last frame. For the former it’s because first frame is JPEG and the rest of them are SVQ1, while the latter samples are coded with Cinepak but the last frame may be a special one encoded with RPZA. And there was another file fully encoded with RPZA—but with the majority of it being 160×120 while last dozen of frames or so were 320×240. So I finally got annoyed enough to implement multiple streams per track so at least the frames get marshalled to the correct decoder, even if it leads to the partial streams being rather unusable. Maybe one day I’ll write a tool which will walk through MOV and render all tracks in correct sequence (taking edit lists into account), scaling and adjusting playback rate as needed, producing a raw MOV file that can be played without special hacks; or maybe I don’t hate myself that much.

That’s it for now, don’t expect anything soon (MVS description may appear but who’s waiting for that?).

NihAV: palettisation

Thursday, April 30th, 2026

While I’ve added palettisation support for NihAV about six years ago, it was limited to per-frame palette generation back then. Since I had only two encoders supporting paletted input and both were accepting palette changes, it worked fine. It’s only when I decided to implement MOV muxer I really got a need for palettisation using global palette, so I’ve started experimenting with that.

First and foremost, design. While frame palettisation is a part of NAScale that handles video frame conversion from and to various formats, this palettisation mode is bolted to nihav-encoder. It is actually implemented in two parts: initial pass that decodes input and generates palette at the end, and actual frame palettisation (which can actually work just fine without that pass if you tell it to use some pre-defined palette). I actually started with the second part and added palette generation later (for tests using default QuickTime palette was enough). Then I went even further and extended palette generation to support segmented mode (i.e. palette may change but not for every frame, I made the limit configurable but it should be at least 10 frames—storing palette for each frame before processing is too much).

This mode is a bit fragile since palettes are calculated for the decoded frames and not for processed frames. And for palette segments it’s even worse since it tells the number of frames for which the palette is valid, so framerate conversion will make it a mess. But the alternative leads to madness (also known as libavfilter) and that’s a hard pass for me. I see filters more as a drug that makes multimedia projects shift attention to them, making it more and more about e.g. filter negotiation and complex graphs support and less about playing actual media; consequently, making everything a filter is a sure sign the project is on its way to becoming obsolete (yes, I hold neither DirectShow nor gstreamer in high regard).

Anyway, after the overall design description it’s time to talk about implementation details. Palette generation for video differs from palette generation for a single image by the sheer amount of data it needs to take into account. So while I do have a “let’s waste memory and have a table of 64-bit counters for each possible colour” mode, my first mode was putting colours into smaller buckets (bucket index is calculated from the top bits of components) and joining similar colours (e.g. differing just by two low bits in each component) when the number of entries gets too high. It is slower and not as accurate but it consumes less memory, and performing vector quantisation on hundreds of thousands of entries is much faster than on millions. So it may have its uses.
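
The "join similar colours when there are too many entries" idea can be illustrated with a simplified counter. This sketch swaps the buckets for a plain `HashMap` and drops one low bit per component each time the limit is exceeded; both choices, as well as the `ColourCounter` name, are assumptions made for the example and not how nihav-encoder does it.

```rust
use std::collections::HashMap;

struct ColourCounter {
    /// How many low bits of each component are currently ignored.
    shift: u32,
    /// Maximum number of distinct entries to keep.
    limit: usize,
    counts: HashMap<u32, u64>,
}

impl ColourCounter {
    fn new(limit: usize) -> Self {
        Self { shift: 0, limit, counts: HashMap::new() }
    }

    fn add(&mut self, rgb: [u8; 3]) {
        // Pack the (possibly coarsened) components into one key.
        let key = rgb
            .iter()
            .fold(0u32, |k, &c| (k << 8) | (u32::from(c) >> self.shift));
        *self.counts.entry(key).or_insert(0) += 1;
        while self.counts.len() > self.limit && self.shift < 7 {
            self.coarsen();
        }
    }

    /// Drops one more low bit per component and merges colliding entries.
    fn coarsen(&mut self) {
        self.shift += 1;
        let mut merged = HashMap::new();
        for (key, count) in self.counts.drain() {
            // Shift every packed component right by one bit; the mask
            // removes the bit that leaked in from the neighbouring byte.
            let new_key = (key >> 1) & 0x7F7F7F;
            *merged.entry(new_key).or_insert(0) += count;
        }
        self.counts = merged;
    }
}
```

The surviving entries and their counts would then be fed to the vector quantiser as weighted input.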

Then palette segments generation. I don’t do anything fancy and simply calculate coarse colour histogram to decide when to start a new segment. For cut-off criterion I selected the ratio between correlated histograms and auto-correlated new frame histogram. If they’re of about the same magnitude then I can add this frame to a group, update group histogram and continue, otherwise I generate palette for the just finished segment and start a new one. It’s naïve but it seems to work reasonably well.
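
One possible reading of that criterion, as a sketch: histograms use three bits per component and are normalised to sum to one (both are assumptions), the "correlation" is a plain dot product, and the frame is accepted when the ratio stays within a configurable factor of 1. All function names here are hypothetical.

```rust
const BINS: usize = 512;

/// Coarse colour histogram: top three bits of each component, normalised.
fn coarse_histogram(pixels: &[[u8; 3]]) -> Vec<f64> {
    let mut hist = vec![0.0; BINS];
    for p in pixels {
        let bin = (usize::from(p[0] >> 5) << 6)
            | (usize::from(p[1] >> 5) << 3)
            | usize::from(p[2] >> 5);
        hist[bin] += 1.0;
    }
    let total = pixels.len().max(1) as f64;
    hist.iter_mut().for_each(|v| *v /= total);
    hist
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// True if the new frame looks similar enough to join the current group.
fn same_segment(group: &[f64], frame: &[f64], tolerance: f64) -> bool {
    let auto = dot(frame, frame);
    if auto == 0.0 {
        return false;
    }
    let ratio = dot(group, frame) / auto;
    ratio >= 1.0 / tolerance && ratio <= tolerance
}

/// Folds an accepted frame into the running group histogram.
fn add_to_group(group: &mut [f64], frame: &[f64], frames_in_group: usize) {
    let n = frames_in_group as f64;
    for (g, f) in group.iter_mut().zip(frame) {
        *g = (*g * n + f) / (n + 1.0);
    }
}
```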

Palettisation itself consists of finding an appropriate palette entry for the input colour (I’m aware of dithering but haven’t bothered with it yet). I use the same three methods as back then: brute force search, local search and k-d tree search. The last one is the fastest of the three but gives horrible quality so I don’t know if I should improve it or throw it away. Local search (especially with a small cache for the last 32 results) is a nice trade-off between the other two. And brute force search is implemented by filling a small 16MB table which maps each input colour to the palette index; it is slow to generate but it works extremely fast with a global palette and reasonably large video (i.e. more than those sixteen million pixels). For segmented palettes it’s better to use local search though.
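
The brute-force table can be sketched like this, assuming a plain squared-distance metric and a palette of at most 256 entries (the actual metric used is not stated above). `nearest`, `build_lut` and `lookup` are names made up for the illustration.

```rust
type Rgb = [u8; 3];

/// Squared Euclidean distance between two colours.
fn dist2(a: Rgb, b: Rgb) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            let d = i32::from(x) - i32::from(y);
            (d * d) as u32
        })
        .sum()
}

/// Exhaustive search for the closest palette entry.
fn nearest(pal: &[Rgb], colour: Rgb) -> u8 {
    let (mut best, mut best_dist) = (0usize, u32::MAX);
    for (i, &entry) in pal.iter().enumerate().take(256) {
        let d = dist2(entry, colour);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best as u8
}

/// Fills the 16MB table: one palette index for every 24-bit colour.
/// This is the slow part (2^24 exhaustive searches).
fn build_lut(pal: &[Rgb]) -> Vec<u8> {
    (0..1u32 << 24)
        .map(|rgb| nearest(pal, [(rgb >> 16) as u8, (rgb >> 8) as u8, rgb as u8]))
        .collect()
}

/// After that, palettising a pixel is a single table read.
fn lookup(lut: &[u8], c: Rgb) -> u8 {
    lut[(usize::from(c[0]) << 16) | (usize::from(c[1]) << 8) | usize::from(c[2])]
}
```

The table pays off only when the palette stays the same for more pixels than the table has entries, which is why it suits a global palette and not short segments.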

That’s it. I realise that I’m the only user of such feature but it gave me something to play with and brought some joy implementing it. After all, who else can claim he converted movieCD into animated GIF without using any 16- or even 32-bit code?

NihAV: OS/2 multimedia support

Friday, April 17th, 2026

In theory I should be documenting the codecs Paul has shared with me or MVS (did you know that it employs a rather interesting chroma subsampling method: coding three 8×8 blocks in a macroblock but chroma samples have less than a half of coefficients in zigzag order coded) but instead I’ll write about something nobody really cares about.

As some of you may know, IBM had (at least) two multimedia formats developed: RLE-based PhotoMotion for IBM PC (later licensed to American Laser Games, which seems to have extended it somewhat) and gradient-based UltiMotion codec for AVI (the format of choice for VfOS/2).

Since the codec is somewhat unique, I decided to write an encoder for it. This way I can re-encode e.g. some movieCD (a format hardly anybody remembers) to another obscure format nobody remembers exactly just because I can. But the main reason is to learn how it’s organised.

There are three distinctive features it has: shared chroma (i.e. 8×8 super-block can have just one pair of chroma samples instead of coding each 4×4 block with its own pair), quantised values (6-bit luma and 4-bit chroma) and of course gradients. Actually there are seven block coding modes and fewer than half of them are gradient-based; the rest are more conventional skip block, scaled-down block (4 luma samples only), BTC (2 samples plus fill pattern) and raw block.

Gradients here essentially fill the block along one of the directions, a lot like intra prediction works in H.264 and later, the main differences being fewer angles and fill values being transmitted explicitly. Fun thing is that unlike in other codecs there’s no easy way to transmit a flat block: you need to code it in some extended way, as the simplest (“shallow”) mode codes a coarse gradient with two values, the second value being an implicit N+1. The more complicated (“LTC”) mode codes a fine gradient but allows only a four-colour combination present in a 4096-entry codebook. There’s also an extended mode where you can code any values for a gradient.

This of course poses a challenge of finding a good gradient in reasonable time (because trying all 4096 combinations with all 16 directions may get a bit slow). For shallow coding it’s easier, you essentially have block split into two parts, so checking averages for those parts and seeing if they fit is enough. For LTC and extended mode I applied a similar trick by finding the averages of four samples used in each gradient angle and saw if they fit well enough (for LTC it was also checking that the samples are monotone increasing and checking only the more or less close codebook entries; probably there’s more of optimisation potential but I’m fine with it as is).

Actually I started it gradually: first by implementing a simple “raw or skip blocks only” mode, then an “any block type that does not introduce additional loss (beside YUV quantisation)” mode, then lossy mode, and finally fast-and-shitty mode. The idea behind the last one is to calculate block variance and to use that information to force the block selection process (i.e. not try more complex block types on blocks with low variance). As you can guess from the name it did not work out that nicely (but it was about twice as fast). But overall lossy mode works rather well and by introducing distortion thresholds I can vary output file size significantly (3-5 times smaller and still not being a block soup). I’m not going to bother with any rate control and overall I consider this experiment done.
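
For illustration, the variance gate of the fast mode might look roughly like the following. The thresholds and the three candidate sets are invented for this sketch; which block types the encoder actually tries at which variance is not stated above.

```rust
/// Which block types the encoder would bother trying (hypothetical sets).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Candidates {
    /// Nearly flat block: cheapest types only.
    Cheap,
    /// Some detail: also try the simpler gradient types.
    Medium,
    /// Busy block: try everything, including raw.
    All,
}

/// Integer variance of a 4x4 luma block.
fn block_variance(luma: &[u8; 16]) -> u32 {
    let sum: u32 = luma.iter().map(|&v| u32::from(v)).sum();
    let sum_sq: u32 = luma.iter().map(|&v| u32::from(v) * u32::from(v)).sum();
    // var = E[x^2] - E[x]^2, with both terms scaled by 16 * 16.
    (16 * sum_sq - sum * sum) / 256
}

/// `low` and `high` are tuning thresholds chosen by the caller.
fn candidates_for(luma: &[u8; 16], low: u32, high: u32) -> Candidates {
    let var = block_variance(luma);
    if var < low {
        Candidates::Cheap
    } else if var < high {
        Candidates::Medium
    } else {
        Candidates::All
    }
}
```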

In conclusion I’d like to write something but nothing comes to my mind. Stay tuned for more stuff nobody cares about (like obscure codecs or my experiments with palettisation).

(Ir)regular NihAV news

Monday, March 30th, 2026

Here’s another portion of news about what I’ve been doing in last couple of weeks. Hopefully I’ll be able to write about fruity MVS soon.

While most of what I did is not worth mentioning, there are some bits I find funny or curious enough to share.

First of all, I’ve added VAXL and (Clari)SSA decoding support to na_eofdec (just some more and there will be enough formats for 0.1.0 release). And I actually added them mostly to test an IFF parser (derived from my AVI parser and enhanced). The idea is simple: just point it to the data needed to be parsed and provide chunk handlers and it should do the trick. The main enhancement is also providing a condition when to stop—at the end, on any unknown chunk, before some pre-defined chunk, or right after such chunk has been parsed. This way I can control parsing process without relying too much on some special callbacks or states. For example, in VAXL that means I parse header by expecting VXHD chunk and nothing else. The rest of VAXL consists of time code, palette, image data, and audio samples chunks; there I simply stop after BMAP chunk and output reconstructed frame (and audio if present). In SSA (IFF variant, I added FEE one too for completeness sake) each frame is wrapped into FORM DLTA list so there I just parse sub-chunks ignoring unknown ones.
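
A chunk walker with such stop conditions could be sketched as follows. It relies only on the generic IFF chunk layout (four-byte tag, big-endian 32-bit size, payload padded to an even length); the `StopAt` enum, the `parse_chunks` signature and the "handler returns whether it knew the chunk" convention are assumptions for the example, not the actual na_eofdec interface.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum StopAt {
    /// Parse until the data runs out.
    End,
    /// Stop in front of the first chunk the handler does not know.
    UnknownChunk,
    /// Stop in front of the chunk with this tag.
    Before([u8; 4]),
    /// Stop right after the chunk with this tag has been handled.
    After([u8; 4]),
}

/// Walks the chunks in `data` and returns how many bytes were consumed.
/// The handler gets the tag and payload and returns true for known chunks.
fn parse_chunks<F>(data: &[u8], stop: StopAt, mut handler: F) -> Result<usize, &'static str>
where
    F: FnMut([u8; 4], &[u8]) -> bool,
{
    let mut pos = 0;
    while pos + 8 <= data.len() {
        let id = [data[pos], data[pos + 1], data[pos + 2], data[pos + 3]];
        let size_bytes = [data[pos + 4], data[pos + 5], data[pos + 6], data[pos + 7]];
        let size = u32::from_be_bytes(size_bytes) as usize;
        if stop == StopAt::Before(id) {
            return Ok(pos);
        }
        let start = pos + 8;
        let end = start
            .checked_add(size)
            .filter(|&e| e <= data.len())
            .ok_or("truncated chunk")?;
        let known = handler(id, &data[start..end]);
        if !known && stop == StopAt::UnknownChunk {
            return Ok(pos);
        }
        // Chunks are padded to an even size.
        pos = end + (size & 1);
        if stop == StopAt::After(id) {
            break;
        }
    }
    Ok(pos.min(data.len()))
}
```

In the VAXL case this would mean one call with `StopAt::After(*b"BMAP")` per frame, continuing from the returned offset on the next call.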

Then I’ve finally had another look at TCA and improved my decoder, going from less than half of samples I had being supported to virtually all of them. While I presumed that video is always compressed with LZW, in reality it also has RLE and raw modes. RLE mode is rather curious at that as it codes run+mode as 00 00 xx yy zz, 00 xx yy or xx number (i.e. if first one or two bytes are zero, read one or two bytes of the number too) with low bit signalling run or skip and the rest being its length (e.g. 55 means skip 42 bytes while 00 01 00 01 means repeating 01 128 times). But since the format is next to impossible to detect statically (no magic constants in the header, so you’d better check packet structure instead), it’ll remain a useless curiosity—just like I love it.
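
The run/skip number from the examples above can be decoded with a few lines. This sketch assumes the multi-byte forms are big-endian (which is what makes `00 01 00` come out as a run of 128) and that the form with two leading zero bytes carries a three-byte number; `Op` and `read_op` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Op {
    Skip(usize),
    Run(usize),
}

/// Reads one operation; returns it with the number of bytes consumed.
/// For a run, the byte to repeat follows right after the number.
fn read_op(data: &[u8]) -> Option<(Op, usize)> {
    let (num, used) = match data {
        // Two zero bytes: a three-byte number follows (assumed layout).
        [0, 0, a, b, c, ..] => (
            (usize::from(*a) << 16) | (usize::from(*b) << 8) | usize::from(*c),
            5,
        ),
        // One zero byte: a two-byte number follows.
        [0, a, b, ..] if *a != 0 => ((usize::from(*a) << 8) | usize::from(*b), 3),
        // Otherwise the byte itself is the number.
        [a, ..] if *a != 0 => (usize::from(*a), 1),
        _ => return None,
    };
    // Low bit selects the mode, the rest is the length.
    let len = num >> 1;
    let op = if num & 1 != 0 { Op::Skip(len) } else { Op::Run(len) };
    Some((op, used))
}
```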

And most of the time I spent on writing a MOV muxer. Since I’ve written one for na_eofdec already (raw output only) I had some base to start off at least. Of course it’s wonky and even copying data may or may not work depending on your luck, but at least now I have a format to output variable framerate content losslessly (previously the only alternative for VFR content in NihAV was to encode data with RealVideo 4 and Cook). And now, after all possible enhancements, I can either re-mux old-style MOV into a new one (and the output may actually be playable) or use nihav-encoder -i input.mov -o output.mov --profile raw to decode some more exotic format not supported by anything else (like MoviePak in MacBinary MOV) to a raw format that (hopefully) more tools can understand. I can also use my Cinepak or Indeo 3 encoders there (and Truemotion 1 too in the future just for lulz) and I should probably write encoders for other common QT codecs (even if there are other implementations, up to SVQ1 by The Multimedia Mike). This should keep me entertained for a while. And who knows, maybe it will serve as a base for an MP4 muxer if the need for one arises.

That’s it for now, hopefully I’ll have more to write about later.

More boring stuff in NihAV

Thursday, March 12th, 2026

Since the last time I’ve done some more boring stuff for NihAV, more precisely 10-bit H.264 decoder and extended MOV support.

The former is more or less straightforward but it made me regret Rust’s lack of text-level macro processing. The problem is generating almost identical code for various combinations of bitdepths that differ only by the clipping range and function name suffixes. Probably it can be solved with proc macros but that’s a hole I’m unwilling to jump into. Also supporting both 8- and 16-bit decoding in one module is annoying, so I ended up just putting them next to each other and selecting one for the current mode of operation. This also uncovered the fact that my scaler did not handle high-depth YUV in most cases, so I had to fix it too. And the ironic thing is that I don’t have enough content to watch in that format (and a couple of sample files I have are Matroska with FLAC audio, which can’t be remuxed to MP4, so there’s that).
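
To illustrate the limitation: a declarative macro can stamp out the per-bitdepth variants, but since `macro_rules!` cannot glue a suffix onto an identifier, every generated name has to be spelled out at the call site. The `clip_fn!` macro and the function names below are made up for this sketch and are not taken from the decoder.

```rust
/// Generates a clipping function for one sample type and bit depth.
macro_rules! clip_fn {
    ($name:ident, $pix:ty, $bits:expr) => {
        fn $name(val: i32) -> $pix {
            let max = (1i32 << $bits) - 1;
            val.clamp(0, max) as $pix
        }
    };
}

// One invocation per variant, with the full function name given by hand.
clip_fn!(clip8, u8, 8);
clip_fn!(clip10, u16, 10);
```

With a text-level preprocessor one would pass just the suffix; here a whole family of functions per bit depth means passing a whole list of names.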

In other rather equally useless things, I’ve finally made a MOV muxer for na_eofdec. Of course it’s a simplified one, supporting only raw video and audio tracks, but even that was annoying to support properly (I’m still not sure if I do support it properly). At least when the bitterness passes I should be able to make a MOV muxer for NihAV (and who knows, maybe even implement some QT encoders to exercise it).

But that’s not all regarding MOV. In a recent post I mentioned the oldest known MOV flavours. Finally I’ve managed to extend my MOV demuxer to support beta version of it (it’s still MOV but with a few different details here and there) while alpha version was easier to support in a different demuxer written from scratch. If you’re curious, those files are short clips (about 15 seconds usually) compressed with QT RLE demonstrating some features of upcoming System 7 in rather symbolic form. Here’s a GIF of virtual memory feature presentation.

That’s about it for now, I hope to do more exciting things next. For instance, add some Amiga formats to na_eofdec. And most importantly, Paul told me about new RAD Audio codec which I still have to look at thoroughly. From a first glance while audio in Bink is a lot like MPEG Audio Layer II (i.e. simply quantise and write coefficients with a fixed amount of bits), this new codec is more akin to AAC LC as it apparently has long/short frames division and (at least two) Huffman tables for coefficients. Another change is that Bink audio was stored in simple container formats with fixed structure, new format looks more like an elementary stream with as few bits as possible spent on coding different fields. Let’s see how it turns out…

A word about some JPEG-based codecs

Wednesday, March 4th, 2026

As I mentioned previously, I don’t want to work on game codecs for a while, so I picked other stuff instead. For instance, I’ve fixed a bug in my Indeo 3 encoder (which nobody uses, myself included), refactored some NihAV code and added a couple of decoders.

First, there is PDQ2 decoder. It actually turned out to be slightly more interesting than I expected as there’s a version of the codec for Saturn version of the game (it’s not that different yet somewhat different nevertheless).

Then I’ve finally managed to figure out MoviePak. After locating a decompressor binary and trying to disassemble it as raw M68k code I could locate enough code and data to figure out that since it uses JPEG codebooks it’s likely to be based on JPEG. And after learning a bit more about how it works and NOPing out A-line _StripAddress calls (that was the most interesting part actually—you can’t find it without knowing terminology and even then just barely; I was lucky to try searching for information about 68881, learning terminology and finally finding a reference of Macintosh Toolbox calls) I managed to make Ghidra happy enough to produce a usable decompilation. After that it was just a question of re-using JPEG decoder with a slightly different header format, which finally made me do a refactoring of JPEG decoding.

And since I’ve factored out common JPEG decoding support for MoviePak, Radius Studio and actual motion JPEG decoder, I decided to add a video codec for effects used in proDAD Adorage video editor (its FOURCC is pDAD, quite surprisingly). This one could be reverse engineered just by looking at the frame data. Each frame consists of 20-byte header, JPEG image plus alpha channel coded as PNG or JPEG. Since I have not bothered to write a motion PNG decoder (it’s very uncommon after all and I’m not NIHing image library—at least not yet), I decode just the first part. Maybe I’ll add a PNG decoder support and re-visit this decoder for proper alpha support, but for now this seems unlikely.

I’ll keep working on some boring stuff nobody cares about but maybe I’ll have a codec or two to write about as well.

Random NihAV news

Sunday, February 8th, 2026

As I mentioned previously, after making 0.5.0 na_game_tool release I’d rather work on something else for a change (if at all), so here are some bits about what I’ve been doing since:

  • first of all, I’ve implemented support for AV, a Polished AVI format. By that I mean the format comes apparently from a Polish company and it’s simplified remux of AVI data: first there’s BITMAPINFOHEADER (and optionally palette) followed by video stream header (starting with the familiar vids tag), then there may be WAVEFORMAT structure with auds stream header, then you have table of contents (video and audio frame sizes plus flags) followed by video and audio chunks. The format was trivial to guess and I’ve added support for it because why not;
  • I’ve also finally implemented functions for reading arrays of integers. First I’ve introduced them to na_game_tool and tried them in some decoders, and then ported them to the main NihAV codebase. The idea behind it is that reading data directly into destination array (with optional byte-swapping) is faster than reading data, re-interpreting it as an integer and finally copying it into the destination buffer. I had a specific version of it implemented in MP4 demuxer already (because otherwise opening a 2-3 hour long video would cause a noticeable delay) but overall it’s nicer to have just one call instead of a loop;
  • in other things, I’ve re-started na_eofdec development using the current na_game_tool codebase. That does not mean I’m starting developing it (or going to do so in the nearest future) but at least when I eventually get to it, I can add some archive extraction modules as well. Beside that it should be a sandbox for testing the MOV muxer that I want to write sooner or later (kinda like OpenDML AVI muxer in NihAV is a backport of one from na_game_tool). Of course it would need to get a couple more formats to test it on but I’m not in a hurry;
  • speaking of MOV, I’ve improved support for MOV in MacBinary II. So far I’ve seen four flavours of it: essentially flat MOV with MacBin header, MOV with mdat box in data fork and moov atom in resource fork, the same but data fork containing mdat contents without size and tag, and an even older format (with samples from 1990) with much flatter structure (i.e. lots of nesting chunks are not present at all). The first three are detected and handled more or less fine now; I’ll try to support the last one too, even if only as a historical curiosity;
  • there was one major code refactoring in NihAV, namely I’ve put demuxing handling code into a new nihav_hlblocks crate and made all my tools use it instead of dragging local copies of it. If you wonder why this was necessary, that’s because you can have normal demuxers (that return full packets for the streams) as well as raw stream demuxers (that return pieces of data belonging to a stream and you need to form full packet using a packetiser). And of course you can have simply raw streams like MP3 that needs a packetiser but no demuxer. That’s why I’ve made a DemuxerObject (yes, very original name) that encapsulates them all and represents all of them as a normal demuxer;
  • and finally I’ve discovered that I can call SDL_UpdateYUVTexture instead of copying frame data into the texture manually. And then I discovered that it did not work with the time display (because it also wrote to the texture directly in presumption that it will update some pixels on the frame; there is a note in the documentation telling not to do so but it worked by chance before). So I’ve changed it to render a separate small RGBA texture and blit it over the frame instead—like it should’ve been done from the start. I somewhat wonder when I’ll have to adapt it all to SDL3 but apparently I can postpone it for now.
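
The integer array reading mentioned in the list above can be sketched as follows: the bytes are read straight into the destination slice and then byte-swapped in place, so no intermediate buffer or per-element read call is needed. The function name and signature are hypothetical and do not reproduce the actual NihAV API.

```rust
use std::io::{self, Read};

/// Fills `dst` with big-endian 32-bit integers read from `src`.
fn read_u32be_arr<R: Read>(src: &mut R, dst: &mut [u32]) -> io::Result<()> {
    {
        // SAFETY: u32 has no padding and accepts any bit pattern, u8 has
        // alignment 1, and the byte view covers exactly the same memory.
        let bytes = unsafe {
            std::slice::from_raw_parts_mut(
                dst.as_mut_ptr() as *mut u8,
                dst.len() * std::mem::size_of::<u32>(),
            )
        };
        src.read_exact(bytes)?;
    }
    // Convert from big-endian to native order (a no-op on BE machines).
    for v in dst.iter_mut() {
        *v = u32::from_be(*v);
    }
    Ok(())
}
```

On a short read the contents of `dst` are unspecified, the same as with `read_exact` itself.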

That’s all for now. There’s a lot to do, maybe some of it will actually be done.

na_game_tool 0.5.0 released

Monday, January 19th, 2026

Grab it from the usual place if you’re interested in it for some unfathomable reason. I’ll try to document the formats when I have time (and multimedia.cx server does not suffer from crawlers).

Update from January 30: I’ve uploaded a fixed version as I broke extracting old-style Cryo archives during refactoring (and before committing), so now it should work. The rest will (not) work in the same way as before.

na_game_tool: final stretch before 0.5.0

Thursday, January 15th, 2026

As I’m somewhat tired of na_game_tool development (for now), I’m just picking bits for the release goal so I can make the release and switch to something else (I’ve found a couple of interesting MOV files to analyse, maybe I can improve my Indeo 3 encoder and such). I’ve added support for HNM5 and HNM6 formats, reached the self-imposed limit of at least twelve original decoders and essentially I need to add 7th Level archive format and that’s it. Meanwhile I can talk about the formats I added support for, the ones I want to add support for (but not right now) and the formats I’m hesitating to add support for.

So, supported formats first.

There’s GRN format used in Genesia game. Frames are stored raw, compressed with RLE (which looks a lot like FLI delta RLE compression but working on pixel pairs) and the data may optionally be compressed with LZSS afterwards. And it would not have been a format from a French company without some weirdness. In this case audio stream data is interleaved with video in the simplest way: first you have a 2kB header, then a certain number of 2kB sectors with initial audio data (the number is in the header), then it is 8kB of video stream followed by 2kB of audio followed by 8kB of video stream and so on. So video frame data may suddenly have 2kB of audio inside it. Of course it was easily solved with a custom de-interleaver, but you’d expect to see such a thing in industry-standard applications and not in a game from the early 1990s.
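
A de-interleaver for the layout described above might look like this. It is a sketch that splits a whole file in memory and takes the number of initial audio sectors as a parameter, since the position of that field in the header is not given here; the function name is made up.

```rust
const SECTOR: usize = 2048;
const VIDEO_RUN: usize = 4 * SECTOR;

/// Splits the file into (video stream, audio stream).
fn deinterleave(file: &[u8], initial_audio_sectors: usize) -> (Vec<u8>, Vec<u8>) {
    let (mut video, mut audio) = (Vec::new(), Vec::new());
    // Skip the 2kB header.
    let body = file.get(SECTOR..).unwrap_or(&[]);
    // The initial audio sectors come first.
    let split = (initial_audio_sectors * SECTOR).min(body.len());
    audio.extend_from_slice(&body[..split]);
    let mut rest = &body[split..];
    // Then 8kB of video and 2kB of audio alternate until the end.
    while !rest.is_empty() {
        let v = VIDEO_RUN.min(rest.len());
        video.extend_from_slice(&rest[..v]);
        rest = &rest[v..];
        let a = SECTOR.min(rest.len());
        audio.extend_from_slice(&rest[..a]);
        rest = &rest[a..];
    }
    (video, audio)
}
```

Frame parsing can then run on the contiguous video stream without caring where the audio sectors used to be.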

And there’s MGIF from Gates of Skeldal. Despite the name it was only inspired by GIF in the sense it uses similar LZW compression scheme but not the format (unlike HAF). Another curious detail that I haven’t seen elsewhere is using a simple pre-processing: bytes are coded as a difference to the previous ones. What made it annoying is that this prediction resets when LZW decoder resets (i.e. when the “reset dictionary” code is encountered), so you have to implement it inside LZW decompressor instead of making a pass afterwards. Still, it gets a point for the originality.
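
Why the prediction cannot be undone in a separate pass is easier to see in code. In this sketch the LZW decoder output is modelled as a list of events; the predictor restarting from zero and the wrapping addition are assumptions made for the example, and `LzwEvent` and `undo_delta` are hypothetical names.

```rust
/// What the LZW stage produces, in order.
enum LzwEvent<'a> {
    /// Bytes produced by decoding one or more codes.
    Bytes(&'a [u8]),
    /// The "reset dictionary" code was encountered.
    Reset,
}

/// Turns coded differences back into byte values.
fn undo_delta(events: &[LzwEvent<'_>]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u8;
    for event in events {
        match event {
            // The reset position is known only while decoding, so the
            // predictor has to be handled at this level.
            LzwEvent::Reset => prev = 0,
            LzwEvent::Bytes(deltas) => {
                for &d in deltas.iter() {
                    prev = prev.wrapping_add(d);
                    out.push(prev);
                }
            }
        }
    }
    out
}
```

Once the bytes are flattened into a plain buffer the reset positions are lost, so a later pass would keep predicting across them and produce wrong values.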

Finally there is Celestial Impact intro.dxv. This file employs RLE to compress not merely image but its palette and some additional data (it did not look like a sound to me and I have no idea what it is) as a single array.

Now for the formats I want to support in some (distant) future. Beside the usual “whatever I can find” it definitely includes AVI and SMS—two Smacker-based formats. Maybe I can implement some Cinepak-based console formats while at it.

And speaking of console formats, here’s a fun format called DDV and used in Oddworld games (or maybe just one of them). There’s a reverse-engineered implementation of the decoder and by the look of it this format encapsulates MDEC—which makes it a perfect candidate for librempeg. It has support for some MDEC-based formats already after all.

That’s it, hopefully the next post will be about the release already.