Archive for the ‘NihAV’ Category

NihAV: palettisation

Thursday, April 30th, 2026

While I added palettisation support to NihAV about six years ago, it was limited to per-frame palette generation back then. Since I had only two encoders supporting paletted input and both accepted palette changes, it worked fine. It was only when I decided to implement a MOV muxer that I really needed palettisation with a global palette, so I started experimenting with that.

First and foremost, design. While frame palettisation is a part of NAScale that handles video frame conversion from and to various formats, this palettisation mode is bolted onto nihav-encoder. It is actually implemented in two parts: an initial pass that decodes the input and generates a palette at the end, and the actual frame palettisation (which can work just fine without that pass if you tell it to use some pre-defined palette). I actually started with the second part and added palette generation later (for tests the default QuickTime palette was enough). Then I went even further and extended palette generation to support a segmented mode (i.e. the palette may change, but not for every frame; I made the limit configurable but it should be at least 10 frames—storing a palette for each frame before processing is too much).

This mode is a bit fragile since palettes are calculated for the decoded frames and not for the processed frames. And for palette segments it’s even worse, since a segment tells the number of frames for which the palette is valid, so framerate conversion will make a mess of it. But the alternative leads to libavfilter madness, and that’s a hard pass for me. I see filters more as a drug that makes multimedia projects shift attention to them, making it more and more about e.g. filter negotiation and complex graph support and less about playing actual media; consequently, making everything a filter is a sure sign the project is on its way to becoming obsolete (yes, I hold neither DirectShow nor gstreamer in high regard).

Anyway, after the overall design description it’s time to talk about implementation details. Palette generation for video differs from palette generation for a single image by the sheer amount of data it needs to take into account. So besides the “let’s waste memory and have a table of 64-bit counters for each possible colour” mode, I added a second mode that puts colours into smaller buckets (the bucket index is calculated from the top bits of the components) and joins similar colours (e.g. ones differing just by the two low bits in each component) when the number of entries gets too high. It is slower and not as accurate, but it consumes less memory, and performing vector quantisation on hundreds of thousands of entries is much faster than on millions. So it may have its uses.
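The bucket idea can be sketched roughly like this (the bit counts and the use of a hash map are my assumptions for illustration, not what NihAV actually does):

```rust
use std::collections::HashMap;

// Bucket a 24-bit RGB colour by the top bits of each component so the
// histogram stays small. `bits` per component is an illustrative parameter.
fn bucket_index(r: u8, g: u8, b: u8, bits: u8) -> u32 {
    let shift = 8 - bits;
    (((r >> shift) as u32) << (2 * bits))
        | (((g >> shift) as u32) << bits)
        | ((b >> shift) as u32)
}

// Accumulate pixel counts per bucket; a real implementation would also merge
// near-identical colours when the map grows too large.
fn histogram(pixels: &[(u8, u8, u8)], bits: u8) -> HashMap<u32, u64> {
    let mut hist = HashMap::new();
    for &(r, g, b) in pixels {
        *hist.entry(bucket_index(r, g, b, bits)).or_insert(0u64) += 1;
    }
    hist
}
```

With 5 bits per component this caps the histogram at 32768 entries regardless of how many frames are fed in, which is what makes the subsequent vector quantisation cheap.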

Then come palette segments. I don’t do anything fancy and simply calculate a coarse colour histogram to decide when to start a new segment. As the cut-off criterion I selected the ratio between the correlation of the group and new-frame histograms and the auto-correlation of the new frame’s histogram. If they’re of about the same magnitude then I can add this frame to the group, update the group histogram and continue; otherwise I generate a palette for the just-finished segment and start a new one. It’s naïve but it seems to work reasonably well.
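A minimal sketch of that cut-off might look like this (the threshold values are my guesses, and for simplicity a single reference histogram stands in for the accumulated group histogram):

```rust
// Dot product of two histograms of equal length.
fn correlation(a: &[u64], b: &[u64]) -> u64 {
    a.iter().zip(b).map(|(&x, &y)| x * y).sum()
}

// Decide whether a new frame belongs to the current segment: compare the
// cross-correlation against the new frame's auto-correlation and accept the
// frame if the two are "of about the same magnitude".
fn same_segment(reference: &[u64], frame: &[u64]) -> bool {
    let cross = correlation(reference, frame) as f64;
    let auto = correlation(frame, frame) as f64;
    auto > 0.0 && cross / auto > 0.5 && cross / auto < 2.0
}
```

A scene cut shifts the mass of the histogram into different bins, so the cross-correlation collapses while the auto-correlation stays large, pushing the ratio out of range.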

Palettisation itself consists of finding an appropriate palette entry for each input colour (I’m aware of dithering but haven’t bothered with it yet). I use the same three methods as back then: brute-force search, local search and k-d tree search. The last is faster than the rest but gives horrible quality, so I don’t know whether I should improve it or throw it away. Local search (especially with a small cache for the last 32 results) is a nice trade-off between the other two. And brute-force search is implemented by filling a 16MB table which maps each input colour to a palette index; it is slow to generate but it works extremely fast with a global palette and a reasonably large video (i.e. more than those sixteen million pixels). For segmented palettes it’s better to use local search though.
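The brute-force table idea is simple enough to sketch (a plain squared-distance metric is assumed here; the real code may weigh components differently):

```rust
// Find the nearest palette entry for one colour by exhaustive search.
fn nearest_entry(pal: &[[u8; 3]], r: u8, g: u8, b: u8) -> u8 {
    let dist = |c: &[u8; 3]| -> u32 {
        let dr = c[0] as i32 - r as i32;
        let dg = c[1] as i32 - g as i32;
        let db = c[2] as i32 - b as i32;
        (dr * dr + dg * dg + db * db) as u32
    };
    (0..pal.len()).min_by_key(|&i| dist(&pal[i])).unwrap() as u8
}

// Build the 2^24-entry table mapping every possible RGB value to its palette
// index: slow to fill, but afterwards every lookup is a single memory access.
fn build_table(pal: &[[u8; 3]]) -> Vec<u8> {
    let mut table = vec![0u8; 1 << 24];
    for idx in 0..(1usize << 24) {
        let (r, g, b) = ((idx >> 16) as u8, (idx >> 8) as u8, idx as u8);
        table[idx] = nearest_entry(pal, r, g, b);
    }
    table
}
```

The break-even point is exactly the one mentioned above: once the video has more pixels than the table has entries, paying the generation cost up front wins.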

That’s it. I realise that I’m the only user of such a feature, but it gave me something to play with and brought some joy while implementing it. After all, who else can claim to have converted movieCD into animated GIF without using any 16- or even 32-bit code?

NihAV: OS/2 multimedia support

Friday, April 17th, 2026

In theory I should be documenting the codecs Paul has shared with me, or MVS (did you know that it employs a rather interesting chroma subsampling method—coding three 8×8 blocks in a macroblock, with the chroma samples having less than half of their coefficients in zigzag order coded?), but instead I’ll write about something nobody really cares about.

As some of you may know, IBM had (at least) two multimedia formats developed: RLE-based PhotoMotion for the PC (later licensed to American Laser Games, which seems to have extended it somewhat) and the gradient-based UltiMotion codec for AVI (the format of choice for VfOS/2).

Since the codec is somewhat unique, I decided to write an encoder for it. This way I can re-encode e.g. some movieCD (a format hardly anybody remembers) to another obscure format nobody remembers exactly just because I can. But the main reason is to learn how it’s organised.

There are three distinctive features: shared chroma (i.e. an 8×8 super-block can have just one pair of chroma samples instead of each 4×4 block coding its own pair), quantised values (6-bit luma and 4-bit chroma) and of course gradients. Actually there are seven block coding modes and only half of them are gradient-based—the rest are the more conventional skip block, scaled-down block (4 luma samples only), BTC (2 samples plus a fill pattern) and raw block.

Gradients here essentially fill the block in one of the directions, a lot like intra prediction works in H.264 and later, the main differences being fewer angles and the fill values being transmitted explicitly. The fun thing is that unlike other codecs there’s no easy way to transmit a flat block; you need to code it in some extended way, as the simplest (“shallow”) mode codes a coarse gradient with two values, the second value being an implicit N+1. The more complicated (“LTC”) mode codes a fine gradient but allows only the four-colour combinations present in a 4096-entry codebook. There’s also an extended mode where you can code any values for a gradient.
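For illustration only, here is what the basic idea of a directional gradient fill looks like for one direction (the codec’s actual directions, rounding and quantised values differ):

```rust
// Fill a 4x4 block with a vertical gradient between two transmitted values,
// similar in spirit to the gradient modes described above.
fn fill_vertical_gradient(top: i32, bottom: i32) -> [[i32; 4]; 4] {
    let mut block = [[0; 4]; 4];
    for row in 0..4usize {
        // linear interpolation between the two transmitted values
        let val = top + (bottom - top) * row as i32 / 3;
        block[row] = [val; 4];
    }
    block
}
```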

This of course poses the challenge of finding a good gradient in reasonable time (because trying all 4096 combinations with all 16 directions may get a bit slow). For shallow coding it’s easier: you essentially have the block split into two parts, so checking the averages of those parts and seeing if they fit is enough. For the LTC and extended modes I applied a similar trick by finding the averages of the four samples used for each gradient angle and seeing if they fit well enough (for LTC this also means checking that the samples are monotonically increasing and checking only the more or less close codebook entries; there’s probably more optimisation potential but I’m fine with it as is).

Actually I approached it gradually: first by implementing a simple “raw or skip blocks only” mode, then an “any block type that does not introduce additional loss (besides YUV quantisation)” mode, then a lossy mode, and finally a fast-and-shitty mode. The idea behind the last one is to calculate block variance and use that information to steer the block selection process (i.e. not to try more complex block types on blocks with low variance). As you can guess from the name, it did not work out that nicely (but it was about twice as fast). Overall, though, the lossy mode works rather well, and by introducing distortion thresholds I can vary the output file size significantly (3–5 times smaller while still not being a block soup). I’m not going to bother with any rate control, so overall I consider this experiment done.

In conclusion I’d like to write something but nothing comes to my mind. Stay tuned for more stuff nobody cares about (like obscure codecs or my experiments with palettisation).

(Ir)regular NihAV news

Monday, March 30th, 2026

Here’s another portion of news about what I’ve been doing in the last couple of weeks. Hopefully I’ll be able to write about fruity MVS soon.

While most of what I did is not worth mentioning, there are some bits I find funny or curious enough to share.

First of all, I’ve added VAXL and (Clari)SSA decoding support to na_eofdec (just a few more and there will be enough formats for a 0.1.0 release). And I actually added them mostly to test an IFF parser (derived from my AVI parser and enhanced). The idea is simple: just point it at the data to be parsed, provide chunk handlers, and it should do the trick. The main enhancement is providing a condition for when to stop—at the end, on any unknown chunk, before some pre-defined chunk, or right after such a chunk has been parsed. This way I can control the parsing process without relying too much on special callbacks or states. For example, for VAXL that means I parse the header by expecting a VXHD chunk and nothing else. The rest of VAXL consists of time code, palette, image data and audio sample chunks; there I simply stop after a BMAP chunk and output the reconstructed frame (and audio if present). In SSA (the IFF variant; I added the FEE one too for completeness’ sake) each frame is wrapped into a FORM DLTA list, so there I just parse sub-chunks, ignoring unknown ones.
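The stop-condition idea can be sketched like this (all names and the in-memory chunk representation are mine, not the actual NihAV API):

```rust
// The caller supplies a stop condition plus a handler; the handler returns
// false for chunks it does not recognise.
#[derive(PartialEq)]
enum StopMode {
    AtEnd,
    OnUnknown,
    Before(u32),
    After(u32),
}

// Walk over (tag, payload) chunks, honouring the stop condition, and return
// how many chunks were fully handled.
fn parse_chunks<F>(chunks: &[(u32, Vec<u8>)], stop: StopMode, mut handle: F) -> usize
where
    F: FnMut(u32, &[u8]) -> bool,
{
    let mut parsed = 0;
    for (tag, data) in chunks {
        if stop == StopMode::Before(*tag) {
            break;
        }
        let known = handle(*tag, data);
        if !known && stop == StopMode::OnUnknown {
            break;
        }
        parsed += 1;
        if stop == StopMode::After(*tag) {
            break;
        }
    }
    parsed
}
```

In VAXL terms, the header pass would use the equivalent of `After(VXHD)` and frame decoding the equivalent of `After(BMAP)`.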

Then I finally had another look at TCA and improved my decoder, going from less than half of my samples being supported to virtually all of them. While I presumed that video is always compressed with LZW, in reality it also has RLE and raw modes. The RLE mode is rather curious at that, as it codes run+mode as a 00 00 xx yy zz, 00 xx yy or xx number (i.e. if the first one or two bytes are zero, read one or two more bytes of the number), with the low bit signalling run or skip and the rest being its length (e.g. 55 means skip 42 bytes while 00 01 00 01 means repeating 01 128 times). But since the format is next to impossible to detect statically (there are no magic constants in the header, so you’d better check the packet structure instead), it’ll remain a useless curiosity—just like I love it.
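As I read that description, the length decoding would go something like this (the run/skip polarity is inferred from the examples above, and big-endian byte order in the extended forms is my assumption):

```rust
// Decode one TCA RLE opcode: returns (is_run, length, bytes consumed).
// A leading zero byte (or two) extends the number by one (or two) extra
// bytes; the low bit of the result selects run vs. skip, the rest is length.
fn decode_op(src: &[u8]) -> (bool, u32, usize) {
    let (num, used): (u32, usize) = if src[0] != 0 {
        (src[0] as u32, 1)
    } else if src[1] != 0 {
        (((src[1] as u32) << 8) | src[2] as u32, 3)
    } else {
        (((src[2] as u32) << 16) | ((src[3] as u32) << 8) | src[4] as u32, 5)
    };
    // low bit set means skip (as in the 0x55 example above)
    let is_run = (num & 1) == 0;
    (is_run, num >> 1, used)
}
```

So 55 decodes as “skip 42” and 00 01 00 as “run of 128”, matching the examples in the text.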

And most of the time I spent writing the MOV muxer. Since I’d written one for na_eofdec already (raw output only) I had at least some base to start from. Of course it’s wonky, and even copying data may or may not work depending on your luck, but at least now I have a format for outputting variable-framerate content losslessly (previously the only alternative for VFR content in NihAV was to encode data with RealVideo 4 and Cook). At least now, after all possible enhancements, I can either re-mux an old-style MOV into a new one (and the output may actually be playable) or use nihav-encoder -i input.mov -o output.mov --profile raw to decode some more exotic format not supported by anything else (like MoviePak in MacBinary MOV) to a raw format that (hopefully) more tools can understand. I can also use my Cinepak or Indeo 3 encoders there (and TrueMotion 1 too in the future, just for lulz) and I should probably write encoders for other common QT codecs (even if there are other implementations, up to SVQ1 by The Multimedia Mike). This should keep me entertained for a while. And who knows, maybe it will serve as a base for an MP4 muxer if the need for one arises.

That’s it for now, hopefully I’ll have more to write about later.

More boring stuff in NihAV

Thursday, March 12th, 2026

Since last time I’ve done some more boring stuff for NihAV, more precisely a 10-bit H.264 decoder and extended MOV support.

The former is more or less straightforward, but it made me regret Rust’s lack of text-level macro processing. The problem is generating almost identical code for various combinations of bit depths that differ only in the clipping range and function name suffixes. It can probably be solved with proc macros, but that’s a hole I’m unwilling to jump into. Also, supporting both 8- and 16-bit decoding in one module is annoying, so I ended up just putting them next to each other and selecting one for the current mode of operation. This also uncovered the fact that my scaler did not handle high-depth YUV in most cases, so I had to fix that too. And the ironic thing is that I don’t have enough content to watch in that format (and the couple of sample files I have are Matroska with FLAC audio, which can’t be remuxed to MP4, so there’s that).
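To show how far declarative macros get you, here is a toy sketch generating per-bitdepth clipping helpers with suffixed names (the function names are illustrative); the pain starts when this has to cover dozens of motion-compensation and loop-filter functions rather than one clip:

```rust
// Generate a clipping function for a given bit depth under a given name.
macro_rules! clip_fn {
    ($name:ident, $depth:expr) => {
        fn $name(val: i32) -> u16 {
            // clamp to [0, 2^depth - 1]
            val.clamp(0, (1 << $depth) - 1) as u16
        }
    };
}

clip_fn!(clip8, 8);
clip_fn!(clip10, 10);
```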

In other, roughly equally useless things, I’ve finally made a MOV muxer for na_eofdec. Of course it’s a simplified one, supporting only raw video and audio tracks, but even that was annoying to support properly (I’m still not sure I do). At least when the bitterness passes I should be able to make a MOV muxer for NihAV (and who knows, maybe even implement some QT encoders to exercise it).

But that’s not all regarding MOV. In a recent post I mentioned the oldest known MOV flavours. I’ve finally managed to extend my MOV demuxer to support the beta version of the format (it’s still MOV but with a few different details here and there), while the alpha version was easier to support in a separate demuxer written from scratch. If you’re curious, those files are short clips (usually about 15 seconds) compressed with QT RLE, demonstrating some features of the upcoming System 7 in rather symbolic form. Here’s a GIF of the virtual memory feature presentation.

That’s about it for now; I hope to do more exciting things next. For instance, adding some Amiga formats to na_eofdec. And most importantly, Paul told me about the new RAD Audio codec, which I still have to look at thoroughly. At first glance, while audio in Bink is a lot like MPEG Audio Layer II (i.e. simply quantise and write coefficients with a fixed amount of bits), this new codec is more akin to AAC LC, as it apparently has long/short frame division and (at least two) Huffman tables for coefficients. Another change is that Bink audio was stored in simple container formats with a fixed structure, while the new format looks more like an elementary stream with as few bits as possible spent on coding the various fields. Let’s see how it turns out…

A word about some JPEG-based codecs

Wednesday, March 4th, 2026

As I mentioned previously, I don’t want to work on game codecs for a while, so I picked other stuff instead. For instance, I’ve fixed a bug in my Indeo 3 encoder (which nobody uses, myself included), refactored some NihAV code and added a couple of decoders.

First, there is the PDQ2 decoder. It actually turned out to be slightly more interesting than I expected, as there’s a version of the codec for the Saturn version of the game (not that different, yet somewhat different nevertheless).

Then I finally managed to figure out MoviePak. After locating a decompressor binary and trying to disassemble it as raw M68k code, I could locate enough code and data to figure out that, since it uses JPEG codebooks, it’s likely to be based on JPEG. And after learning a bit more about how it works and NOPing out the A-line _StripAddress calls (that was the most interesting part actually—you can’t find it without knowing the terminology, and even then just barely; I was lucky to try searching for information about the 68881, learning the terminology and finally finding a reference of Macintosh Toolbox calls), I managed to make Ghidra happy enough to produce usable decompiled output. After that it was just a question of re-using the JPEG decoder with a slightly different header format, which finally made me refactor JPEG decoding.

And since I factored out common JPEG decoding support for MoviePak, Radius Studio and the actual motion JPEG decoder, I decided to add a video codec for effects used in the proDAD Adorage video editor (its FOURCC is pDAD, quite surprisingly). This one could be reverse engineered just by looking at the frame data. Each frame consists of a 20-byte header, a JPEG image, plus an alpha channel coded as PNG or JPEG. Since I have not bothered to write a motion PNG decoder (it’s very uncommon after all, and I’m not NIHing an image library—at least not yet), I decode just the first part. Maybe I’ll add PNG decoder support and revisit this decoder for proper alpha support, but for now that seems unlikely.

I’ll keep working on some boring stuff nobody cares about but maybe I’ll have a codec or two to write about as well.

Random NihAV news

Sunday, February 8th, 2026

As I mentioned previously, after making the 0.5.0 na_game_tool release I’d rather work on something else for a change (if on anything at all), so here are some bits about what I’ve been doing since:

  • first of all, I’ve implemented support for AV, a Polished AVI format. By that I mean the format apparently comes from a Polish company and is a simplified remux of AVI data: first there’s BITMAPINFOHEADER (and optionally a palette) followed by a video stream header (starting with the familiar vids tag), then there may be a WAVEFORMAT structure with an auds stream header, then you have a table of contents (video and audio frame sizes plus flags) followed by video and audio chunks. The format was trivial to guess and I’ve added support for it because why not;
  • I’ve also finally implemented functions for reading arrays of integers. First I introduced them to na_game_tool and tried them in some decoders, and then ported them to the main NihAV codebase. The idea behind them is that reading data directly into the destination array (with optional byte-swapping) is faster than reading data, re-interpreting it as an integer and finally copying it into the destination buffer. I had a specific version of this implemented in the MP4 demuxer already (because otherwise opening a 2–3 hour long video would cause a noticeable delay) but overall it’s nicer to have just one call instead of a loop;
  • in other things, I’ve re-started na_eofdec development using the current na_game_tool codebase. That does not mean I’m starting to develop it right away (or going to do so in the nearest future), but at least when I eventually get to it I can add some archive extraction modules as well. Besides that, it should be a sandbox for testing the MOV muxer that I want to write sooner or later (kinda like the OpenDML AVI muxer in NihAV is a backport of the one from na_game_tool). Of course it would need to get a couple more formats to test it on, but I’m not in a hurry;
  • speaking of MOV, I’ve improved support for MOV in MacBinary II. So far I’ve seen four flavours of it: essentially flat MOV with a MacBin header, MOV with the mdat box in the data fork and the moov atom in the resource fork, the same but with the data fork containing mdat contents without size and tag, and an even older format (with samples from 1990) with a much flatter structure (i.e. lots of the nesting chunks are not present at all). The first three are detected and handled more or less fine now; I’ll try to support the last one too, even if only as a historical curiosity;
  • there was one major code refactoring in NihAV, namely I’ve put the demuxer handling code into a new nihav_hlblocks crate and made all my tools use it instead of dragging along local copies. If you wonder why this was necessary, that’s because you can have normal demuxers (that return full packets for the streams) as well as raw stream demuxers (that return pieces of data belonging to a stream, from which you need to form full packets using a packetiser). And of course you can have plain raw streams like MP3 that need a packetiser but no demuxer. That’s why I’ve made a DemuxerObject (yes, very original name) that encapsulates them all and presents all of them as a normal demuxer;
  • and finally I’ve discovered that I can call SDL_UpdateYUVTexture instead of copying frame data into the texture manually. And then I discovered that it did not work with the time display (because that also wrote to the texture directly in the presumption that it would update some pixels on the frame; there is a note in the documentation telling you not to do so, but it worked by chance before). So I’ve changed it to render a separate small RGBA texture and blit it over the frame instead—like it should’ve been done from the start. I somewhat wonder when I’ll have to adapt it all to SDL3, but apparently I can postpone that for now.
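The integer-array reading mentioned in the list above can be sketched like this (the name and signature are mine, not the actual NihAV API):

```rust
// Read a slice of raw bytes as an array of big-endian 32-bit integers in one
// call, instead of looping with a per-element read.
fn read_u32be_array(src: &[u8], dst: &mut [u32]) {
    for (chunk, out) in src.chunks_exact(4).zip(dst.iter_mut()) {
        *out = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
}
```

For native-endian data this can degenerate into a plain bulk copy, which is where the speed-up for multi-hour MP4 sample tables comes from.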

That’s all for now. There’s a lot to do, maybe some of it will actually be done.

na_game_tool 0.5.0 released

Monday, January 19th, 2026

Grab it from the usual place if you’re interested in it for some unfathomable reason. I’ll try to document the formats when I have time (and multimedia.cx server does not suffer from crawlers).

Update from January 30: I’ve uploaded a fixed version as I broke extracting old-style Cryo archives during refactoring (and before committing), so now it should work. The rest will (not) work in the same way as before.

na_game_tool: final stretch before 0.5.0

Thursday, January 15th, 2026

As I’m somewhat tired of na_game_tool development (for now), I’m just picking bits for the release goal so I can do that and switch to something else (I’ve found a couple of interesting MOV files to analyse, maybe I can improve my Indeo 3 encoder and such). I’ve added support for the HNM5 and HNM6 formats, reached the self-imposed limit of at least twelve original decoders, and essentially I only need to add the 7th Level archive format and that’s it. Meanwhile I can talk about the formats I added support for, the formats I want to add support for (but not right now) and the formats I’m hesitant to add support for.

So, supported formats first.

There’s the GRN format used in the game Genesia. Frames are stored raw or compressed with RLE (which looks a lot like FLI delta RLE compression but working on pixel pairs), and the data may optionally be compressed with LZSS afterwards. And it wouldn’t be a format from a French company without some weirdness. In this case the audio stream data is interleaved with video in the simplest way: first there’s a 2kB header, then a certain number of 2kB sectors with initial audio data (the number is in the header), then it’s 8kB of video stream followed by 2kB of audio followed by 8kB of video stream and so on. So video frame data may suddenly have 2kB of audio inside it. Of course it was easily solved with a custom de-interleaver, but you’d expect to see that in industry-standard applications and not in a game from the early 1990s.
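A de-interleaver for that layout is short enough to sketch (the sector sizes follow the description above; everything else, including the function shape, is illustrative):

```rust
// Split a GRN-style file into its video and audio streams: a 2kB header,
// `audio_sectors` initial 2kB audio sectors (count taken from the header),
// then alternating 8kB video / 2kB audio pieces.
fn deinterleave(data: &[u8], audio_sectors: usize) -> (Vec<u8>, Vec<u8>) {
    const HDR: usize = 2048;
    const AUD: usize = 2048;
    const VID: usize = 8192;
    let mut video = Vec::new();
    let mut audio = Vec::new();
    let mut pos = HDR + audio_sectors * AUD;
    audio.extend_from_slice(&data[HDR..pos]);
    while pos < data.len() {
        let vend = (pos + VID).min(data.len());
        video.extend_from_slice(&data[pos..vend]);
        pos = vend;
        let aend = (pos + AUD).min(data.len());
        audio.extend_from_slice(&data[pos..aend]);
        pos = aend;
    }
    (video, audio)
}
```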

And there’s MGIF from Gates of Skeldal. Despite the name it was only inspired by GIF in the sense that it uses a similar LZW compression scheme but not the format (unlike HAF). Another curious detail that I haven’t seen elsewhere is the use of simple pre-processing: bytes are coded as a difference to the previous ones. What made it annoying is that this prediction resets when the LZW decoder resets (i.e. when the “reset dictionary” code is encountered), so you have to implement it inside the LZW decompressor instead of making a pass afterwards. Still, it gets a point for originality.
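The prediction itself is trivial; the annoyance is the reset coupling. Here is a sketch with the reset points passed in explicitly (in the real decoder they fall out of the “reset dictionary” LZW code, and the predictor restarting from zero is my assumption):

```rust
// Undo byte-difference prediction: each output byte is the previous output
// byte plus the coded delta; the predictor restarts at the given positions.
fn undelta(deltas: &[u8], resets: &[usize]) -> Vec<u8> {
    let mut out = Vec::with_capacity(deltas.len());
    let mut prev = 0u8;
    for (i, &d) in deltas.iter().enumerate() {
        if resets.contains(&i) {
            prev = 0; // assumed: predictor restarts together with the dictionary
        }
        prev = prev.wrapping_add(d);
        out.push(prev);
    }
    out
}
```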

Finally there is the Celestial Impact intro.dxv. This file employs RLE to compress not merely the image but also its palette and some additional data (it did not look like sound to me and I have no idea what it is) as a single array.

Now for the formats I want to support in some (distant) future. Besides the usual “whatever I can find” it definitely includes AVI and SMS—two Smacker-based formats. Maybe I can implement some Cinepak-based console formats while I’m at it.

And speaking of console formats, here’s a fun format called DDV, used in Oddworld games (or maybe just one of them). There’s a reverse-engineered implementation of the decoder, and by the look of it this format encapsulates MDEC—which makes it a perfect candidate for librempeg. It has support for some MDEC-based formats already, after all.

That’s it, hopefully the next post will be about the release already.

na_game_tool: more FLICy formats

Saturday, December 27th, 2025

I’ve added another bunch of formats to na_game_tool, including both old video formats and game archive support to extract some of those (and other) formats from.

For example, I’ve finally added an extractor for the Conquest Earth WAD with its 16-bit FLH (a variant of FLI with RNC compression; it has been supported since version 0.2.0), but there are other variants that justify this post’s title:

  • Alien Virus animation—almost standard FLI with a changed tag and some jitter in the subchunk sizes (i.e. they may be a byte too short or a byte too long compared to the chunk size);
  • Bureau 13—here it is a super-format with chunks comprising FLI headers, FLI chunks or PCM audio—and it may contain several FLIs with different resolutions too. And it is put into its own archive format (which I also added an extractor for);
  • C13 for Hammer of the Gods—at first I thought it was just yet another hack of the format, but after looking closer at the data I realised it’s merely LZSS-compressed. Essentially I just hacked my FLI decoder to decompress the data first and operate on the decoded data instead of the file in this case;
  • Stargunner FLC—like other files in the game archive it was compressed using byte-pair encoding (and it may be the only case of this method being used for compression in the wild). In this case the file is split into small chunks, and unused byte values are used to code pairs (of used byte values or other pairs). Adding support for it was as trivial as in the previous case, but the fun thing here is that after I figured out the decompression algorithm I found out that it’s been known and supported by various extraction tools all along; it’s just that discmaster picked up the laziest one.
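A minimal byte-pair decoder in the spirit of the Stargunner scheme from the list above might look like this (the pair-table representation is illustrative, not the real on-disk format):

```rust
// Expand byte-pair-encoded data: byte values with an entry in `pairs` stand
// for two values (whose halves may themselves be pairs), expanded here with
// an explicit stack; values without an entry are literals.
fn bpe_expand(data: &[u8], pairs: &[Option<(u8, u8)>; 256]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut stack = Vec::new();
    for &b in data {
        stack.push(b);
        while let Some(cur) = stack.pop() {
            match pairs[cur as usize] {
                Some((first, second)) => {
                    // push in reverse so `first` gets expanded first
                    stack.push(second);
                    stack.push(first);
                }
                None => out.push(cur),
            }
        }
    }
    out
}
```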

Besides that I’ve REd the HUF format from Johnny Bazookatone, which employs RLE compression and then further compresses the video frame data with global Huffman codes. And I’ve also added support for the Goosebumps: Escape from Horrorland archive, among other things. The curious thing about it is that such archives contain .gvd files which are essentially remuxed AVIs, so my plug-in reconstructs AVIs from them by default (but this can be disabled if you prefer the original files instead). Almost all of the files use Indeo 3, but a couple of them seem to use their own codec with no decoder available, so there’s that.

In general I still have at least four original formats to add (plus HNM6 and some extractors), but there seem to be enough candidate games to RE for it to be feasible.

P.S. I also looked at the SMV format from AGON: The Lost Sword of Toledo, but after looking at the binary specification and discovering an MPEG-4 ASP decoder there (also, if you remove the SMV header it can be played as an elementary MPEG-4 ASP stream) I lost any interest. Maybe Paul will have some interest in supporting it in librempeg, maybe not—and I definitely don’t want to mess with such a format (the same applies to KSV as well).

These weeks in na_game_tool

Sunday, December 14th, 2025

Last time I talked about MVI formats, mentioning that I had one more, equally German, MVI format. Well, the official specification uses the CauseWay DOS extender, which compresses executables (and I still haven’t found a way to make the DosBox debugger dump loaded 32-bit segments, otherwise I would’ve REd a bunch of Psygnosis formats too). Luckily the demo version did not use it and I was able to find out how it is coded. Apparently they employed some non-standard LZW which for some reason shuffled the low codes, so 0/1/2 are used as special signals—dictionary bump (followed by byte aligning for some reason; other implementations bump the dictionary size implicitly and do not insert bits), dictionary reset and EOF correspondingly—while codes 3..257 are used to code bytes. Besides that it’s nothing special: intra frames are (optionally) LZW-compressed, while inter frames consist of pixels and a mask telling where on the frame to update those pixels (both parts with optional LZW compression).

Speaking of LZW, there was this FLK format (apparently Italian) that also employed LZW to compress pixel data, along with a command stream with rather simple commands like “repeat pixel N times”, “skip next M pixels”, “copy following P pixels” and additionally “restore Q pixels of the background”. As you can guess, the format is used for animations overlaid on something else, in addition to plain video clips. In the latter case the command stream may be absent and you simply put the decompressed pixels into a new frame.
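A hypothetical rendering of that command stream might look as follows (the enum encoding is mine; the actual bitstream layout is certainly different):

```rust
// The four FLK-style operations described above.
#[derive(Clone, Copy)]
enum FlkCmd {
    Repeat(u8, usize), // repeat one pixel value N times
    Skip(usize),       // leave M output pixels as they are
    Copy(usize),       // take P pixels from the LZW-decompressed data
    Restore(usize),    // take Q pixels from the background
}

// Apply a command list to an output line, reading pixel data sequentially.
fn run_cmds(cmds: &[FlkCmd], pixels: &[u8], background: &[u8], out: &mut [u8]) {
    let mut pos = 0; // write position in the output
    let mut src = 0; // read position in the decompressed pixel data
    for cmd in cmds {
        match *cmd {
            FlkCmd::Repeat(px, n) => {
                out[pos..pos + n].fill(px);
                pos += n;
            }
            FlkCmd::Skip(n) => pos += n, // whatever was there shows through
            FlkCmd::Copy(n) => {
                out[pos..pos + n].copy_from_slice(&pixels[src..src + n]);
                src += n;
                pos += n;
            }
            FlkCmd::Restore(n) => {
                out[pos..pos + n].copy_from_slice(&background[pos..pos + n]);
                pos += n;
            }
        }
    }
}
```

The skip/restore pair is what makes the overlay use case work: skipped pixels keep the previous frame while restored ones punch through to the background.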

Other codecs are Ascon SKS (which is essentially a collection of JPEGs with swapped chroma planes; I did it mostly as a test of the JPEG decoder module that I plan to use in other decoders) and Interactive Pictures VID (which I encountered in .evd, .fvd and .gvd files—for files with the English version of the video, the French version, and a general version without speech). This one is curious not only because its binary specification for some reason contains a compressor for it (which I encountered first) but also for the format features themselves. Images are split into 8×8 blocks that may be coded raw, with 2–16 colours and a pattern, or with a custom-scan RLE I remember seeing in Bink (and XCF, but like Bink there are 16 patterns here and not just four). Probably nobody cares about it but it was fun to discover.

In addition to that I’ve implemented support for a couple of well-known formats, namely HNM5 (aka UBB2—UBB is apparently the same but lacks headers) and RoQ. The former is a typical French format (you can tell by the fact it uses crazy motion compensation modes with mirroring and transposing); I implemented it mostly for completeness’ sake (so I have support for all HNM flavours in my tool—all that’s left is HNM6). The latter has the same rationale as when I looked at it four years ago: its support in libavformat/libavcodec sucks, i.e. it’s adequate for videos intended for the id Tech 4 engine but not for the original Trilobyte games (not handling JPEG-compressed frames and not descending into the 0x1030 chunk are the two most glaring deficiencies). Plus there is a can of worms related to the fact that audio in Trilobyte games uses an unsigned predictor while id Tech games use a signed one, and to the fact that for some files you need to scale motion vectors twice (and for some files it should be done on a picture scaled twice vertically). I try my best to detect such situations but it does not always work correctly (and if you wonder, in the games the engine may have a hard-coded list of files that should be treated differently).

And that is actually not all! I’m still working on supporting some game archives as well (for example, the majority of RoQ files I encountered were hidden in .gjd archives, same as VDX files). One interesting example is the Escal compressor. One funny aspect of it is that it is employed in Pumuckl Klabauterjagd, a game for about 8-year-olds, while the rest of the titles using it are definitely 18+. From a technical point of view it’s interesting as it seems to be inspired by the dynamic variant of deflate, with some simplifications: instead of blocks you have a codebook definition valid only for the next specified number of symbols (which are still combined literal and copy symbols, with their codebook coded using code lengths packed with another codebook). Overall it’s distinct enough, but the source of inspiration is obvious too.

Overall it means I have only four more original decoders to create, plus some of the archive formats to support (like Coktel Vision STK/STK2 as well as the previously mentioned extractor/converter for Monty Python & the Quest for the Holy Grail). It will probably be the last release too, as I’m not sure I’ll have enough game formats for another one. Well, there’s also na_eofdec, which should get more Amiga formats before its 0.1.0 release. And in theory there are console formats, if Paul has not written decoders for them all.