Archive for the ‘Various Video Codecs’ Category

We have FLI at home

Friday, May 29th, 2026

Recently I’ve released na_eofdec, a tool for decoding exotic and/or obscure formats. That release included F16 format support, but recently I’ve REd another one (PC Animate Plus / 3D WorkShop animation) and there’s yet another one waiting in the queue (Reflections animation). What unites them all is that they all employ simple compression schemes (mostly RLE-based) and (beside F16) they all have a 3D modelling program associated with them. And I’ve REd these formats by investigating the file format, they’re that simple.

F16 posed itself as “FLI but 16-bit” and it looks like its creators have failed to build an ecosystem out of it. I have encountered merely two demo samples at discmaster and nothing else. From technical point of view it’s either uncompressed intra frames or the rather familiar FLI delta compression scheme with its number of skip/run opcodes per line, just with small variations.

PC Animate Plus is more interesting as it has 4-, 8- and 16-bit content compressed, at least two format versions and several compression schemes. Plus it has some additional chunks for complex operations and even metadata telling e.g. which Voc file to play along. Intra frames are RLE-compressed, inter frames usually employ FLI delta compression (with small changes of course) but there’s another mode consisting of offsets and chunks of data to update. Another interesting thing is that it does not replace old pixels but XORs them with new data instead (maybe it comes from an alternative universe where 80x86 had REP XORMOVSB instruction).
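To illustrate the XOR idea, here is a minimal Python sketch. It assumes the update list has already been parsed into (offset, data) pairs and that the frame is a flat byte buffer; the function name and that representation are made up for the example and say nothing about how the file actually stores things.

```python
def apply_xor_updates(frame: bytearray, updates) -> None:
    """XOR each chunk of new data into the previous frame, in place.

    `updates` is an iterable of (offset, data) pairs; this is a
    hypothetical in-memory form, not the on-disk layout.
    """
    for offset, data in updates:
        if offset < 0 or offset + len(data) > len(frame):
            raise ValueError("update lies outside of the frame")
        for i, value in enumerate(data):
            frame[offset + i] ^= value


frame = bytearray([0x10, 0x20, 0x30, 0x40])
update = [(1, bytes([0x0F, 0xFF]))]
apply_xor_updates(frame, update)   # frame is now 10 2F CF 40
```

A side effect of this scheme is that applying the same update a second time restores the previous frame.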

And there’s Reflections animation. While I haven’t written a decoder for it yet, I can describe it already. I’m aware of three samples with rather uncommon 320×256 resolution and big-endian format. First frame is uncompressed, the rest seems to be simply “skip N 32-bit words, update M 32-bit words”. Writing a decoder for it should not be that hard… An update from the next day: it’s simple RLE with opcodes being skip/repeat/copy but the actual data is stored in 4-pixel columns format, so e.g. copying 8 pixel quads will result in 4×8 rectangle.

Individually the formats are nothing to write about but together they form a group of FLI clones that poses some interest. Now that I’m done with MVS it’s either extending QuickTime support in NihAV or REing obscure formats, hopefully it will give me more material for my writing.

NihAV: QT support enhancements

Friday, May 8th, 2026

When I have enough inspiration, I improve NihAV. When I don’t (which is more common state to me), I RE codecs or write blog posts—so here’s one.

First of all, I’ve started adding non-raw encoders for some common QT formats. It’s not that there are no open-source encoders for them, but I do them mostly to find out how it is done and maybe learn something new in the process. For instance, RLE encoding combines skips, runs and pixel copies; this raises the question of optimal encoding as sometimes it may be cheaper to encode a whole area as new pixels instead of a mix of copy+skip+copy. So I’ve implemented a greedy approach (i.e. code longest skip or run and fall back to encoding raw if those two fail) as well as slow but optimal one. It’s a variation of trellis coding: just calculate encoding cost with each mode (skip/run/raw) to all next possible positions and if it’s lower than the existing one, use that mode; at the end simply trace back the decisions that gave least cost at the end and encode them in the right order.
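The optimal search can be sketched in Python roughly like this. It is not the NihAV code: the costs (one byte per opcode, one byte per pixel) and the 127-pixel limit per operation are assumptions picked to keep the example small, and real QT RLE has its own opcode sizes.

```python
def optimal_rle(cur, prev, max_len=127):
    """Return (total_cost, ops) for the cheapest skip/run/raw parse of a line.

    Assumed costs: skip = 1, run = 1 + one pixel, raw = 1 + N pixels.
    """
    n = len(cur)
    cost = [float("inf")] * (n + 1)   # cheapest way to reach each position
    back = [None] * (n + 1)           # (mode, length) of the op ending there
    cost[0] = 0

    def relax(pos, length, mode, op_cost):
        end = pos + length
        if cost[pos] + op_cost < cost[end]:
            cost[end] = cost[pos] + op_cost
            back[end] = (mode, length)

    for pos in range(n):
        limit = min(max_len, n - pos)
        # skip: pixels unchanged from the previous frame
        length = 0
        while length < limit and cur[pos + length] == prev[pos + length]:
            length += 1
            relax(pos, length, "skip", 1)
        # run: the same pixel repeated
        length = 0
        while length < limit and cur[pos + length] == cur[pos]:
            length += 1
            relax(pos, length, "run", 2)
        # raw: always possible, so every position is reachable
        for length in range(1, limit + 1):
            relax(pos, length, "raw", 1 + length)

    # trace back the decisions from the end and put them in coding order
    ops = []
    pos = n
    while pos > 0:
        mode, length = back[pos]
        ops.append((mode, length))
        pos -= length
    ops.reverse()
    return cost[n], ops


mixed = optimal_rle([1, 1, 5, 5, 5, 5, 1, 1], [1] * 8)
all_raw = optimal_rle([1, 2, 9, 3, 4], [0, 0, 9, 0, 0])
```

In the second call a greedy coder would emit raw+skip+raw (7 bytes under these costs) while the search settles on a single raw run of five pixels (6 bytes).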

Then I also added RPZA encoder. This is essentially the first texture codec before GPUs with the need for texture compression, its main compression mode is encoding 4×4 block with four colours where two colours are linearly interpolated from two explicitly transmitted colours. There is no apparent way on how to do it fast, so I ended up with an extremely simplified scheme: first I calculate the maximum difference between components and pick the one with the largest difference (or code block as single-colour if it’s small enough) to decide what values to pick, then I calculate explicit colours from an average of input pixels close to minimum and maximum ends of that range. I also have a refinement step by running vector quantisation loop to adjust the ends but it’s rarely needed in practice.
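A rough Python sketch of that simplified endpoint search, without the VQ refinement. The flatness threshold, the "within a quarter of the range" rule and the plain 1/3 and 2/3 interpolation weights are all assumptions for illustration; it also works on 8-bit components, while RPZA itself stores RGB555 and the decoder has its own rounding.

```python
def average(pixels):
    """Component-wise integer average of a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) // n for c in range(3))


def pick_endpoints(block, flat_threshold=8):
    """block: 16 (r, g, b) tuples. One colour for a flat block, else two."""
    # find the component with the largest spread
    ranges = []
    for c in range(3):
        values = [px[c] for px in block]
        ranges.append((max(values) - min(values), c))
    spread, comp = max(ranges)
    if spread <= flat_threshold:
        return [average(block)]
    lo = min(px[comp] for px in block)
    hi = lo + spread
    quarter = spread / 4
    # average the pixels sitting near each end of the range
    low_pixels = [px for px in block if px[comp] <= lo + quarter]
    high_pixels = [px for px in block if px[comp] >= hi - quarter]
    return [average(low_pixels), average(high_pixels)]


def four_colours(a, b):
    """Two explicit colours plus two interpolated ones (assumed weights)."""
    mid1 = tuple((2 * x + y) // 3 for x, y in zip(a, b))
    mid2 = tuple((x + 2 * y) // 3 for x, y in zip(a, b))
    return [a, mid1, mid2, b]


block = [(0, 0, 0)] * 8 + [(90, 30, 0)] * 8
ends = pick_endpoints(block)
palette = four_colours(*ends)
```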

There are still more encoders to implement (SMC, SVQ1, IMA ADPCM and MACE) but none of them is interesting beside SVQ1, so probably I’ll write about it when/if I ever get to implementing an encoder for it (it does not matter if The Multimedia Mike has done that over fifteen years ago—NIH is there in the project name for a reason).

Now, surprisingly enough I’ve improved decoding support as well. The original QuickTime had SIVQ codec which is a straightforward 256-entry codebook for 2×2 RGB24 tiles followed by codebook indices. I had read its binary specification some time ago and recently I was able to locate (probably the only existing) sample for it, which is a good reason to write a decoder for it. It was well-spent five minutes of my time. Maybe in the future I’ll also do something about Pixar codecs (Ghidra works better with raw m68k version of the decoder than with 16-bit Windows 3.x version of the same).
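For the curious, a decoder along those lines fits in a few lines of Python. This is a sketch under assumptions: one byte per index, tiles stored in raster order, and the four pixels of a tile stored row by row (top pair, then bottom pair). The function name is made up.

```python
def decode_sivq(data, width, height, entries=256):
    """Decode a codebook of 2x2 RGB24 tiles followed by one index per tile."""
    tile_bytes = 2 * 2 * 3
    cb_size = entries * tile_bytes
    codebook = [data[i:i + tile_bytes] for i in range(0, cb_size, tile_bytes)]
    indices = data[cb_size:]
    stride = width * 3
    frame = bytearray(stride * height)
    tiles_per_row = width // 2
    for t in range(tiles_per_row * (height // 2)):
        tile = codebook[indices[t]]
        x = (t % tiles_per_row) * 2
        y = (t // tiles_per_row) * 2
        top = y * stride + x * 3
        frame[top:top + 6] = tile[0:6]                     # upper two pixels
        frame[top + stride:top + stride + 6] = tile[6:12]  # lower two pixels
    return frame


# toy input: codebook entry i is filled with byte value i, two tiles coded
toy = b"".join(bytes([i]) * 12 for i in range(256)) + bytes([1, 2])
picture = decode_sivq(toy, 4, 2)
```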

And finally I’ve improved the support for multi-descriptor MOV files. I mentioned it some time ago and I got bitten by it again recently. For example, alice_lo_m.mov from samples.mplayerhq.hu got just first frame decoded for me and many QuickTime 1.5 sample videos (with its developers) gave an error on the last frame. For the former it’s because first frame is JPEG and the rest of them are SVQ1, while the latter samples are coded with Cinepak but the last frame may be a special one encoded with RPZA. And there was another file fully encoded with RPZA—but with the majority of it being 160×120 while last dozen of frames or so were 320×240. So I finally got annoyed enough to implement multiple streams per track so at least the frames get marshalled to the correct decoder, even if it leads to the partial streams being rather unusable. Maybe one day I’ll write a tool which will walk through MOV and render all tracks in correct sequence (taking edit lists into account), scaling and adjusting playback rate as needed, producing a raw MOV file that can be played without special hacks; or maybe I don’t hate myself that much.

That’s it for now, don’t expect anything soon (MVS description may appear but who’s waiting for that?).

Cinepak’s long-lost brother?

Monday, April 20th, 2026

While discmaster is stagnating (you know whom to thank for the shortages of HDDs as well as RAM), I still look through stuff there in hopes to find something interesting. Occasionally I manage to stumble upon something special indeed.

This time it was navigable movies bundled with QuickTime 1.5 or so. Apparently the idea behind them is that all frames are actually tiles of a much larger picture that user can navigate without exhausting all RAM trying to decode it as one contiguous image. If you thought about ISO H.EIC (aka HEIF or AVIf depending on intraframe codec employed) you may be right, but also it got re-branded (and maybe enhanced a bit?) some time later as QT-VR (sometimes I think no matter how stupid modern multimedia idea is, it’s been implemented in QuickTime a couple decades ago).

Anyway, out of four such movies, one was recognised and converted by discmaster software, two were playable (with my player) after I hacked file type tag to be MooV instead of APPL, the last one could not be decoded at all because it was of an unknown type.

Luckily for me resource data of that movie contained the decoder in m68k binary format (sometimes I think no matter how stupid modern multimedia idea is, it’s been implemented in QuickTime a couple decades ago—or did I say that already?). And just by looking at the frame contents I knew it was worth REing as I could spot YUV codebook right at the beginning and it was definitely not Cinepak (or Compact Video as it was known back then). The name was “CDROM Video codec” with tag cdvc but that didn’t tell me much. The file was created in 1992 while Wickedpedia claims that SuperMac Compact Video was added to QT around that time as well.

Anyway, let’s move to the format details. The codec starts with 24-byte header followed by YUV or (theoretically) RGB24 codebook, then another 24-byte header (containing frame dimensions among other things) and finally data. Frame is split into 4×4 blocks and first there is an opcode sent containing block type and number of blocks (minus one) of that type. Blocks are known to have three types: 4 vectors per block, 1 vector per block (scaled 2x), or simply skip.

The concepts of codebook-based coding are the same as in Cinepak, even YUV conversion formula is almost the same (with simplified coefficients using multiplication/division by two only). The main difference is using just one codebook for everything and coding format—while Cinepak uses separate bit masks for block types, this codec uses opcodes (which is common for other fruity codecs). So this makes me wonder where this codec comes from and how it is related to Compact Video. Was it some kind of predecessor? Was it developed by Malus as a competition or based on the licensed technology? Why was it abandoned?

Even if I ended up with more questions, it was still a fun way to spend part of a Sunday (the rest of Sunday was spent travelling to/from Lower Ulm and it’s a differently fun way to spend Sundays; but that’s not the point here). Who knows, with a new search approach I may be able to uncover a couple more of ancient codecs to look at.

P.S. Another fun codec for very early QuickTime was SIVQ (that’s how it was called, I’ve failed to find anything but the decoder for it). It was simply 256-entry 2×2 codebook (in RGB24 format) followed by codebook indices. Probably the name stands for “SImple Vector Quantisation”. That makes it the third proper VQ codec in QT (SMC is a slightly different beast; and RPZA is the first texture codec instead).

NihAV: OS/2 multimedia support

Friday, April 17th, 2026

In theory I should be documenting the codecs Paul has shared with me or MVS (did you know that it employs a rather interesting chroma subsampling method—coding three 8×8 blocks in a macroblock but chroma samples have less than a half of coefficients in zigzag order coded) but instead I’ll write about something nobody really cares about.

As some of you may know, RedHat had (at least) two multimedia formats developed: RLE-based PhotoMotion for RedHat PC (later licensed to American Laser Games that seems to extend it somewhat) and gradient-based UltiMotion codec for AVI (the format of choice for VfOS/2).

Since the codec is somewhat unique, I decided to write an encoder for it. This way I can re-encode e.g. some movieCD (a format hardly anybody remembers) to another obscure format nobody remembers exactly just because I can. But the main reason is to learn how it’s organised.

There are three distinctive features it has: shared chroma (i.e. 8×8 super-block can have just one pair of chroma samples instead of coding each 4×4 block with its own pair), quantised values (6-bit luma and 4-bit chroma) and of course gradients. Actually there are seven block coding modes and only half of them are gradient-based—the rest are more conventional skip block, scaled-down block (4 luma samples only), BTC (2 samples plus fill pattern) and raw block.

Gradients here are essentially filling the block in one of the directions a lot like intra prediction works in H.264 and later, the main differences being fewer angles and fill values being transmitted explicitly. Fun thing is that unlike other codecs there’s no easy way to transmit a flat block, you need to code it in some extended way as the simplest (“shallow”) mode codes a coarse gradient with two values, the second value being implicit N+1. More complicated (“LTC”) mode codes a fine gradient but allows only a four-colour combination present in 4096-entry codebook. There’s also an extended mode where you can code any values for a gradient.

This of course poses a challenge of finding a good gradient in reasonable time (because trying all 4096 combinations with all 16 directions may get a bit slow). For shallow coding it’s easier, you essentially have block split into two parts, so checking averages for those parts and seeing if they fit is enough. For LTC and extended mode I applied a similar trick by finding the averages of four samples used in each gradient angle and saw if they fit well enough (for LTC it was also checking that the samples are monotone increasing and checking only the more or less close codebook entries; probably there’s more of optimisation potential but I’m fine with it as is).

Actually I started it gradually: first by implementing simple “raw or skip blocks only” mode, then “any block type that does not introduce additional loss (beside YUV quantisation)” mode, then lossy mode, and finally fast-and-shitty mode. The idea behind the last one is to calculate block variance and to use that information to steer the block selection process (i.e. not try more complex block types on blocks with low variance). As you can guess from the name it did not work out that nicely (but it was about twice as fast). But overall lossy mode works rather well and by introducing distortion thresholds I can vary output file size significantly (3-5 times smaller and still not being a block soup). I’m not going to bother with any rate control and overall I consider this experiment done.

In conclusion I’d like to write something but nothing comes to my mind. Stay tuned for more stuff nobody cares about (like obscure codecs or my experiments with palettisation).

On fruity MVS codec

Saturday, April 11th, 2026

I could be writing about RedHat video encoder I just finished or work on REing DiVID1x on Paul’s request, but this was earlier in my queue.

Apparently on iVNC protocol there’s an option to use a custom iCodec for that. Since I was asked to look at it, here are my preliminary findings (more detailed bitstream description will follow eventually).

So packet starts with a byte telling payload type (0 – intra frame, 1 – inter frame, 2 – custom quantisation matrices for luma and chroma, 64 bytes each). After that the rest of data follows.

Intra frames code a series of tiles with tile metadata and actual tile content being separated into different parts. Frame data starts with two DCT quantisers followed by 24-bit big-endian metadata part size, then there’s metadata, and finally it’s tile data.

Tile metadata codes 3-bit tile type and the number of tiles having that type (0000–1110 mean 1-15 tiles, 11110 means next 8-bit value plus sixteen, 111110 means next 15-bit value plus sixteen, 111111 means next 22-bit value plus sixteen). Tiles can be of the following types:

  • white tile—tile is completely filled white;
  • last match—previous(?) tile is copied;
  • upper match—tile above(?) is copied;
  • black and white—one bit per pixel (0 – black, 1 – white);
  • two-colour tile—almost the same but with two colours transmitted first (8-bit luma and 6-bit chroma values);
  • DCT—tile data is coded with ProRes-like DCT;
  • match tile—re-paint last recently used DCT tile;
  • cached tile—re-paint DCT tile with 16-bit index from LRU cache.
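The tile count code from the metadata can be read with something like the Python sketch below. It rests on my reading of the description: a 4-bit field where 0000 to 1110 give 1 to 15 tiles, with 1111 continued by 0, 10 or 11 to select the 8-, 15- or 22-bit escape (so the full prefixes are 11110, 111110 and 111111). Bits are assumed to be read MSB first, and the class and function names are made up.

```python
class BitReader:
    """Minimal MSB-first bit reader (bit order is an assumption)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_tile_count(br):
    value = br.read(4)
    if value < 15:              # 0000..1110 -> 1..15 tiles
        return value + 1
    if br.read(1) == 0:         # 11110: 8-bit value plus sixteen
        return br.read(8) + 16
    if br.read(1) == 0:         # 111110: 15-bit value plus sixteen
        return br.read(15) + 16
    return br.read(22) + 16     # 111111: 22-bit value plus sixteen


short_count = read_tile_count(BitReader(bytes([0xE0])))          # bits 1110
escaped_count = read_tile_count(BitReader(bytes([0xF0, 0x28])))  # 11110 + 00000101
```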

Inter frames start with two bytes telling the number of coded chroma coefficients and the rest is single bitstream with 2-bit tile type and whatever tile content is stored. Tile types are: skip, DCT, match tile, and cached tile. The first type should be obvious, the rest is probably the same as in intra frames.

Also frame data is supposed to end with "mvs\0" but I guess this matters only for people trying to write a compatible encoder (or checking that the data was decoded correctly).
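Putting the packet layout together, a first-stage splitter could look like the following Python sketch. The field sizes not stated above are guesses: I assume each DCT quantiser takes one byte and that the "mvs\0" marker sits at the very end of the packet. The function name and the dictionary keys are made up.

```python
def split_mvs_packet(packet):
    """Split a packet into its parts without decoding any tiles."""
    kind = packet[0]
    body = packet[1:]
    if kind == 0:  # intra frame
        quantisers = (body[0], body[1])              # assumed one byte each
        meta_size = int.from_bytes(body[2:5], "big")  # 24-bit big-endian
        return {
            "type": "intra",
            "quantisers": quantisers,
            "metadata": body[5:5 + meta_size],
            "tiles": body[5 + meta_size:],
            "has_trailer": body.endswith(b"mvs\0"),
        }
    if kind == 1:  # inter frame
        return {
            "type": "inter",
            "chroma_coeffs": (body[0], body[1]),
            "data": body[2:],
            "has_trailer": body.endswith(b"mvs\0"),
        }
    if kind == 2:  # custom quantisation matrices, 64 bytes each
        return {"type": "quant", "luma": body[:64], "chroma": body[64:128]}
    raise ValueError("unknown payload type %d" % kind)


intra = split_mvs_packet(
    bytes([0, 4, 6]) + (3).to_bytes(3, "big") + b"abc" + b"tiles" + b"mvs\0"
)
quant = split_mvs_packet(bytes([2]) + bytes(range(128)))
```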

See, it’s a rather simple codec, so hopefully I’ll clarify some things (like cache behaviour, YUV coefficients and actual DCT bitstream format), document it at The Wiki and move to something else.

(Ir)regular NihAV news

Monday, March 30th, 2026

Here’s another portion of news about what I’ve been doing in last couple of weeks. Hopefully I’ll be able to write about fruity MVS soon.

While most of what I did is not worth mentioning, there are some bits I find funny or curious enough to share.

First of all, I’ve added VAXL and (Clari)SSA decoding support to na_eofdec (just some more and there will be enough formats for 0.1.0 release). And I actually added them mostly to test an IFF parser (derived from my AVI parser and enhanced). The idea is simple: just point it to the data needed to be parsed and provide chunk handlers and it should do the trick. The main enhancement is also providing a condition when to stop—at the end, on any unknown chunk, before some pre-defined chunk, or right after such chunk has been parsed. This way I can control parsing process without relying too much on some special callbacks or states. For example, in VAXL that means I parse header by expecting VXHD chunk and nothing else. The rest of VAXL consists of time code, palette, image data, and audio samples chunks; there I simply stop after BMAP chunk and output reconstructed frame (and audio if present). In SSA (IFF variant, I added FEE one too for completeness sake) each frame is wrapped into FORM DLTA list so there I just parse sub-chunks ignoring unknown ones.
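The parser idea translates to something like this Python sketch (the real one is part of na_eofdec and certainly looks different). Chunk layout follows generic IFF: 4-byte tag, 32-bit big-endian size, payload padded to even length. The `Stop` enum, `parse_chunks` and the `XXXX` tag are invented for the example; VXHD and BMAP are the tags mentioned above.

```python
import struct
from enum import Enum


class Stop(Enum):
    AT_END = 0      # parse until data runs out
    ON_UNKNOWN = 1  # stop at the first chunk without a handler
    BEFORE = 2      # stop right before the chunk with stop_tag
    AFTER = 3       # stop right after the chunk with stop_tag was parsed


def parse_chunks(data, pos, handlers, stop=Stop.AT_END, stop_tag=None):
    """Walk chunks from pos, return the position where parsing stopped."""
    while pos + 8 <= len(data):
        tag, size = struct.unpack_from(">4sI", data, pos)
        if stop is Stop.BEFORE and tag == stop_tag:
            return pos
        handler = handlers.get(tag)
        if handler is None and stop is Stop.ON_UNKNOWN:
            return pos
        if handler is not None:
            handler(data[pos + 8:pos + 8 + size])
        pos += 8 + size + (size & 1)  # payload is padded to even size
        if stop is Stop.AFTER and tag == stop_tag:
            return pos
    return pos


def chunk(tag, payload):
    pad = b"\0" if len(payload) & 1 else b""
    return tag + struct.pack(">I", len(payload)) + payload + pad


data = (chunk(b"VXHD", b"hdr") + chunk(b"XXXX", b"??")
        + chunk(b"BMAP", b"pixels") + chunk(b"VXHD", b"next"))
seen = []
handlers = {
    b"VXHD": lambda payload: seen.append((b"VXHD", payload)),
    b"BMAP": lambda payload: seen.append((b"BMAP", payload)),
}
# one frame: handle known chunks, ignore unknown ones, stop after BMAP
frame_end = parse_chunks(data, 0, handlers, Stop.AFTER, b"BMAP")
```

Calling it again from `frame_end` would continue with the next frame, which is the whole point of returning the position instead of keeping state in callbacks.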

Then I’ve finally had another look at TCA and improved my decoder, going from less than half of samples I had being supported to virtually all of them. While I presumed that video is always compressed with LZW, in reality it also has RLE and raw modes. RLE mode is rather curious at that as it codes run+mode as 00 00 xx yy zz, 00 xx yy or xx number (i.e. if first one or two bytes are zero, read one or two bytes of the number too) with low bit signalling run or skip and the rest being its length (e.g. 55 means skip 42 bytes while 00 01 00 01 means repeating 01 128 times). But since the format is next to impossible to detect statically (no magic constants in the header, so you’d better check packet structure instead), it’ll remain a useless curiosity—just like I love it.
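The variable-length number in RLE mode can be read as in the Python sketch below. The two-byte case being big-endian follows from the 00 01 00 example; treating the three-byte case as big-endian too, and picking the long form only when the second byte is also zero, are my assumptions. The function name is made up.

```python
def read_rle_op(data, pos):
    """Return (is_skip, length, new_pos) for the opcode starting at pos."""
    if data[pos] != 0:                    # xx
        number = data[pos]
        pos += 1
    elif data[pos + 1] != 0:              # 00 xx yy
        number = int.from_bytes(data[pos + 1:pos + 3], "big")
        pos += 3
    else:                                 # 00 00 xx yy zz
        number = int.from_bytes(data[pos + 2:pos + 5], "big")
        pos += 5
    # low bit set means skip, clear means run; the rest is the length
    return bool(number & 1), number >> 1, pos


skip_op = read_rle_op(bytes([0x55]), 0)
run_data = bytes([0x00, 0x01, 0x00, 0x01])
run_op = read_rle_op(run_data, 0)
run_value = run_data[run_op[2]]  # the byte to repeat follows the number
```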

And most of the time I spent on writing MOV muxer. Since I’ve written one for na_eofdec already (raw output only) I had some base to start off at least. Of course it’s wonky and even copying data may or may not work depending on your luck, at least now I have a format to output variable framerate content losslessly (previously the only alternative for VFR content in NihAV was to encode data with RealVideo 4 and Cook). At least now, after all possible enhancements, I can either re-mux old-style MOV into a new one (and the output may actually be playable) or use nihav-encoder -i input.mov -o output.mov --profile raw to decode some more exotic format not supported by anything else (like MoviePak in MacBinary MOV) to a raw format that (hopefully) more tools can understand. I can also use my Cinepak or Indeo 3 encoders there (and Truemotion 1 too in the future just for lulz) and I should probably write encoders for other common QT codecs (even if there are other implementations, up to SVQ1 by The Multimedia Mike). This should keep me entertained for a while. And who knows, maybe it will serve as a base for MP4 muxer if the need for one arises.

That’s it for now, hopefully I’ll have more to write about later.

A word about some JPEG-based codecs

Wednesday, March 4th, 2026

As I mentioned previously, I don’t want to work on game codecs for a while, so I picked other stuff instead. For instance, I’ve fixed a bug in my Indeo 3 encoder (which nobody uses, myself included), refactored some NihAV code and added a couple of decoders.

First, there is PDQ2 decoder. It actually turned out to be slightly more interesting than I expected as there’s a version of the codec for Saturn version of the game (it’s not that different yet somewhat different nevertheless).

Then I’ve finally managed to figure out MoviePak. After locating a decompressor binary and trying to disassemble it as raw M68k code I could locate enough code and data to figure out that since it uses JPEG codebooks it’s likely to be based on JPEG. And after learning a bit more about how it works and NOPing out A-line _StripAddress calls (that was the most interesting part actually—you can’t find it without knowing terminology and even then just barely; I was lucky to try searching for information about 68881, learning terminology and finally finding a reference of Macintosh Toolbox calls) I managed to make Ghidra happy enough to produce a usable decompilation. After that it was just a question of re-using JPEG decoder with a slightly different header format, which finally made me do a refactoring of JPEG decoding.

And since I’ve factored out common JPEG decoding support for MoviePak, Radius Studio and actual motion JPEG decoder, I decided to add a video codec for effects used in proDAD Adorage video editor (its FOURCC is pDAD, quite surprisingly). This one could be reverse engineered just by looking at the frame data. Each frame consists of 20-byte header, JPEG image plus alpha channel coded as PNG or JPEG. Since I have not bothered to write a motion PNG decoder (it’s very uncommon after all and I’m not NIHing image library—at least not yet), I decode just the first part. Maybe I’ll add a PNG decoder support and re-visit this decoder for proper alpha support, but for now this seems unlikely.

I’ll keep working on some boring stuff nobody cares about but maybe I’ll have a codec or two to write about as well.

Some words about the oldest QuickTime

Thursday, February 19th, 2026

Since apparently discmaster.textfiles.com ran out of space (because storage is not cheap now thanks to “AI” companies), I have no new formats to be distracted with and have to resort to looking at the old ones.

As we all know, A**le single-handedly invented multimedia and the existence of IFF ANIM (that pre-dates it by a couple of years) should not confuse you. From what I could find, the earliest QuickTime samples can be found on Apple Reference & Presentation Library 8 CD-ROM that was intended for demonstration but not for the consumers. That disc actually contains three flavours of QT MOV: ordinary MOVs that can be decoded without problems, slightly older MOV that looks almost the same but has slightly different format (and features version -1 both in the header and in the metadata strings) and some extremely old files that do not correspond to the usual MOV structure. Those apparently come from the alpha version of QuickTime around year 1990 when it was still known by its code-name after USian artist of Ukrainian origin.

The first glaring difference is that atoms are not strictly nested but may contain some data beforehand. Essentially all atoms that normally have special header atoms have it before the rest. So instead of (pardon my S-expressions)

(moov mvhd (trak tkhd (mdia mdhd (minf …))))

there is

(moov [moov header data] (trak [trak header data] (mdia [mdia header data] [all codec description data])))

Data structure is flatter too. In release QT version data for the individual tracks is grouped into chunks, so the header describes both the chunks and what tracks use what chunks (and what are frame sizes in the chunk). Here frames are stored as (offset, size) pairs.

From the accompanying decoder I can see it supported raw video, RLE and motion JPEG at the time—fancy vector quantisation codecs came later.

I hope to support it one day but there’s no hurry. Currently I’m more or less done improving NihAV tools and want to add some practical demuxers, muxers (and maybe even impractical encoders like for Indeo 4 or RedHat Ultimotion). And there’s na_eofdec still waiting to collect enough supported formats for its first release (tangential fun fact: recently I’ve added support for Electric Image format there which is mostly a simple RLE but its frames apparently have timestamps in floating-point format like 0.266 or 0.5).

P.S. I’ve also discovered MoviePak QT codec which looks like JPEG variant. Maybe one day when I’ll have nothing better to do I’ll look closer at it, but for now I have no desire to reverse engineer M68k code in unknown format.

More codecs to look at

Wednesday, February 11th, 2026

Here’s a list of video codecs I found with the help of discmaster.textfiles.com (mostly by looking for .inf files with “vidc” in them). Maybe I’ll look at them one day, maybe somebody will beat me to it. In either case, it’s a nice reminder for myself.

  • CSMX.dll—reportedly used for CSM0 codec, was bundled with some player, no known samples. Considering its small size it’s likely to be either raw YUV format or lossless codec;
  • d3dgearcodec.dll—D3DGear lossless codec;
  • elsaeqkx.dll—some Elsa quick codec;
  • Esdll.dll—bundled with the same player, the strings inside it hint on it being an unholy mix of ZIP and MELP-based speech codec;
  • ICS422.DRV—S422 or SuperMatch YUV 4:2:2 codec (I suspect it’s raw YUV);
  • ICSMS0.DRV—SMS0 or SuperMatch VideoSpigot codec (from SuperMac, later bought by Radius);
  • MCTCOD.DRV—MCTPlus Draw driver, supposedly offering 2:1 compression and has a bunch of FOURCCs registered to it: DRAW, MC16, MC24, MR16, MR24, MY16, MY24;
  • MyFlashZip0.ax—MFZ0 lossless codec. Looks like ordinary deflate wrapper really;
  • NTCodec.dll—NewTek nt00 codec (I expect lossless or an intermediate codec). Looks like a simple fixed-fields packing;
  • RGBACodec.dll—apparently Lightworks lossless RGBA codec. Simple RLE inside;
  • Sx73p32.dll—Lucent Technologies SX7300P speech codec;
  • TRICODC.DRV—Trident draw codec with a bunch of compressed RGB and YUV formats: rtc3, rtc5, rtc6, ty0n, ty2d, ty2n, ty2c, r0y1… Compressed in this context means packed using fixed-length bitfields;
  • UCLZSS.DRV—Ulead LZSS codec aka uclz (obviously a simple lossless codec). As expected, it’s yuyv data compressed with LZSS;
  • UCYUVC.DRV—Ulead compressed YUV 411 aka yuvc. Apparently it’s just 8/12/16-bpp YUV (fixed packing scheme, and 16-bit is simply YUYV/YUY2 and such);
  • V422.DRV—Vitec Multimedia V422. Most likely simply raw YUV;
  • V655.DRV—Vitec Multimedia V655 (I’ll venture a guess this means 6-bit luma and 5-bit chroma instead of improbable YUV 6:5:5 subsampling);
  • VDCT.DRV—Vitec Multimedia VDCT.

Oh, and I’ve also found Eloquent elvid32.dll for EL02 FOURCC but it’s different since it has samples and it’s yet another H.263 rip-off.

That’s it for now, I’ll talk about FLAC video some other time.

Quicker look at QuickTime

Sunday, February 1st, 2026

Since I’m done looking at game formats for a while (I’ve updated na_game_tool 0.5.0 release with the fixed extraction of old-style Cryo Interactive archives BTW), I decided to look at my backlog of unsupported (either mis-detected as audio-only or not detected at all) MOV files from discmaster instead. Of course the majority of them are files missing their resource fork (and majority of the rest are poorly-recognised data+resource forks in MacBinary II format; that reminds me I should improve support for it), and majority of the rest are files that I support already (like Duck Truemotion 1 or Eidos Escape codecs). And a good deal of the rest are Flash in MOV (not going to touch it). And yet there are a couple of files worth talking about.

  • 911602.MOV—remarkable for having JPEG frame embedded in QuickDraw stream. It made me finally write a decoder for it (but without JPEG support, maybe one day…);
  • a couple of .flm files turned out to have obfuscated moov atom (with something almost, but not quite, entirely unlike TEA if you care). Not something I’d like to support;
  • La88110mN1Hz2_20-07.mov—it turned out to be raw YUV420 codec;
  • Omni_324.mov—compressed apparently by MicroWavelet on NeXT. Intriguing but I doubt anything else can be found about that codec;
  • Music Chase .tmv—a very weird little-endian MOV format with custom video codec (I think I mentioned it before but I haven’t progressed since);
  • there was one Flip4Mac-produced sample (which name eludes me), but considering that it stores ASF packets inside MOV it’s not something anybody is eager to support;
  • some undecodeable SVQ3 files—apparently they are encrypted;
  • and finally there are some QT Sprite files, which seem to be (often deflate-compressed) frames containing several atoms with commands and occasionally image data as well. Sounds too hairy to support but again, maybe one day…

With AVI the situation is somewhat better, though there are some poorly formatted and encrypted/obfuscated files as well (plus damaged files from AOL archives that start at random position, so contents may be AVI file with some additional garbage in the beginning or missing a few dozen kilobytes of initial data). Beside that I’ll probably implement PDQ2 decoder just for completeness sake. Eventually.