Archive for the ‘Various Video Codecs’ Category

Cinepak’s long-lost brother?

Monday, April 20th, 2026

While discmaster is stagnating (you know whom to thank for the shortages of HDDs as well as RAM), I still look through stuff there in hopes to find something interesting. Occasionally I manage to stumble upon something special indeed.

This time it was navigable movies bundled with QuickTime 1.5 or so. Apparently the idea behind them is that all frames are actually tiles of a much larger picture that user can navigate without exhausting all RAM trying to decode it as one contiguous image. If you thought about ISO H.EIC (aka HEIF or AVIf depending on intraframe codec employed) you may be right, but also it got re-branded (and maybe enhanced a bit?) some time later as QT-VR (sometimes I think no matter how stupid modern multimedia idea is, it’s been implemented in QuickTime a couple decades ago).

Anyway, out of four such movies, one was recognised and converted by discmaster software, two were playable (with my player) after I hacked file type tag to be MooV instead of APPL, the last one could not be decoded at all because it was of an unknown type.

Luckily for me resource data of that movie contained the decoder in m68k binary format (sometimes I think no matter how stupid modern multimedia idea is, it’s been implemented in QuickTime a couple decades ago—or did I say that already?). And just by looking at the frame contents I knew it was worth REing as I could spot YUV codebook right at the beginning and it was definitely not Cinepak (or Compact Video as it was known back then). The name was “CDROM Video codec” with tag cdvc but that didn’t tell me much. The file was created in 1992 while Wickedpedia claims that SuperMac Compact Video was added to QT around that time as well.

Anyway, let’s move to the format details. The codec starts with 24-byte header followed by YUV or (theoretically) RGB24 codebook, the another 24-byte header (containing frame dimensions among other things) and finally data. Frame is split into 4×4 blocks and first there is an opcode sent containing block type and number of blocks (minus one) of that type. Blocks are known to have three types: 4 vectors per blocks, 1 vector per block (scaled 2x), or simply skip.

The concepts of codebook-based coding are the same as in Cinepak, even YUV conversion formula is almost the same (with simplified coefficients using multiplication/division by two only). The main difference is using just one codebook for everything and coding format—while Cinepak uses separate bit masks for block types, this codec uses opcodes (which is common for other fruity codecs). So this makes me wonder where this codec comes from and how it is related to Compact Video. Was it some kind of predecessor? Was it developed by Malus as a competition or based on the licensed technology? Why was it abandoned?

Even if I ended up with more questions, it was still a fun way to spend a Sunday weekend (the rest of Sunday was spent travelling to/from Lower Ulm and it’s a differently fun way to spend Sundays; but that’s not the point here). Who knows, with a new search approach I may be able to uncover a couple more of ancient codecs to look at.

P.S. Another fun codec for very early QuickTime was SIVQ (that’s how it was called, I’ve failed to find anything but the decoder for it). It was simply 128-entry 2×2 codebook (in RGB24 format) followed by codebook indices. Probably the name stands for “SImple Vector Quantisation”. That makes it the third proper VQ codec in QT (SMC is a slightly different beast; and RPZA is the first texture codec instead).

NihAV: OS/2 multimedia support

Friday, April 17th, 2026

In theory I should be documenting the codes Paul has shared with me or MVS (did you know that it employs a rather interesting chroma subsampling method—coding three 8×8 blocks in a macroblock but chroma samples have less than a half of coefficients in zigzag order coded) but instead I’ll write about something nobody really cares about.

As some of you may know, RedHat had (at least) two multimedia formats developed: RLE-based PhotoMotion for RedHat PC (later licensed to American Laser Games that seems to extend it somewhat) and gradient-based UltiMotion codec for AVI (the format of choice for VfOS/2).

Since the codec is somewhat unique, I decided to write an encoder for it. This way I can re-encode e.g. some movieCD (a format hardly anybody remembers) to another obscure format nobody remembers exactly just because I can. But the main reason is to learn how it’s organised.

There are three distinctive features it has: shared chroma (i.e. 8×8 super-block can have just one pair of chroma samples instead of coding each 4×4 block with its own pair), quantised values (6-bit luma and 4-bit chroma) and of course gradients. Actually there are seven block coding modes and only half of them are gradient-based—the rest are more conventional skip block, scaled-down block (4 luma samples only), BTC (2 samples plus fill pattern) and raw block.

Gradients here are essentially filling the block in one of the direction a lot like intra prediction works in H.264 and later, the main differences being fewer angles and fill values being transmitted explicitly. Fun thing is that unlike other codecs there’s no easy way to transmit a flat block, you need to code it in some extended way as the simplest (“shallow”) mode codes a coarse gradient with two values, the second value being implicit N+1. More complicated (“LTC”) mode codes a fine gradient but allows only a four-colour combination present in 4096-entry codebook. There’s also an extended mode where you can code any values for a gradient.

This of course poses a challenge of finding a good gradient in reasonable time (because trying all 4096 combinations with all 16 directions may get a bit slow). For shallow coding it’s easier, you essentially have block split into two parts, so checking averages for those parts and seeing if they fit is enough. For LTC and extended mode I applied a similar trick by finding the averages of four samples used in each gradient angle and saw if they fit well enough (for LTC it was also checking that the samples are monotone increasing and checking only the more or less close codebook entries; probably there’s more of optimisation potential but I’m fine with it as is).

Actually I started it gradually: first by implementing simple “raw or skip blocks only” mode, then “any block type that does not introduce additional loss (beside YUV quantisation)” mode, then lossy mode, and finally fast-and-shitty mode. The idea behind the last one is to calculate blocks variance and to use that information to force block selection process (i.e. not try more complex block types on blocks with low variance). As you can guess from the name it did not work out that nice (but it was about twice as fast). But overall lossy mode works rather good and by introducing distortion thresholds I can vary output file size significantly (3-5 times smaller and still not being a block soup). I’m not going to bother with any rate control and overall I consider this experiment done.

In conclusion I’d like to write something but nothing comes to my mind. Stay tuned for more stuff nobody cares about (like obscure codecs or my experiments with palettisation).

On fruity MVS codec

Saturday, April 11th, 2026

I could be writing about RedHat video encoder I just finished or work on REing DiVID1x on Paul’s request, but this was earlier in my queue.

Apparently on iVNC protocol there’s an option to use a custom iCodec for that. Since I was asked to look at it, here are my preliminary findings (more detailed bitstream description will follow eventually).

So packet starts with a byte telling payload type (0 – intra frame, 1 – inter frame, 2 – custom quantisation matrices for luma and chroma, 64 bytes each). After that the rest of data follows.

Intra frames code a series of tiles with tile metadata and actual tile content being separated into different parts. Frame data starts with two DCT quantisers followed by 24-bit big-endian metadata part size, then there’s metadata, and finally it’s tile data.

Tile metadata codes 3-bit tile type and the number of tiles having that type (00001110 mean 1-15 tiles, 11110 means next 8-bit value plus sixteen, 111110 means next 15-bit value plus sixteen, 111110 means next 22-bit value plus sixteen). Tiles can be of the following types:

  • white tile—tile is completely filled white;
  • last match—previous(?) tile is copied;
  • upper match—tile above(?) is copied;
  • black and white—one bit per pixel (0 – black, 1 – white);
  • two-colour tile—almost the same but with two colours transmitted first (8-bit luma and 6-bit chroma values);
  • DCT—tile data is coded with ProRes-like DCT;
  • match tile—re-paint last recently used DCT tile;
  • cached tile—re-paint DCT tile with 16-bit index from LRU cache.

Inter frames start with two bytes telling the number of coded chroma coefficients and the rest is single bitstream with 2-bit tile type and whatever tile content is stored. Tile types are: skip, DCT, match tile, and cached tile. The first type should be obvious, the rest is probably the same as in intra frames.

Also frame data is supposed to end with "mvs\0" but I guess this matters only for people trying to write a compatible encoder (or checking that the data was decoded correctly).

See, it’s a rather simple codec, so hopefully I’ll clarify some things (like cache behaviour, YUV coefficients and actual DCT bitstream format), document it at The Wiki and move to something else.

(Ir)regular NihAV news

Monday, March 30th, 2026

Here’s another portion of news about what I’ve been doing in last couple of weeks. Hopefully I’ll be able to write about fruity MVS soon.

While most of what I did is not worth mentioning, there are some bits I find funny or curious enough to share.

First of all, I’ve added VAXL and (Clari)SSA decoding support to na_eofdec (just some more and there will be enough formats for 0.1.0 release). And I actually added them mostly to test an IFF parser (derived from my AVI parser and enhanced). The idea is simple: just point it to the data needed to be parsed and provide chunk handlers and it should do the trick. The main enhancement is also providing a condition when to stop—at the end, on any unknown chunk, before some pre-defined chunk, or right after such chunk has been parsed. This way I can control parsing process without relying too much on some special callbacks or states. For example, in VAXL that means I parse header by expecting VXHD chunk and nothing else. The rest of VAXL consists of time code, palette, image data, and audio samples chunks; there I simply stop after BMAP chunk and output reconstructed frame (and audio if present). In SSA (IFF variant, I added FEE one too for completeness sake) each frame is wrapped into FORM DLTA list so there I just parse sub-chunks ignoring unknown ones.

Then I’ve finally had another look at TCA and improved my decoder, going from less than half of samples I had being supported to virtually all of them. While I presumed that video is always compressed with LZW, in reality it also has RLE and raw modes. RLE mode is rather curious at that as it codes run+mode as 00 00 xx yy zz, 00 xx yy or xx number (i.e. if first one or two bytes are zero, read one or two bytes of the number too) with low bit signalling run or skip and the rest being its length (e.g. 55 means skip 42 bytes while 00 01 00 01 means repeating 01 128 times). But since the format is next to impossible to detect statically (no magic constants in the header, so you’d better check packet structure instead), it’ll remain a useless curiosity—just like I love it.

And most of the time I spent on writing MOV muxer. Since I’ve written one for na_eofdec already (raw output only) I had some base to start off at least. Of course it’s wonky and even copying data may or may not work depending on your luck, at least now I have a format to output variable framerate content losslessly (previously the only alternative for VFR content in NihAV was to encode data with RealVideo 4 and Cook). At least now, after all possible enhancements, I can either re-mux old-style MOV into a new one (and the output may actually be playable) or use nihav-encoder -i input.mov -o output.mov --profile raw to decode some more exotic format not supported by anything else (like MoviePak in MacBinary MOV) to a raw format that (hopefully) more tools can understand. I can also use my Cinepak or Indeo 3 encoders there (and Truemotion 1 too in the future just for lulz) and I should probably write encoders for other common QT codecs (even if there are other implementations, up to SVQ1 by The Multimedia Mike). This should keep me entertained for a while. And who knows, maybe it will serve as a base for MP4 muxer if the need for one arises.

That’s it for now, hopefully I’ll have more to write about later.

A word about some JPEG-based codecs

Wednesday, March 4th, 2026

As I mentioned previously, I don’t want to work on game codecs for a while, so I picked other stuff instead. For instance, I’ve fixed a bug in my Indeo 3 encoder (which nobody uses, myself included), refactored some NihAV code and added a couple of decoders.

First, there is PDQ2 decoder. It actually turned out to be slightly more interesting than I expected as there’s a version of the codec for Saturn version of the game (it’s not that different yet somewhat different nevertheless).

Then I’ve finally managed to figure out MoviePak. After locating a decompressor binary and trying to disassemble it as raw M68k code I could locate enough code and data to figure out that since it uses JPEG codebooks it’s likely to be based on JPEG. And after learning a bit more about how it works and NOPing out A-line _StripAddress calls (that was the most interesting part actually—you can’t find it without knowing terminology and even then just barely; I was lucky to try searching for information about 68881, learning terminology and finally finding a reference of Macintosh Toolbox calls) I managed to make Ghidra happy enough to produce a usable decompilation. After that it was just a question of re-using JPEG decoder with a slightly different header format, which finally made me do a refactoring of JPEG decoding.

And since I’ve factored out common JPEG decoding support for MoviePak, Radius Studio and actual motion JPEG decoder, I decided to add a video codec for effects used in proDAD Adorage video editor (its FOURCC is pDAD, quite surprisingly). This one could be reverse engineered just by looking at the frame data. Each frame consists of 20-byte header, JPEG image plus alpha channel coded as PNG or JPEG. Since I have not bothered to write a motion PNG decoder (it’s very uncommon after all and I’m not NIHing image library—at least not yet), I decode just the first part. Maybe I’ll add a PNG decoder support and re-visit this decoder for proper alpha support, but for now this seems unlikely.

I’ll keep working on some boring stuff nobody cares about but maybe I’ll have a codec or two to write about as well.

Some words about the oldest QuickTime

Thursday, February 19th, 2026

Since apparently discmaster.textfiles.com ran out of space (because storage is not cheap now thanks to “AI” companies), I have no new formats to be distracted with and have to resort to looking at the old ones.

As we all know, A**le single-handedly invented multimedia and the existence of IFF ANIM (that pre-dates it by a couple of years) should not confuse you. From what I could find, the earliest QuickTime samples can be found on Apple Reference & Presentation Library 8 CD-ROM that was intended for demonstration but not for the consumers. That disc actually contains three flavours of QT MOV: ordinary MOVs that can be decoded without problems, slightly older MOV that looks almost the same but has slightly different format (and features version -1 both in the header and in the metadata strings) and some extremely old files that do not correspond to the usual MOV structure. Those apparently come from the alpha version of QuickTime around year 1990 when it was still known by its code-name after USian artist of Ukrainian origin.

The first glaring difference is that atoms are not strictly nested but may contain some data beforehand. Essentially all atoms that normally have special header atoms have it before the rest. So instead of (pardon my S-expressions)

(moov (mvhd (trak tkhd (mdia mdhd (minf …))))

there is

(moov [moov header data] (trak [trak header data] (mdia [mdia header data] [all codec description data])))

Data structure is flatter too. In release QT version data for the individual tracks is grouped into chunks, so the header describes both the chunks and what tracks use what chunks (and what are frame sizes in the chunk). Here frames are stored as (offset, size) pairs.

From the accompanying decoder I can see it supported raw video, RLE and motion JPEG at the time—fancy vector quantisation codecs came later.

I hope to support it one day but there’s no hurry. Currently I’m more or less done improving NihAV tools and want to add some practical demuxers, muxers (and maybe even impractical encoders like for Indeo 4 or RedHat Ultimotion). And there’s na_eofdec still waiting to collect enough supported formats for its first release (tangential fun fact: recently I’ve added support for Electric Image format there which is mostly a simple RLE but its frames apparently have timestamps in floating-point format like 0.266 or 0.5).

P.S. I’ve also discovered MoviePak QT codec which looks like JPEG variant. Maybe one day when I’ll have nothing better to do I’ll look closer at it, but for now I have no desire to reverse engineer M68k code in unknown format.

More codecs to look at

Wednesday, February 11th, 2026

Here’s a list of video codecs I found with the help of discmaster.textfiles.com (mostly by looking for .inf files with “vidc” in them). Maybe I’ll look at them one day, maybe somebody will beat me to it. In either case, it’s a nice reminder for myself.

  • CSMX.dll—reportedly used for CSM0 codec, was bundled with some player, no known samples. Considering its small size it’s likely to be either raw YUV format or lossless codec;
  • d3dgearcodec.dll—D3DGear lossless codec;
  • elsaeqkx.dll—some Elsa quick codec;
  • Esdll.dll—bundled with the same player, the strings inside it hint on it being an unholy mix of ZIP and MELP-based speech codec;
  • ICS422.DRVS422 or SuperMatch YUV 4:2:2 codec (I suspect it’s raw YUV);
  • ICSMS0.DRVSMS0 or SuperMatch VideoSpigot codec (from SuperMac, later bought by Radius);
  • MCTCOD.DRV—MCTPlus Draw driver, supposedly offering 2:1 compression and has a bunch of FOURCCs registered to it: DRAW, MC16, MC24, MR16, MR24, MY16, MY24
  • MyFlashZip0.axMFZ0 lossless codec. Looks like ordinary deflate wrapper really;
  • NTCodec.dll—NewTek nt00 codec (I expect lossless or an intermediate codec). Looks like a simple fixed-fields packing;
  • RGBACodec.dll—apparently Lightworks lossless RGBA codec. Simple RLE inside;
  • Sx73p32.dll—Lucent Technologies SX7300P speech codec;
  • TRICODC.DRV—Trident draw codec with a bunch of compressed RGB and YUV formats: rtc3, rtc5, rtc6, ty0n, ty2d, ty2n, ty2c, r0y1… Compressed in this context means packed using fixed-length bitfields;
  • UCLZSS.DRV—Ulead LZSS codec aka uclz (obviously a simple lossless codec). As expected, it’s yuyv data compressed with LZSS;
  • UCYUVC.DRV—Ulead compressed YUV 411 aka yuvc. Apparently it’s just 8/12/16-bpp YUV (fixed packing scheme, and 16-bit is simply YUYV/YUY2 and such);
  • V422.DRV—Vitec Multimedia V422. Most likely simply raw YUV;
  • V655.DRV—Vitec Multimedia V655 (I’ll venture a guess this means 6-bit luma and 5-bit chroma instead of improbable YUV 6:5:5 subsampling);
  • VDCT.DRV—Vitec Multimedia VDCT.

Oh, and I’ve also found Eloquent elvid32.dll for EL02 FOURCC but it’s different since it has samples and it’s yet another H.263 rip-off.

That’s it for now, I’ll talk about FLAC video some other time.

Quicker look at QuickTime

Sunday, February 1st, 2026

Since I’ve done looking at game formats for a while (I’ve updated na_game_tool 0.5.0 release with the fixed extraction of old-style Cryo Interactive archives BTW), I decided to look at my backlog of unsupported (either mis-detected as audio-only or not detected at all) MOV files from discmaster instead. Of course the majority of them are files missing their resource fork (and majority of the rest are poorly-recognised data+resource forks in MacBinary II format; that reminds me I should improve support for it), and majority of the rest are files that I support already (like Duck Truemotion 1 or Eidos Escape codecs). And a good deal of the rest are Flash in MOV (not going to touch it). And yet there are a couple of files worth talking about.

  • 911602.MOV—remarkable for having JPEG frame embedded in QuickDraw stream. It made me finally write a decoder for it (but without JPEG support, maybe one day…);
  • a couple of .flm files turned out to have obfuscated moov atom (with something almost, but not quite, entirely unlike TEA if you care). Not something I’d like to support;
  • La88110mN1Hz2_20-07.mov—it turned out to be raw YUV420 codec;
  • Omni_324.mov—compressed apparently by MicroWavelet on NeXT. Intriguing but I doubt anything else can be found about that codec;
  • Music Chase .tmv—a very weird little-endian MOV format with custom video codec (I think I mentioned it before but I haven’t progressed since);
  • there was one Flip4Mac-produced sample (which name eludes me), but considering that it stores ASF packets inside MOV it’s not something anybody is eager to support;
  • some undecodeable SVQ3 files—apparently they are encrypted;
  • and finally there are some QT Sprite files, which seem to be (often deflate-compressed) frames containing several atoms with commands and occasionally image data as well. Sounds too hairy to support but again, maybe one day…

With AVI situation is somewhat better, there are some poorly formatted and encrypted/obfuscated files as well (plus damaged files from AOL archives that start at random position, so contents may be AVI file with some additional garbage in the beginning or missing few dozens kilobytes of initial data). Beside that I’ll probably implement PDQ2 decoder just for completeness sake. Eventually.

Some words about DMV format

Friday, November 14th, 2025

I said it before and I repeat it again: I like older formats for trying more ideas than modern ones. DMV was surprisingly full of unconventional approaches to video compression.

First of all, it uses fixed-size blocks with palette and image data occupying initial part of the block, and audio taking fixed size tail part of that block.

Then, there’s image compression itself: frame is split into 2×2 tiles that can be painted using 1-4 colours using pre-defined patterns, so only colours and pattern indices are transmitted. Those indices are further compressed with LZ77-like method that has only “emit literals” and “copy previous” (with source and destination allowed to overlap but with at least 4-byte gap for more efficient copying).

And even that is not all! I actually lied a bit because in reality it is 2x2x2 tiles since one pattern (using up to 8 colours) codes two 2×2 blocks in two consequent frames simultaneously. Now that’s an idea you don’t see every day. There are many codecs that code colours and patterns separately (like Smacker or System Shock video), some use a variation of LZ77 to compress data even further, but the only other codec I can name that coded two frames at once would be H.263+ with its PB-frames (and it was mostly interleaving data in order to avoid frame reordering, not sharing common coding data between frames).

Searching discmaster.textfiles.com for large unrecognised files returned more hits so I hope that some of the remaining formats will also feature some refreshingly crazy ideas. Now I only need time and mood for reverse engineering.

Quickly about VC2

Wednesday, November 12th, 2025

As I mentioned some time ago, Paul has shared a decoder for bytedanceVC2 codec. From the first glance it looked like H.266 rip-off. After a more thorough looking it seems to be more or less plain H.266.

There are some tell-tale signs of codecs using well-known technologies: simply listing strings in the binary gives you “VPS/SPS/PPS/APS malloc failed” (with APS being adaptation parameter set, which was not present in H.EVC but is present in H.266), “affine merge index is out of range” (again, affine motion was added in H.266), “lmcs_dec” (chroma-from-luma, another feature not picked up by H.265 but which was fine for H.266. Update: apparently it does something to the luma as well, still it’s H.266 thing), “alf_luma_num_filters_signalled_minus1” (a variable straight from the specification). I hope that this is convincing enough to claim that the codec is based on H.266 (or VVC).

Now what would a rip-off codec change? In my experience such codecs keep general structure but replace entropy coding (either completely or at least change codebooks) and DSP routines (e.g. for H.264 rip-offs it was common to replace plane intra prediction mode with something similar but not exactly the same; motion compensation routines are the other popular candidate).

I cannot say I got too deep into it but the overall decoding is rather straightforward to understand and from what I saw at least CABAC was left untouched: it’s exactly the same as in the standard and the model weights are also exactly the same (at least the initial part). Of course that does not preclude it having differences in DSP routines but I seriously doubt it being different from the standard (or some draft of it).

And with that I conclude my looking at it. Those with more motivation and actual content to decode are welcome to try decoding it, I have neither.