A small rant about compression

October 8th, 2025

The recent news about OpenZL made me think about a couple of tangential issues.

The approach by itself is nothing new really: a lot of archivers include a pre-processing step for data (I don’t know if there are earlier examples, but de-interleaving or delta-coding floating-point data might be only slightly younger than the geo file in the Calgary Corpus, LZX translates call addresses into absolute offsets for better compression and so on); more advanced archivers implement flexible processing steps (e.g. RAR had its own custom VM for pre-processing data, essentially a cut-down 8086 instruction set and a security nightmare, and ZPAQ allows defining compression steps for data-specific compression that won’t require a new decoder—in other words, something very similar to OpenZL). There’s nothing wrong with the approach and it’s probably useful beyond, say, genomic data compression; it’s just that it raises two questions: what is the currently accepted trade-off between general compression and resources, and what would be a good candidate for an open-source archiver?

The first question is obvious: as time goes by, the available CPU power and RAM grow along with the amount of data to compress. Back in the day gzip was the gold standard while bzip2 was something that ate too much RAM and worked rather slowly. A bit later .tar.bz2 started to replace .tgz for, say, distribution tarballs. Nowadays it’s .tar.xz or .tar.zst, which makes me wonder if that’s really the sweet spot for now or if things will move towards adopting a compression scheme that’s slower but offers a better compression ratio.

The second question follows from the first one: what would be a good candidate, specifically for open-source applications? If you look around, there are not that many of those. You can divide the existing formats (not to be confused with their implementations) into several (sometimes overlapping) categories:

  • proprietary formats with an official open-source decoder at best (like RAR) or an unofficial reverse-engineered one (e.g. RAD’s mythical sea creature formats and LZNIB);
  • open-source compression libraries targeting fast compression (LZO, LZ4, FLZ, LZF, etc, etc);
  • old open-source compressors (compress, gzip, bzip2, zip);
  • various programs trying to bank on a well-known name while not being related to it (bzip3 and anything with “zip” in its name, really);
  • state-of-the-art compressors that require insane amounts of CPU and RAM (anything PAQ-based, NNCP);
  • corporate-controlled open-source formats (brotli, Zstandard).

The question is what would make a good candidate for the next de-facto compression standard. The current widespread formats are good since they’re easy to implement and there are many independent implementations in various languages, but how much can we trust the next generation—the one with flexible input pre-processing? (The third question would be whether that’s really the design direction mainstream compression formats will take.)

For instance, I have nothing against LZMA, but considering that its author is russian, how much can we trust that he won’t be visited by FAPSI representatives and forced to make changes to the LZMA3 design that would make Jia Tan green with envy? As for the formats coming from corporations, are you really going to rely on their goodwill? I think the story with LZW should serve as a warning.

The only reassuring thing is that it is still rather easy to design a new compression scheme and even achieve a decent compression ratio and performance (unlike training a neural network or even designing a video codec to rival H.265), so good candidates are likely to appear sooner or later.

“AI” is not there to help you

October 2nd, 2025

I’m not writing this post to convince anybody; I’m writing it mostly to formulate my thoughts and so that I can refer to it later saying “called it”.

First of all, what do I have against AI and why is the first word of the title in quotes? Not much, actually, it’s just that what gets hyped as AI nowadays is far from it—hence the quotes. It can do something, sometimes it can even do it well—but in general it is far from being intelligent.

IMO it’s more accurate to call it artificial managers, since it does what your typical manager does: spew completely meaningless bullshit, take your work and reword it in corporate-speak, plagiarise somebody’s work and take credit for it. Also, maybe it’s acceptable for a typical USian not to ever learn, but normally a human is expected to keep learning and re-evaluating things throughout their whole life. Of course I’m no AI scientist (and so my opinion does not matter) but I believe that proper AI should have two feedback loops: an inner loop that controls what is being done, and an outer loop that adjusts knowledge based on new experience.

The inner feedback loop means that while executing the task you try to understand what you got, how it relates to the goal, and adjust what you’re doing if necessary. It’s like in the famous joke about the difference between physicists and mathematicians when asked to boil a kettle that is already full and on the stove: the physicist will simply strike a match and light the fire, the mathematician will take the kettle off the stove and pour the water out, thus reducing the task to the well-known one. The outer feedback loop means learning from experience. For example, LLMs apparently still make the same mistake as small children when answering which is larger, 4.9 or 4.71; unlike small children they don’t learn from it, so next time they will give the same answer or make the same mistake on some other numbers.

I reckon implementing both loops is feasible, even if the inner loop will require an order of magnitude more resources (for reverse engineering its own output, calculating some metric for the deviation from the goal and re-doing it if needed); the outer loop is much worse, since it would mean going over the knowledge base (model weights, whatever) and adjusting it (by reinforcing some parts and demoting or even deleting others).

So if I believe it can be improved, why do I claim it’s not helpful? What I’m saying is that, while in its current state it still may be useful for you, it is not being developed to make your life easier. It should be obvious that developing such a system takes an enormous effort—all the input data to collect and process, let alone R&D and learning control—so it’s something that can be done only by a large community or a large company (often stealing the results of the former). And companies do things not to advance human well-being but rather to get profit, “dishonestly, if we can; honestly if we must” (bonus points for recognising which sketch this quote is from). I consider the current situation to be a kind of arms race: somebody managed to convince somebody that AI will be the ultimate solution, so the company that gets the first practical solution will get an extreme advantage over its competitors—thus the current multi-billion budgets are spent mostly out of fear of missing out.

What follows from the fact that AI is being developed by large companies in pursuit of commercial interests? Only that its goal is not to provide a free service but rather to return the investments and make a profit. And the profit from replacing an expensive workforce is much higher (and more real) than what you might get from just offering some service to random users (especially if you do it for free). Hence the apt observation that “AI” takes over creative (i.e. highly-paid) work instead of house chores, while people would rather have it the other way round.

As a result, if things go the way the companies developing AI want, a lot of people will become rather superfluous. There will be no need for developers, and no need for people doing menial tasks like giving out information, performing moderation and such (we can observe that even now to a large extent). There will be no reason for those free-to-play games either, as non-paying players are there just to create a background for the whales (called so because they spend insane amounts of money on the game). Essentially the whole world will be like the Web of Bullshit, with people being rather a nuisance.

Of course this is just an attempt to model how events will develop based on incomplete data. Yet I remain an optimist and expect humanity to drive itself to an early grave before AI poses any serious threat.

New obscure formats

September 27th, 2025

Despite how it looks, I still monitor Discmaster for new additions in the hope that there’s something interesting there. Sometimes there is, and I can either postpone it for later or actually take a look and try to figure out how it works. Here’s a list of stuff I looked at and found at least somewhat interesting:

  • a beta version of the VfW SDK contained a special AVI file that has a different structure and apparently can contain only a single stream. I added support for it to NihAV just for completeness’ sake;
  • ReVoice Studio discs contain some AVD files that are really AVI files. The problem is that those files seem to employ the Indeo content protection feature and require an access key to decrypt the data. For rather obvious reasons it’s not something I’m willing to pursue further;
  • some Licensed Cartoon Property Activity Center discs contain videos that use the ARBC codec. I looked at it a long time ago at Paul’s request, and I remember he wrote a decoder for it. But it turned out that there’s a version of the codec used in MOV—with the 16-bit values being big-endian now. So I implemented a decoder for both codec flavours, again just for completeness’ sake;
  • Video Toaster 1.0 (now for Windows; who cares about the Amiga system-seller?) had some samples in RTV format. It turned out to be uncompressed interlaced video in a packed format. I’ve implemented a decoder for it in na_eofdec;
  • speaking of Amiga, there’s a game called Golem with animations in XFL format (which are just raw frames in per-bitplane format). Those are not too interesting to support, but there’s also a stand-alone video player featuring some game footage, and its XFL has a proper format, with audio and palettes. So I supported it in na_eofdec (since it’s not strictly a game format).

There are at least a dozen other formats that I found by searching for large unknown files, so currently there’s enough work waiting for me (maybe I’ll actually do something about it eventually).

na_eofdec finally started

September 14th, 2025

While librempeg does its awesome stuff, I’ve finally started working on na_eofdec, a new tool for decoding exotic/obscure formats (hence the name). It is intended to decode fringe formats that I find interesting enough to write a decoder for but not useful enough to be included into the main NihAV (maybe I’ll move some formats from there to this tool, as I did previously with some game formats and na_game_tool).

And while it is based on na_game_tool and will keep its interface, there’s one major technical difference under the hood: game formats are expected to produce constant-rate content (always the same number of frames per second and audio blocks of equal size), while these formats are allowed to have variable framerate and audio block length. Currently this affects only the AVI writer (which I modified to handle synchronisation, frame duplication and splitting the audio input into blocks of equal length), but in the future I hope to write a MOV muxer to handle such inputs natively.
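To give an idea of the frame duplication part, here’s a minimal sketch (not the actual na_eofdec code; the function names and the rounding policy are my own choice) of fitting variable-framerate input onto a constant-rate grid: each decoded frame is written as many times as needed to cover the span until the next frame, and frames that fall between grid positions simply get dropped.

```rust
// A sketch of fitting variable-framerate input into a constant-rate AVI:
// map each frame's timestamp onto the fixed output grid and duplicate
// (or drop) frames so every grid position is covered exactly once.
fn frames_to_emit(cur_ts: f64, next_ts: f64, out_fps: f64) -> usize {
    let first = (cur_ts * out_fps).round() as i64;  // output index of this frame
    let last = (next_ts * out_fps).round() as i64;  // output index of the next one
    (last - first).max(0) as usize // 0 means the frame is dropped entirely
}

fn main() {
    // hypothetical variable-rate timestamps (in seconds) written as 25 fps output
    let timestamps = [0.0, 0.1, 0.5, 0.52, 1.0];
    for pair in timestamps.windows(2) {
        let n = frames_to_emit(pair[0], pair[1], 25.0);
        println!("frame at {:.2}s written {} time(s)", pair[0], n);
    }
}
```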

Of course such a tool is useless without decoders, so I’ve added a pair of them for the Lantern MOV formats. These are two RLE-based animation formats using IFF or RIFF structure (little-endian in either case). There are more candidates out there, like all those IFF-based formats. As usual, the first release will happen when I implement at least a dozen original decoders, so it will take a while.
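The container side follows the usual IFF/RIFF chunk layout (a four-byte tag, a 32-bit size and an even-padded payload), so the demuxing part boils down to a chunk walker like the sketch below; the actual Lantern chunk tags and the RLE payload layout are not shown, this only illustrates the structure.

```rust
// A minimal chunk walker for an IFF/RIFF-structured file with little-endian
// chunk sizes (as both Lantern flavours reportedly use); per-chunk payload
// handling is format-specific and omitted here.
use std::io::{self, Read, Seek, SeekFrom};

fn walk_chunks<R: Read + Seek>(src: &mut R) -> io::Result<()> {
    let mut hdr = [0u8; 8];
    while src.read_exact(&mut hdr).is_ok() {
        let tag = [hdr[0], hdr[1], hdr[2], hdr[3]];
        let size = u32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]) as i64;
        println!("chunk {}: {} bytes", String::from_utf8_lossy(&tag), size);
        // IFF/RIFF chunk payloads are padded to an even length
        src.seek(SeekFrom::Current(size + (size & 1)))?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut file = std::fs::File::open("sample.mov")?; // hypothetical input file
    walk_chunks(&mut file)
}
```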

Proto-Indeo revisited

September 6th, 2025

In my last post I mentioned the DVI family of formats and I decided to extend NihAV support a bit. Previously I implemented YULELOG.AVS demuxing and decoding and stopped at that, but apparently there are six more samples that can be found with discmaster.textfiles.com (fun fact: SAMPLE.AVP is not detected as AVSS, and out of its four instances three are unknown and one got its embedded JPEG file decoded).

There are certain difficulties in extending support beyond the original file: a good deal of the samples have an AVSS format slightly different from the open specification, the AVS2AVI.EXE converter refuses to convert all but two files (saying that it does not know the algorithm used to compress them), and the other available tools seem to rely on the ActionMedia card, so you can’t do much without it.

So here’s the list of all known samples with some notes about them:

  1. AUDM400.AVS—a single audio track that uses “dvaud44” audio compression, which is some variation of DVI ADPCM. I have a suspicion that its audio packets are interleaved by channel (i.e. audio packet 0 is left channel data, packet 1 is right channel data, packet 2 is left channel data again) but I’m not going to introduce some horrible hacks to assemble audio data in this case;
  2. NWSAMP.AVS—PLV video with each component in its separate stream. Since there’s no specification available at all, I can only speculate that it employs delta compression akin to YVU9C or even something closer to Indeo 3;
  3. REEL400.AVS—RTV2 video with an empty audio track;
  4. SAMPLE.AVP—AVSS “image” format (you don’t think WebP was the first such format, do you?) with a stream containing a single JPEG frame. In YUV410 format too, so I had to modify my decoder to handle it (along with fixing the case when planes are sent in separate scans);
  5. SAMPLE.AVS—RTV2 video with (silent) DVI ADPCM audio. Initially the video stream could not be decoded until I discovered that it uses custom codes (which change between frames; and this is apparently an older version of RTV2 too). Now it works;
  6. video.avs—RTV2 video with DVI ADPCM audio;
  7. YULELOG.AVS—single RTV2 stream.

And since I’ve mentioned custom RTV2 codes, here’s how they work: the codes have a certain structure and a fixed symbol mapping (for all 143 symbols), so there’s a compact way to describe them. Each RTV2/Indeo 2 frame has eight bytes in the header with the code description. Codes consist of two parts: a unary prefix and a fixed-size part, with the code descriptor providing the size of the fixed part for each prefix. So e.g. code description 2,3,3 will map to 0xx, 10xxx, 110xxx codes while description 4,1,2 will map to 0xxxx, 10x, 110xx codes. It’s not the most effective coding scheme but it takes little space and it’s easy to implement fast decoding (you can use pre-computed look-up tables and just calculate which range of codes corresponds to which prefix). The scheme got employed again in Indeo 4 and 5.
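Here’s a minimal sketch of decoding one symbol with such a descriptor (this is not the NihAV code; the bit reading order and the treatment of the last prefix are my guesses, so treat it as an illustration of the idea rather than the exact format):

```rust
// Decoding one symbol from a descriptor-driven code: a unary prefix selects a
// descriptor entry, which gives the number of fixed bits to read; symbol
// indices are assigned consecutively, so descriptor [2, 3, 3] yields
// 0xx -> symbols 0..4, 10xxx -> symbols 4..12, 110xxx -> symbols 12..20.
struct BitReader<'a> {
    src: &'a [u8],
    pos: usize, // bit position
}

impl<'a> BitReader<'a> {
    fn new(src: &'a [u8]) -> Self { Self { src, pos: 0 } }
    fn read_bit(&mut self) -> Option<u8> {
        let byte = *self.src.get(self.pos >> 3)?;
        let bit = (byte >> (7 - (self.pos & 7))) & 1; // MSB-first (an assumption)
        self.pos += 1;
        Some(bit)
    }
    fn read_bits(&mut self, nbits: u8) -> Option<u16> {
        let mut val = 0u16;
        for _ in 0..nbits {
            val = (val << 1) | u16::from(self.read_bit()?);
        }
        Some(val)
    }
}

/// Decode one symbol index using the per-frame descriptor of fixed-part sizes.
fn decode_symbol(br: &mut BitReader, descriptor: &[u8]) -> Option<u16> {
    let mut base = 0u16; // index of the first symbol covered by the current prefix
    for &fixed_bits in descriptor {
        if br.read_bit()? == 0 {
            // prefix terminated: read the fixed part and add the symbol base
            return Some(base + br.read_bits(fixed_bits)?);
        }
        base += 1u16 << fixed_bits;
    }
    None // ran out of descriptor entries (invalid stream for this sketch)
}
```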

This concludes my explorations in DVI/Indeo formats (because I don’t expect more information to resurface). There are still more formats to look at though.

First Indeo codecs

August 30th, 2025

Recently I’ve posted a short review of DPCM-based video codecs where Indeo 2 and 3 were mentioned, but what about Indeo 1?

Previously I believed that it was their raw format aka IF09 (YVU 4:1:0 with 7 bits per component), but recently I’ve discovered a codec called Indeo YVU9 Compressed, which kinda fills the gap between raw video and the comparatively complex Indeo 2 (which employs not merely delta coding but also vector quantisation and zero-run coding).

This format codes intra-only frames plane by plane with simple delta prediction and fixed Huffman codes for small deltas plus an escape value (which means a full 8-bit value should be read instead). In other words, a perfect initial DPCM-based codec which can be improved in different ways.
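To show how little is needed for such a codec, here’s a sketch of decoding one plane; the code reader is a stand-in for the real fixed Huffman table, and the predictor choice (left neighbour, falling back to the row above for the first column) is my guess rather than the documented behaviour:

```rust
// A sketch of plane-by-plane DPCM decoding with an escape value: each code is
// either a small delta added to the prediction or an escape carrying a full
// 8-bit value. The predictor choice here is an assumption for illustration.
enum Code {
    Delta(i8),  // small signed delta from the fixed code table
    Escape(u8), // escape: a raw 8-bit value replaces the prediction
}

/// `next_code` stands in for the real fixed-Huffman bitstream reader.
fn decode_plane<F: FnMut() -> Code>(plane: &mut [u8], width: usize, height: usize, mut next_code: F) {
    for y in 0..height {
        for x in 0..width {
            let pred = if x > 0 {
                plane[y * width + x - 1] // left neighbour
            } else if y > 0 {
                plane[(y - 1) * width]   // first pixel of the previous row
            } else {
                0x80                      // neutral value for the very first pixel
            };
            plane[y * width + x] = match next_code() {
                Code::Delta(d) => pred.wrapping_add(d as u8),
                Code::Escape(v) => v,
            };
        }
    }
}
```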

I cannot tell whether this codec really deserves to be called Indeo 1 (relegating IF09 to Indeo 0) or whether it’s some simplification of Indeo 2 that came later. As you know, Indeo codecs come from DVI (no, not the display interface) and they had different names. From what I can tell there were three video codec families there: RTV (real-time video), PLV (production-level video, not as fast) and PIC (whatever that is). RTV2 is now known as Indeo 2 but it’s hard to tell which one was Indeo 1 (if there was any) or YVU9C. What’s worse is that there are next to no software specifications for DVI, since you were supposed to use special cards with an Intel chipset to encode and decode it.

In either case, it’s yet another codec reverse engineered.

A quick glance at another bunch of codecs

August 23rd, 2025

Since I can’t do anything but look at various codecs, I did exactly that. So here are some details about codecs nobody cares about.

First, I looked at a video codec used in videos (movies and TV series) for a certain hand-held console. Despite it coming from Majesco, the video data starts with a VXGB magic, reminiscent of a certain other codec for a slightly newer hand-held console whose data starts with VXDS. Structurally it’s very close to it as well, being a simplified H.264 rip-off. My REing efforts were thwarted by the organisation of the binary specification: while the code is supposed to reside in the data segment, it constantly calls functions from the fast RAM area with no apparent place where they are initialised. I expect some of that code to be duplicated there for performance reasons, but I haven’t found the place where that copying is performed. Oh well, nobody cares about the format anyway, why should I be an exception?

Then there’s a whole family of Pixar codecs. The Toy Story game made by them relied on a bunch of QuickTime codecs, also made by them. There are decoders provided for the pix0–pix7 and pixA codecs, while the game itself seems to have content only in pix0, pix3, pix4, pix5, pix7 and pixA formats. The binary specification causes Ghidra decompilation failures (mostly in the functions responsible for the decoding), so I could figure something out and something is left as an exercise to a masochist the reader.

All codecs are paletted and most of them operate on 4×4 tiles. Pixar codecs 0 and 4 are actually raw formats (with data re-arranged into 4×4 tiles). Codecs 3 and 5 are similar to each other: they maintain a list of tiles (transmitted at the beginning of the frame; a frame can update some tiles in the list) and the image data is coded as a series of opcodes meaning “draw tile number N”, “leave the next N tiles unchanged” or “restore the next N tiles from the background image” (that image is stored in some other file, likely compressed with codec 0 or 4). Codec 7 seems to employ static Huffman coding (and I don’t know much beyond that fact). Codec A looks like some kind of RLE, but I may be wrong.
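Here’s a rough sketch of how the codec 3/5 tile scheme could be applied once the opcodes are parsed; the opcode representation is invented (I don’t know the actual bitstream layout), only the three operations and the 4×4 raster-order tiling follow what I described above:

```rust
// A sketch of applying tile-list opcodes in raster order; the Op enum is a
// placeholder for whatever encoding the real bitstream uses.
const TILE_W: usize = 4;
const TILE_H: usize = 4;

enum Op {
    Draw(usize),    // draw tile number N from the tile list
    Skip(usize),    // leave the next N tiles unchanged
    Restore(usize), // restore the next N tiles from the background image
}

/// Offset of the top-left pixel of tile `pos` in a frame `width` pixels wide.
fn tile_origin(width: usize, pos: usize) -> usize {
    let tiles_per_row = width / TILE_W;
    (pos / tiles_per_row) * TILE_H * width + (pos % tiles_per_row) * TILE_W
}

fn apply_ops(
    frame: &mut [u8],                // paletted output frame, one byte per pixel
    width: usize,                    // frame width, assumed to be a multiple of 4
    background: &[u8],               // decoded background image (codec 0/4 data)
    tiles: &[[u8; TILE_W * TILE_H]], // tile list sent at the start of the frame
    ops: &[Op],
) {
    let mut pos = 0usize; // current tile position in raster order
    for op in ops {
        match *op {
            Op::Draw(n) => {
                let origin = tile_origin(width, pos);
                for row in 0..TILE_H {
                    let off = origin + row * width;
                    frame[off..off + TILE_W]
                        .copy_from_slice(&tiles[n][row * TILE_W..(row + 1) * TILE_W]);
                }
                pos += 1;
            }
            Op::Skip(n) => pos += n, // keep whatever the previous frame had there
            Op::Restore(n) => {
                for _ in 0..n {
                    let origin = tile_origin(width, pos);
                    for row in 0..TILE_H {
                        let off = origin + row * width;
                        frame[off..off + TILE_W].copy_from_slice(&background[off..off + TILE_W]);
                    }
                    pos += 1;
                }
            }
        }
    }
}
```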

P.S. I also started some code re-organisation and improvement. For example, I finally got rid of the ByteReader/ByteWriter wrappers over I/O objects so there’s less boilerplate code—but unfortunately I’ll need to convert the existing codebase to the new way. I’ve done that for the main NihAV repositories but na_game_tool is not updated yet. And I fear I’ll need to waste some time fixing and extending my MPEG-4 ASP decoder (so it can play all videos from my collection). All this leaves not much time for researching (very) old codecs.

When video DPCM codecs roamed the Earth

August 16th, 2025

Back in the mid-1990s there was a situation where video on computers was slowly getting into demand while the processing power was still very limited. Old paletted video formats were slowly going away (Smacker was still holding strong though), and for hi-colour video RLE was not a good choice in terms of compression while DCT was not a good choice in terms of required CPU cycles. Thus the niche was partially filled by block-based vector quantisation codecs (like Cinepak) and various codecs that compressed the difference between the previous and current pixel in some way (hence the DPCM name, even if the term is more related to audio compression).

So today I’m going to give a review of these codecs and how Motion Pixels fit in.

MVI2: done

August 14th, 2025

I’m almost done with Motion Pixels at last. Of course I skipped implementing some exotic modes, but at least the files I could find play fine and don’t complain about missing modes. I just need to put the finishing touches on it and commit it all, probably on Saturday.

The next post should be dedicated to the intricate details of the codec(s) and a comparison with its better-known competitors with similar design (Truemotion 1/RT/2/2X and Indeo 2/3), but for now all I need to say is that frames may be coded in several modes (RGB or YUV with one chroma sample per 2×1, 2×2, 4×2 or 4×4 block), some parts of a frame may use low-resolution delta coding (with its own peculiarities depending on the line number and sampling mode); and since that was not enough, they’ve added a smooth delta coding mode (which also has its own peculiarities in low-resolution coding mode). And of course there’s a single-field coding mode. And some features seem to be duplicated using different flags. Since I’ve not found any samples for most of them, I simply implemented the basic modes: the 4×4 YUV mode with lowres and all YUV modes with optional lowres and smooth delta coding (since MovieCD samples seem to exercise them all).

The best part is that nobody cares. NihAV can’t be interfaced with MPlayer easily, and discmaster.textfiles.com is not likely to change anything (e.g. files here are recognised as the aviAudio type despite having a video track and nihav-encoder being able to decode it just fine. Or BATH06.MOV—no link since it’s the only one in the database—which can be converted with the same tool but isn’t even recognised as QT MOV. So I don’t expect that MVI1/2 files will get a video track either.) And I never was Aware caring about the codec much, not having any content coded with it for starters.

Anyway, this waste of time is over, so what’s next? While searching for the samples I’ve found a couple of other MVI formats that may be good candidates for na_game_tool. There is a lot of janitorial work for NihAV as well (for example, I want to rewrite the AVI demuxer—it’s one of the first pieces of code I implemented for the project, and now I see that some things could’ve been done differently and better). And I’ve finally decided on a name for the new tool: na_eofdec (NihAV exotic/obscure formats decoder). Now all that is left is to RE and implement enough formats for a release of both of those tools.

Don’t expect any of this to happen soon though, I am lazy and work on it only when I’m in the mood. For example, this post might’ve been about why wavelet compression for video (and images) inherently sucks—but I still haven’t got into a proper mood for writing it.

MVI2: some news

August 8th, 2025

First of all, here’s some information for context: MVI codecs rely on out-of-band flags to signal what capabilities and subsampling they use (the fact that they decided to store those flags in the FOURCC is a different annoyance); and despite the potential variety, only a couple of flag combinations are used for each codec. For instance, of all the MVI1 files I saw, only one flag has ever been in use (golden frame—and only in one game). MVI2 has two distinct sets of flag combinations, 0x804 and 0x200. The former means bog-standard MVI coding (with one chroma sample set per 4×4 block) plus one extension feature; the latter means MVI2 version 2 (if that makes any sense), where they decided to make subsampling and features selectable per frame (as well as adding more of them) and moved them to the frame header while at it.

So far I’ve concentrated my efforts on format 0x804 to see what that extension feature is. It turned out to be low-resolution deltas, just like in Truemotion 2. In this mode every odd pixel is coded as the previous pixel plus half of the luma delta for the next pixel. I still have to make the reconstruction run properly, but that’s nothing a lot of debugging can’t fix.
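To illustrate the idea (with my own naming and without the exact rounding and clamping rules, which may well differ from the real thing), reconstructing one luma row in this mode could look like this:

```rust
// A sketch of low-resolution delta reconstruction: deltas are stored only for
// every other pixel, and each odd pixel is the previous pixel plus half of the
// delta that the following (even) pixel receives. Rounding and clamping here
// are guesses.
fn reconstruct_lowres(first: i32, deltas: &[i32]) -> Vec<u8> {
    let mut row = Vec::with_capacity(deltas.len() * 2 + 1);
    let mut prev = first;
    row.push(prev.clamp(0, 255) as u8);
    for &d in deltas {
        row.push((prev + d / 2).clamp(0, 255) as u8); // interpolated odd pixel
        prev += d;                                    // even pixel gets the full delta
        row.push(prev.clamp(0, 255) as u8);
    }
    row
}
```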

This should allow me to decode even some of the MovieCD samples (including the one hidden in samples.mplayerhq.hu/drivers32 for some reason), and I’ve seen quite recognisable frames already.

It’s hard to tell what features the other flavour uses but it’s reasonable to assume that it uses lowres coding as well. Hopefully I’ll get to it soon.

Update from Saturday: after dealing with the annoyance of a different delta coding scheme for each line type, I can now decode the few files I could find (including a couple of MovieCDs from archive.org) just fine. The second half seems to use alternative composing/rendering functions and reads the maps differently as well. So while it’ll take more time, at least I’m closer to completion.