Archive for the ‘Various Video Codecs’ Category

MVI1: done

Sunday, August 3rd, 2025

In the last post I wrote about how I’ve managed to reconstruct a recognizable picture for the MVI1 codec. After I fixed the prediction code it started to work properly. Surprisingly, the Treasure Quest game proved to be a source of MVI1 files in all formats (RGB, YUV422, YUV420, YUV410 and YUV4½0; the last one has one set of chroma samples per 4×4 block and is the most common MVI format in general). Additionally it has MVI1 samples with the golden frame feature (I named it after a feature in a family of competing codecs that started with a rather similar coding approach): frame 0 is two intra frames with the second frame serving as the background for the other frames; there is an additional map mode which tells that certain rectangles should be copied from the golden frame (instead of the previous frame or being filled with one colour). MVI2 seems to have an extension of that mode but I’ll see about it when I get to it (and if I obtain samples using that mode).

So, MVI2 next. Considering the number of extensions they added (and how they interfere with frame reconstruction) it’s probably not going to be easy but now I have a base to extend instead of blind guesses to make.

Motion Pixels: breakthrough!

Friday, August 1st, 2025

As I mentioned last month, I decided to reverse engineer Motion Pixels codecs and after a lot of failed attempts to make the decoder work I’ve finally got something.

First of all, here are two frames from different videos.

MVITEST.AVI:

And a frame from SHOT09.AVI (from Apollo 18 game):

As you can see, the images are not perfect but recognizable already. And just last week it would’ve been a mess of colours with some barely recognizable distorted shapes.

The reason for this is that while MVI1 (and MVI2) is indeed based on the stand-alone MVI format (I call it MVI0 for clarity), there are some nuances. At first glance MVI0 and MVI1 are the same: the decoding steps are identical, and indeed you can use the same code to decode data from either, but the reconstruction steps differ significantly.

Essentially there are four steps there: decode rectangles defining which parts of the frame will be left intact or filled with one colour, decode deltas used to reconstruct the rest of pixels, use those deltas to generate predictors for each line (where needed), use the rest of deltas to reconstruct the rest of pixels. Additionally MVI employs chroma subsampling mode so only one pixel in 1×1 to 4×4 block (depending on mode) has delta differences applied to chroma, all other pixels update only luma component. So if you don’t do it correctly you may end up applying e.g. deltas intended for luma to chroma components and vice versa. That’s what I got for a long time and could not understand why.

It turns out that vertical prediction has its pixel sampling at different position—or maybe it’s scrambled in the same way as line prediction. There for the most common mode (one set of chroma components per 4×4 block) each group of four lines is decoded in reverse order (i.e. 3, 2, 1, 0, 7, 6, 5, 4, …). For 2×2 block only lines in pairs are reversed. You can see artefacts of wrong prediction on Apollo frame.
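The line ordering described above can be sketched in a few lines of Python. This is only an illustration of the reversal pattern, not code from the decoder: the function name `line_decode_order` is hypothetical, and the way a partial group at the bottom of the frame is handled is an assumption.

```python
def line_decode_order(height, group):
    """Return line indices with every run of `group` lines reversed.

    `group` is the chroma block height: 4 for the 4x4 mode, 2 for 2x2.
    A partial last group is simply reversed as well (an assumption).
    """
    order = []
    for start in range(0, height, group):
        end = min(start + group, height)
        # Walk the current group from its last line back to its first.
        order.extend(range(end - 1, start - 1, -1))
    return order


print(line_decode_order(8, 4))  # [3, 2, 1, 0, 7, 6, 5, 4]
print(line_decode_order(6, 2))  # [1, 0, 3, 2, 5, 4]
```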

Anyway, having a recognisable picture means that the hardest part (for MVI1) is done, so all that is left now is to fix the remaining bugs, refactor the code and move to MVI2. There are other annoying things there but now I know how to deal with them.

BTW, if you’re curious why it takes so long, the problem is the binary specification being obfuscated to the point that Ghidra refuses to decompile most of MVI1 decoder functions and can’t do much about reconstruction functions since they’re a mess of spaghetti code (probably written in assembly language directly) so it’s more of a state machine than a decoding loop. And they abuse segment registers to access different parts of the context (and this must be the reason why it cannot work under OSes from this millennium). I got some progress when I resorted to debugging this mess by running the MVI2 player in OllyDbg under Win95 (emulated in DosBox-X) and constantly referring to Ghidra to see where to put a breakpoint to trace a certain function. That process is definitely not fun for me but it gave results.

Overall, probably it could’ve gone better but I hope the rest won’t take as long.

Random NihAV news

Thursday, July 24th, 2025

Since I have not tweaked any weights and have not made any releases, I’ll just write about some stuff I’ve been working on but have not released yet. Meanwhile librempeg got support for a bunch of new formats too so its changelog may be a more interesting read. Anyway, this post is about what I have (and haven’t) done.

First of all, I’ve finally fixed an annoying problem with VA-API decoding on one of my laptops. Counterintuitively, it turned out to be faster to request hardware to convert native surface into some other format (NV12 into YUV420) and then use it instead. This made decoder CPU usage drop under 10% at last. Probably it can be optimised further to reduce load on graphics subsystem but I’d rather not mess with OpenGL unless it’s really really really needed.

Then I extended support for two formats in na_game_tool. VDX (used in The 7th Guest) had a different format version for the game demo. It still employs two-colour VQ but data for intra frames is split into separate parts for masks and colours, and inter frames code updates to masks and/or colours for each block instead of independent decoding. Additionally thanks to Discmaster I’ve located DPEG version 2 which employs a completely different algorithm from version 3 (painting 4×4/2×2/1×1 squares for intra and skip/update for inter).
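For readers unfamiliar with two-colour VQ as mentioned for VDX, here is a minimal Python sketch of the general idea: each block stores a bit mask plus two colours, and every mask bit selects one of them. The 4×4 block size, the 16-bit mask, the bit order (most significant bit first, row by row) and the function name are all assumptions for illustration and are not taken from the VDX demo format.

```python
def paint_two_colour_block(mask, colour0, colour1, size=4):
    """Expand a bit mask and two colours into a size x size block.

    Assumed layout: the top-left pixel is the most significant bit,
    pixels follow in raster order; a set bit selects `colour1`.
    """
    rows = []
    bit = size * size - 1
    for _ in range(size):
        row = []
        for _ in range(size):
            row.append(colour1 if (mask >> bit) & 1 else colour0)
            bit -= 1
        rows.append(row)
    return rows


# Only the top-left and bottom-right pixels use the second colour here.
for row in paint_two_colour_block(0x8001, 0, 9):
    print(row)
```

With masks and colours kept in separate parts (as in the demo version), an inter frame can update just the mask or just the colours of a block and repaint it with the same routine.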

I’ve also discovered some new interesting formats like Lantern MOV (which codes DIB changes using 16-bit RLE and there’s a probably related older version in IFF instead of RIFF). I’m considering making a sister project to na_game_tool to decode various formats like this one, formats coming from Amiga, recording formats and such—for all the formats that I’d like to try decoding but don’t want in main NihAV. I’ll write about it when I actually have something to write about (i.e. when I have a name and enough formats for 0.1.0 release). Another curious find was fractal video codec—not the ClearVideo but something with fourcc FVF1 from Images Incorporated. Who knows, it may be interesting to RE.

And finally here’s what I really wasted too much time on: Motion Pixels decoders. It has a rather annoying binary specification (like using segment registers to address decoder context variables) that decompilers refuse to translate and from what I heard it’s impossible to run on anything newer than Windows 95 or NT4. Nevertheless the formats pose some interest.

From what I saw long time ago, MVI2 is MVI1 with certain extensions, and MVI1 is surprisingly close in the structure to MVI in its own format files—and Gregory Montoir has reverse engineered it long time ago.

So I started by reimplementing that MVI decoder (since I can debug its behaviour against known working implementation) while trying to understand what it does. I got it more or less working (reconstruction is still not perfect but at least it’s recognizable) and my decoder supports other files (found with Discmaster of course) that trigger demuxer bugs or have different subsampling modes.

Then I moved to implementing MVI1 decoder applying the differences found in the binary specification. While it still does not handle decoding properly (both the pictures are garbled and I don’t use all deltas stored in the frame), at least it proves I’m on the right way. Hopefully it’ll decode properly soon and then I can add MVI2 features. Of course it’s a useless format nobody cares about, but apparently I do.

NihAV: now with TealMovie support

Wednesday, June 11th, 2025

Back in the day I looked at the format and recently, to distract myself from game formats, I decided that it might be not the worst idea to implement decoding it.

And in the course of doing that I discovered some things that make it even more peculiar. For starters, it flips every second sample in its ADPCM coding. I don’t know if it improves compression in this particular case or it was done just to be different. Similarly split sub- or sub-sub-blocks are coded in コ-order instead of more traditional zigzag order.

But there are more interesting things about it. For starters, the file is organised into blocks instead of frames. First block always contains metadata (streams parameters, title, creator and such), next blocks contain one or more video frames (which you have to decode one after another; I implemented frame parsing for finding out frame boundaries but that’s inelegant solution), and last blocks are used to store audio. This means demuxer either has to demux audio frames after all video frames are sent or jump places in order to maintain synchronisation. Since this is not na_game_tool, I picked the former. The samples are short, so it’s easier to decode them to AVI+WAV and remux properly (or decode both streams to AVI and make AVI demuxer handle unsynchronised streams better—but that’s a task for another day).

Another surprising thing is that there is 16-bit RGB support, done in a very peculiar way. Frame decoding remains the same, except that now frame data is actually a pseudo-YUV frame with two chroma planes following the luma plane. And of course the conversion is done using one of two tables (depending on file version) using the formula yuv2rgbtab[(u + v) * 128 + y]. I guess it’s coding luma, colour difference and colour difference difference here.
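The lookup formula is easy to try out in Python. In this sketch only the index expression `(u + v) * 128 + y` comes from the format; the function names, the assumption that all three planes have one sample per pixel, and the placeholder table are made up for illustration (the real tables depend on the file version and are not reproduced here).

```python
def lookup_rgb16(table, y, u, v):
    """Map one pseudo-YUV sample triple to a 16-bit RGB value."""
    return table[(u + v) * 128 + y]


def convert_planes(table, y_plane, u_plane, v_plane):
    """Convert three equally sized planes (an assumption) to RGB16."""
    return [lookup_rgb16(table, y, u, v)
            for y, u, v in zip(y_plane, u_plane, v_plane)]


# Placeholder table that returns its own index, handy for checking
# which entry a given sample triple ends up selecting.
dummy_table = list(range(64 * 128))

print(lookup_rgb16(dummy_table, 5, 2, 3))                    # 645
print(convert_planes(dummy_table, [1, 2], [0, 1], [0, 1]))   # [1, 258]
```

Note that only the sum of the two chroma samples enters the index, which matches the guess above that the planes are really a colour difference and a difference of differences.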

And finally, intra frames in TealMovie are stored raw. But when frame width exceeds 160, it is stored half-size.

That’s why I’m looking at those old formats: there’s significantly more variety there in employed coding methods and storage format nuances. Not all of them make much sense but sometimes they’re entertaining or so original that it makes you wonder why such approaches got no further development.

P.S. Maybe I should take another look at the handheld console video formats.

P.P.S. But I think I’ll have to do some boring things instead. With BaidUTube changing its available formats it seems I finally need my own MP4 muxer. In either case that’s easier than to fix libav.

A tale of three formats

Saturday, April 19th, 2025

Since I have nothing better to do, I keep looking at the odd formats here and there and occasionally do something about them. Here are three formats I took a look at recently and that have nothing in common beside being video codecs.

CGDI

This would be a “capture” codec (aka Camcorder Video) if not for the fact that it records events rather than actual image data. I had a suspicion right from the start that it’s organised in the same way as WMF/EMF (opcodes with parameters that invoke the GDI subsystem to draw the actual image) and finally I’ve decided to check that.

Of course it turned out to be true. There are about 64 opcodes in total, some are for drawing things, some are for window management stuff, and some are for GDI state management (e.g. create or delete brush, pen, font and such).

Since implementing a decoder for it would mean replicating a good deal of Windows graphics subsystem (even if you can borrow code from Wine), I consider it completely impractical and merely improved codec documentation in The Wiki.

TCA

This is an animation format used on Acorn platform. Actually it has three layers of encapsulation: there’s raw TCA format that contains only the header and video frames, then there’s TCA wrapped in ACEF chunk with an optional SOUN chunk following it (no points for guessing what it contains), and finally there’s that format put inside ARMovie (as a single block).

I added its support to NihAV just for completeness’ sake. Of course not all of the different flavours are supported (video is mostly just plain LZW but it has some alternative coding mode and an uncompressed alternative, audio is IMA ADPCM but sometimes it’s not, without any reliable way to distinguish which is which). And it looks like some animations may have variable frame rate (with DIR1 subchunk likely telling frame durations). All the details are there, in raw ARM binaries and semi-compiled BBC BASIC code, but I’m satisfied that it works at least for a couple of random plane samples I tried and have no desire to try supporting every known sample in existence.

Savage Warriors ANM

This one is a curious format. I’ve managed to locate decoding functions in one of the overlay files, it looked reasonable (LZ77 compression for intra frames and something a lot like FLI delta frame compression for the rest) but the decoder did not work properly. Curiously, demo version contains some of the same animations as the full game but in slightly different format (the initial magic is missing); after comparing them I found out that the release version uses a weird format with a 32-bit value inserted after each kilobyte of data. I ended up implementing my own buffered reader that loads those kilobyte blocks and skips those additional words for the release version.
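A reader of that kind could look roughly like the following Python sketch. It is not the na_game_tool implementation: the class name is hypothetical, the inserted word is simply discarded, and it assumes the extra word follows only complete 1024-byte blocks (what happens after a short final block is not stated above).

```python
import io


class BlockSkippingReader:
    """Read payload bytes while dropping the word after each block."""

    def __init__(self, src, block_size=1024, extra_size=4):
        self.src = src
        self.block_size = block_size
        self.extra_size = extra_size
        self.buf = b""

    def read(self, n):
        # Refill the buffer block by block until enough data is there.
        while len(self.buf) < n:
            block = self.src.read(self.block_size)
            if not block:
                break
            self.buf += block
            self.src.read(self.extra_size)  # discard the inserted word
        out, self.buf = self.buf[:n], self.buf[n:]
        return out


# Build a fake release-version stream: 2560 payload bytes with a
# 32-bit word after every full kilobyte.
payload = bytes(range(256)) * 10
chunks = [payload[i:i + 1024] for i in range(0, len(payload), 1024)]
packed = b"".join(c + (b"\xde\xad\xbe\xef" if len(c) == 1024 else b"")
                  for c in chunks)

reader = BlockSkippingReader(io.BytesIO(packed))
print(reader.read(len(payload)) == payload)  # True
```

For the demo version the same decoder can read from the plain file object directly, since both expose the same `read(n)` call.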

Another thing is that the LZ-compressed format had a 17-byte header which the decoder skipped. Of course it made me suspect it of being a third-party compression scheme, and after searching around it turned out to be Diet (you may remember it being used as an executable compressor but apparently it had other uses). It somewhat reminded me of MidiVid Lossless as it is yet another codec reusing a third-party general compressor (with special preprocessing for executables, which was a dead giveaway).

In either case, both flavours of this ANM format are now supported by na_game_tool (and will be the part of the next release).

Revisiting QPEG

Thursday, March 27th, 2025

Since I’ve done all improvements to NihAV that I wanted to do (beside vague “improve Indeo 3 encoder somehow” or “add some interesting format support”), I decided to look at the formats in my backlog and discovered that I have a QPEG player with a couple of DVC samples. Considering that I’ve REd their VfW codec over two decades ago, I had to look at this as well.

It turned out to be a straightforward format with static palette and video frames packed with moderately complex RLE (it has opcodes for run/copy/skip and literals). The most interesting thing there to me was that values without high bit set are treated as literals, or rather as indices in the remapping table (which is the first 128 bytes of a frame). Considering that low 20 colours of the palette seem to be unset, it makes some sense.
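The literal remapping can be illustrated with a tiny Python sketch. Only two facts are taken from the description: the remapping table is the first 128 bytes of a frame, and a value without the high bit set is an index into it. The function names are hypothetical and the run/copy/skip opcodes are deliberately left out since their encoding is not described here.

```python
REMAP_SIZE = 128  # the remapping table is the first 128 bytes of a frame


def split_frame(frame):
    """Separate the remapping table from the packed data after it."""
    return frame[:REMAP_SIZE], frame[REMAP_SIZE:]


def literal_to_palette_index(remap, value):
    """Translate a literal (high bit clear) into a palette index."""
    if value & 0x80:
        raise ValueError("values with the high bit set are opcodes")
    return remap[value]


# Fake frame whose table maps literal N to palette entry N + 20,
# i.e. it skips 20 low palette entries; the packed data is made up.
frame = bytes(n + 20 for n in range(REMAP_SIZE)) + bytes([0x05, 0x7F])
remap, packed = split_frame(frame)
print([literal_to_palette_index(remap, b) for b in packed])  # [25, 147]
```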

The hardest part was to read the binary specification. The executable uses Phar Lap 386 extender, so it’s actually stored in P3 format right after the loader. At least I have some experience with loading such formats when I messed with Linear eXecutable format before Ghidra plugins were available (and sometimes afterward as well, since e.g. neither of two known plugins managed to load Wing Nuts executable). Also I managed to spot that 0xE0 byte happens at the end of packed frame, so I guessed it was the end data marker and searched for the code using it as such. I’ve managed to locate four RLE decompression functions, all probably functionally identical, and after figuring out other details (like where remap table comes from) I ended up with the decoder that works on all four known samples just fine.

Overall, it’s nothing particularly complex but it was still nice to look at.

Random NihAV news

Saturday, March 8th, 2025

Since I can’t do anything about the world’s largest countries being run by over 70-year old dictators, I try to find a distraction elsewhere. It usually works only until the next time I read the news. Anyway, here’s something different for a change.

With all the work on na_game_tool I’ve been mostly neglecting the original NihAV (not that it makes much difference). So I’ve tried to improve it a bit and I have something to report.

First of all, I decided to make ARMovie support more complete (after discovering that e.g. the only thing that can play Eidos Escape 122 codec is their own DOS player). So I’ve added all three video codecs with known samples (Escape 122, 124 and 130) as well as their ADPCM codec. Sadly Escape 122 description in The Wiki is somewhat unclear, so I referred to the original DOS player for it. Similarly I looked at ADPCM decoding because what libavcodec produces is not quite right (but I guess Paul can tell you more stories about all those IMA ADPCM flavours). So I guess now my project has the most complete ARMovie formats support out there. There’s only one other third-party format I’m aware of (The Complete Animation film, it may be stand-alone or encapsulated in RPL) and I might get to it eventually.

The other thing is TrueMotion S. When you think there’s nothing unknown left about it, it manages to surprise you anyway. So I was looking at Discmaster search results for potential candidates (“aviAudio” is a good format for that—sometimes it is really AVI with audio track only, sometimes it is AVI with unknown video codec) and found three .duk files that were TM1 in AVI and which could not be decoded. It turned out to use codebook 0—which is not present in the open-sourced version of the decoder. Another thing is that it does not use so-called fat deltas (i.e. larger difference values), so code zero means eight zeroes instead of an escape value. Remarkably, this is demo of The Horde game by Toys For Bob, known to employ yet another exotic version of the codec in their 3DO version of Star Control II. It makes me wonder if they had the same relationship with Duck as JVC NWC with RAD (you know, the company which released a game with Bink-b videos not found anywhere else and bundling Bink-d decoder with one of their other games—all while others started with Bink-f or later).

Hopefully I’ll be able to do more in the future, but I wanted to share these stories while they’re still fresh.

Some words about AVSS format

Thursday, December 5th, 2024

Back in the day I had a cursory glance at AVS. Out of curiosity I decided to add AVSS format support to NihAV. Despite it looking like an obscure format nobody has ever heard of, it’s well-documented. Under the name of Intel DVI (you know, Digital Video Interactive—mostly remembered for DVI ADPCM under the name IMA ADPCM).

Anyway, I’ve managed to locate just one sample containing single RTV2.1 stream so I was curious how it fares.

Well, it turned out to be the standard Indeo 2 format but with the actual image for some reason being down-sampled compared to what the headers claim (160×100 instead of 320×200). Nothing much to say but it was still curious to see that system from the past.

Looking at AOL ART format

Friday, November 15th, 2024

Since I have nothing better to do (beside some slight NihAV refactoring) and somebody told me about it, I decided to look at the format. Apparently back in the day The Multimedia Mike also attempted to research some information about it but I don’t think anything substantial came out from it.

Anyway, here’s what everybody knows about it: as apparent from the name, it was developed by Johnson-Grace company, it combines a lot of different image compression methods and formats (so apparently you can have a slide show with an accompanying MIDI or speech), and it splits the image into tiles and tries to compress them using whatever method fits best.

So, here are some additional details.

Audio codec is a rather common speech codec (LPC plus quality improvement post-filters) with one peculiarity: internally it decodes 16-bit samples yet outputs 8-bit PCM.

Slide show: I had just a cursory glance but it seems to mix various kinds of content with the slide show commands (like displaying next image) in a single file. Of course it’s last in my priorities list.

Image formats: now that’s where the real fun is. It handles about twenty different chunk types (even if most of them are useless and provide some image information at best) and recognizes (and skips) about the same amount too. I’m still struggling with the code but there seem to be three types of compression: LZ77-based lossless compression, lossy compression with the same coding for coefficients that probably uses wavelet coding, and another lossy compression (for palette-based images?). So far the only things I’m sure about are that it employs LZ77-based compression that reminds me of deflate with dynamic codebooks (but differs from it or DCL) and that it seems to code signed coefficients while at it; the other thing is there are way too many functions for converting palette formats (usually between 24- and 32-bit RGB but quite often it’s between 32-bit RGB and 32-bit RGB in the same format but as an integer or an array of bytes).

In either case I’m in no hurry and can keep digging into it at my leisure.

MPEG-4 ASP: done for now

Tuesday, October 15th, 2024

In my last post I mentioned I need to deal with MP3 in AVI and multi-threaded decoding. The former turned out to be a simple bug (I should’ve not trusted AVI header reporting 12-bit audio), and I gave up on the latter.

The main reason for that is what seems to be the main contribution of MPEG to the world of video coding, namely B-frames. While the idea behind them is reasonable (to code scene transitions or smooth movements as an interpolation between two keyframes), practical implementation brings headaches because those frames are coded in an order different from the display order (after all, you can’t interpolate between two frames if you haven’t decoded both of them). And of course it got worse in H.264 and later codecs where B-frames can reference other B-frames so you need to code information about the frame structure (references and how to update them).
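To make the reordering concrete, here is a small Python sketch of the classic scheme for streams where B-frames only reference I- and P-frames (the MPEG-4 ASP case, not the H.264 one). It is a generic illustration rather than code from the NihAV decoder; the function name and the `(type, label)` tuples are made up for the example.

```python
def display_order(coded_frames):
    """Turn coded order into display order for simple B-frame streams.

    A reference frame (I or P) is held back until the next reference
    arrives, because the B-frames coded after it are shown before it.
    """
    held = None
    out = []
    for ftype, label in coded_frames:
        if ftype == "B":
            out.append(label)        # B-frames are shown immediately
        else:
            if held is not None:
                out.append(held)     # previous reference is now due
            held = label
    if held is not None:
        out.append(held)             # flush the last reference frame
    return out


coded = [("I", "I0"), ("P", "P3"), ("B", "B1"), ("B", "B2"),
         ("P", "P6"), ("B", "B4"), ("B", "B5")]
print(display_order(coded))
# ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
```

The one-frame delay introduced by holding the reference is exactly what a container has to account for in its timestamps.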

And the problem with MPEG-4 ASP is that while it can have B-frames, its popularity is tied more to the AVI container which lacks means to signal frame reordering (fun fact: the MPEG-4 ASP video files in MOV that I have would be perfect candidates for B-frames but lack them entirely). Of course later there were other containers gaining popularity like Matroska or OGM (or even MP4 occasionally) but the gilded age seems to be tied to AVI. And of course that created difficulties.

If you have I- and P-frames only, there’s nothing to care about, but multi-threading won’t be that effective either. Newer implementations (Xvid 1.3.7 is rather fresh BTW) output B-frames as is so good luck knowing that in advance and performing reorder. In this case I see if the coded timebase is the same as the one reported by the container and simply re-assign timestamps from the bitstream (and if this does not work, well, tough luck). But there was a funnier intermediate solution with one frame containing data for both P- and B-frame and the following frame being a skip frame, so a decoder could replace it with an appropriate frame. This reminds me of Indeo 4 which performed the same trick. And making that a multi-threaded decoding would be a mess requiring either saving frame data and scheduling it for later decoding or scheduling both frames and then trying to tie it to the upcoming frame decoding request. And playing back typical video takes about 20% of CPU load…

Thus I’ve committed what I find to be good enough for my needs and I shall forget about it, at least until some decoding artefact annoys me enough. There’s more boring and unremarkable stuff I want to do on NihAV, and working on this decoder reminded me that it can always be worse (or uglier).

P.S. For some reason repository cloning or updating from git.nihav.org still does not work (but the web interface is fine). I’ve reported the problem and hopefully it will be resolved soon. I suspect that the provider blocked it because of too many synchronisation requests from other sites trying to mirror the repositories. In either case I’m still grateful for the hosting.