As I mentioned last month, I decided to reverse engineer the Motion Pixels codecs, and after a lot of failed attempts to make the decoder work I’ve finally got something.
First of all, here are two frames from different videos.
MVITEST.AVI:
And a frame from SHOT09.AVI (from Apollo 18 game):
As you can see, the images are not perfect but recognizable already. And just last week it would’ve been a mess of colours with some barely recognizable distorted shapes.
The reason for this is that while MVI1 (and MVI2) is indeed based on the stand-alone MVI format (I call it MVI0 for clarity), there are some nuances. At first glance MVI0 and MVI1 are the same: the sequence of steps is identical, and indeed you can use the same code to decode data from either, but the reconstruction steps differ significantly.
Essentially there are four steps there: decode the rectangles defining which parts of the frame will be left intact or filled with one colour, decode the deltas used to reconstruct the rest of the pixels, use some of those deltas to generate predictors for each line (where needed), and use the remaining deltas to reconstruct the rest of the pixels. Additionally, MVI employs a chroma subsampling mode, so only one pixel in a 1×1 to 4×4 block (depending on mode) has deltas applied to chroma; all other pixels update only the luma component. So if you don’t do it correctly you may end up applying e.g. deltas intended for luma to chroma components and vice versa. That’s what I got for a long time and could not understand why.
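To make the chroma subsampling rule above more concrete, here is a minimal Python sketch. It is not the actual decoder: the function names are made up, and both the position of the chroma-carrying pixel inside a block (top-left here) and the order of deltas (luma, then U, then V) are assumptions picked purely for illustration.

```python
def carries_chroma(x, y, block_w, block_h, off_x=0, off_y=0):
    """True if pixel (x, y) is the single pixel of its block that gets chroma deltas.

    off_x/off_y select which pixel of the block that is; the default
    (top-left) is an assumption, not something taken from the format.
    """
    return x % block_w == off_x and y % block_h == off_y


def reconstruct_row(prev, deltas, y, block_w, block_h):
    """Apply deltas to a row of (Y, U, V) predictors, consuming them in order.

    Hypothetical delta layout: one luma delta per pixel, followed by
    U and V deltas only for the chroma-carrying pixel of each block.
    """
    it = iter(deltas)
    out = []
    for x, (luma, u, v) in enumerate(prev):
        luma += next(it)
        if carries_chroma(x, y, block_w, block_h):
            u += next(it)
            v += next(it)
        out.append((luma, u, v))
    return out
```

With one shared delta stream like this, guessing the wrong pixel position shifts every following delta, so values meant for luma land in chroma and vice versa, which is the kind of colour mess described above.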
It turns out that vertical prediction has its pixel sampling at a different position (or maybe it’s scrambled in the same way as line prediction). There, for the most common mode (one set of chroma components per 4×4 block), each group of four lines is decoded in reverse order (i.e. 3, 2, 1, 0, 7, 6, 5, 4, …). For 2×2 blocks the lines are reversed only within each pair. You can see artefacts of wrong prediction on the Apollo frame.
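The line order described above can be sketched in a few lines of Python. The function name is hypothetical, and the handling of a frame height that is not a multiple of the group size (the leftover lines are simply reversed as a shorter group) is an assumption, since the post does not cover that case.

```python
def line_decode_order(height, group):
    """Return line indices in decoding order, reversing each group of lines.

    group is 4 for the 4x4 chroma mode and 2 for the 2x2 mode.
    """
    order = []
    for start in range(0, height, group):
        end = min(start + group, height)  # assumption: a short last group is reversed too
        order.extend(range(end - 1, start - 1, -1))
    return order


print(line_decode_order(8, 4))  # [3, 2, 1, 0, 7, 6, 5, 4]
print(line_decode_order(4, 2))  # [1, 0, 3, 2]
```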
Anyway, having a recognisable picture means that the hardest part (for MVI1) is done, so all that is left now is to fix the remaining bugs, refactor the code and move to MVI2. There are other annoying things there but now I know how to deal with them.
BTW, if you’re curious why it takes so long, the problem is the binary specification being obfuscated to the point that Ghidra refuses to decompile most of the MVI1 decoder functions and can’t do much about the reconstruction functions, since they’re a mess of spaghetti code (probably written in assembly language directly), so it’s more of a state machine than a decoding loop. And they abuse segment registers to access different parts of the context (and this must be the reason why it cannot work under OSes from this millennium). I got some progress when I resorted to debugging this mess by running the MVI2 player in OllyDbg under Win95 (emulated in DosBox-X) and constantly referring to Ghidra to see where to put a breakpoint to trace a certain function. That process is definitely not fun for me but it gave results.
Overall, probably it could’ve gone better but I hope the rest won’t take as long.
Good progress. I revisited a MotionPixels movie-on-a-CD a few years ago and wondered why it was still unsupported in the open source world. Didn’t realize they made efforts to obfuscate the decoder. They really thought they made something valuable in this codec.
More likely it was human-assembly coded. And very novel coding that is smarter than gif.
I suspect it was not intentional obfuscation but rather a side effect of trying to make decoding fast even on machines that were anemic by the standards of those times. Hence writing it in assembly and using all kinds of crazy tricks to shave off cycles. The MVI2 decoder also started to pack deltas into nibbles, so the frame rendering code doubled in order to handle decoding pixel deltas continuing from either the high or the low nibble. And there are new features of course.
But it was the norm at the time. Look at Duck Truemotion. IIRC they also attempted to build a chain of functions to decode a whole line with different block types, in C with some platform-specific assembly (plus other tricks like SWAR). There are other similarities between them besides YUV-based DPCM coding with an obfuscated decoder (MVI1 also has golden frames, to give one example). I hope this all will make a fun post about format peculiarities and perversions in the end.
Yeah, this tracks (the theory about the hand-crafted ASM). I suddenly recall exchanging emails with someone involved in the company who was still proud that the codec outperformed its contemporaries in decoding speed.