Archive for January, 2021

Alaris VGPixel

Sunday, January 31st, 2021

As I mentioned in the previous post, I wanted to look at this codec because it might be related to the whole XVD family. It turned out to be completely different even if it’s from the same company and bit reading is done in the same way (which is a very minor thing).

So the codec itself is a delta codec that reminds me of BMV from Discworld Noir a lot. Essentially it just reads opcodes using static codebook and performs some action depending on them:

  • repeat previous pixel value 1/2/4 times or the fixed amount transmitted in the frame header;
  • skip 1/2/4 pixels or the fixed amount of pixels transmitted in the frame header;
  • copy top/bottom/right/left neighbour pixel;
  • set pixel to the average of top/bottom and right/left neighbour pixels;
  • or decode pixel difference (for all three components at once) and add it to the previous value.

The main problem REing it was figuring out the decoding loop since for performance reasons decoder tries to group opcodes and handle them at once thus creating 200-something cases to handle. Plus those opcode handlers are just pieces of code that work on data in registers and jump to the function that dispatches next opcode. Of course it does not decompile properly but the amount of instructions is small anyway and it’s the same code repeated over and over again.

Maybe I’ll even write a decoder for it sometime later.

Looking at XVD

Saturday, January 30th, 2021

A week ago a certain XviD developer made a request to look at something more compressed called XVD and so I did.

Ghidra and DPMI

Sunday, January 24th, 2021

When trying to look inside an old game you might have a problem if the executable is 32-bit one running with some DOS extender.

Most people do not know (and those who know probably want to forget it) that beside the usual 16-bit COMs and EXEs (that can have several segments and sometimes an overlay too—an external file for loading code or data from disk) you can have 32-bit DOS applications with a very special interface to access so called protected mode. Essentially you have two executables bundled together: actual 32-bit executable in its own format and the loader for it (or a stub to call an external loader usually called dos4gw.exe). Kinda like PE format but more flexible.

So what’s the problem with loading them? For starters Ghidra does not recognize the extended format and loads just the loader (an external LX loader plugin did not help either; but when I tried to load some ActiveX components for XVD decoding, their format was detected as LX). Another problem is that some extenders (like the one from CauseWay) also compress the payload. Luckily enough there’s DOS/32A extended that is shipped with a utility to extract the so-called linear executable which you can’t run but you can study it.

And this is not hard but somewhat tedious since (at least in my cases) it can be loaded only as raw image. From the LX header you can extract the position of actual segment data, segment sizes and their base addresses; then you can use that information to create custom memory segments for them. Now you have some (usually two) segments and one problem: the segments do not address the full virtual address but rather an offset e.g. if you have code segment starting at 0x10000 and data segment starting at 0x90000 you might have mov eax, [0x4bc] which really should be mov eax, [0x904bc]. That is why the first hundred kilobytes or more of LX is dedicated to so-called “fixups” that tell you how you should correct the addresses at the specific places in loaded data (or code). Luckily Ghidra allows patching the addresses in the instructions and re-assigning segment base addresses to have less places to patch (and calls/jumps almost always use relative offsets so it’s not an issue).

And how to find those wrong addresses for specific data item to patch? Memory search for low word part of the address works just fine.

Overall, maybe it is not the most convenient way to deal with this format but it works sufficiently good for me. Who knows, maybe in the future I’ll look at more games for not yet covered formats. At least now I know how to do that.

A look on FutureVision formats

Saturday, January 23rd, 2021

As you all know, this is a studio known primarily for the Harvester game and while that’s not a game I’d like to play I still decided to take a look at it. What do you know, the formats for both music and video are documented. The music format is supported by Game Audio Player since ages and the video format was documented on The Wiki more than a decade ago by the guy responsible for reverse engineering most game formats, Vladimir Gneushev (or simply VAG). And yet there are nuances in them…

Music format is based on IMA ADPCM with minimalistic file header and the default predictor. Which means it needs some time to adapt to the actual coming signal amplitudes. In result the old format description based on reverse engineering recommends skipping first 7-57 bytes right after the header because of the garbage sound it produces. When I looked into the binary reference (an adventure that will be described in an upcoming post), it turned out that in certain cases they simply zero out the first 50 decoded samples.

Video format is rather typical of its time: divide frame into 4×4 blocks, depending on coded flags either skip updating it, read full block data, or read two colours and 16-bit pattern telling how to paint them in a block. No additional compression, even audio is stored as PCM. There is one thing though: the lowest row should not be decoded (and the game does not do it either). It is probably an encoder bug but e.g. for skip frames it codes just bit flags that all should be zero and yet last byte there contains some non-zero garbage. Or in a frame containing changes and not just skip blocks you also have some weird block flags that would signal you need to read block data past the end of the input frame data. This does not matter much since most of the videos in the game have not just 320×200 resolution but also large black borders as well. The only exception is credits.fst which is 640×400. Another fun fact is that I got a North American version and (since the company is North American as well) it has virginlogo.fst with Merit Studios logo (plus some videos mention the company as Future Vision and others as DigiFX Interactive). I know they had changed name in the process and had different European publisher but it’s still fun discrepancy.

Meanwhile I’ve also documented DERF format used in Stupid Invaders game (video format there is based on 8×8 DCT for block coding and motion compensation that involved block division and transforms like flipping the source, which may be in the same frame too). While neither of the codecs is complicated it is still a nice change from looking at the mainstream codecs of today (it’s not that there are many non-mainstream codecs survived to look at either).

A curious Smacker-related bug

Friday, January 22nd, 2021

Here’s a thing that has been bothering me since I tested my Smacker decoder for NihAV.

It decoded various test files fine except for one game: Toonstruck.

A look at various video codecs from the 90s

Monday, January 18th, 2021

Since I had nothing better to do during Christmas “vacation” (it is the first time I’m in Germany at this time of year so of course I had nothing better to do) I looked at various codecs, mostly from last century, and wrote some notes about more interesting ones. Here I’d like to give some information about the rest lest I forget it completely.

  • Affinity Video—JPEG rip-off;
  • Lsvx—H.263 rip-off with possible raw frames;
  • Morgan TVMJ—I should’ve noticed “MJPEG” in the description sooner;
  • VDOWave 2—an unholy mix of H.263 and wavelets. It uses the coding scheme from H.263 (8×8 blocks, loop filter, halfpel motion compensation and even something suspiciously resembling OBMC) but blocks are coded as three 4×4 blocks that should be recombined using Haar transform into one 8×8 block. Plus there might be an additional enhancement layer for the whole frame based on the same wavelet as well.

And I should mention VSS Codec Light. While it is hard to get through all those levels of C++ abstractions, looks like it has arithmetic coding with static models, 4×4/8×8/16×16 blocks, 8×8 DCT, and five different wavelets variants to boot. At least it’s not another JPEG or H.263 rip-off.

Overall it feels that back in those days you had mostly JPEG rip-offs, H.263 rip-offs and wavelet-based codecs. I tried to look at more of the latter but one of the codecs turned out to be an impenetrable mess of deeply nested calls that seem to add stuff to the lists to be processed later somehow and another codec demonstrated that Ghidra disassembler has bugs in handling certain kinds of instructions involving FS register IIRC. In result it thinks the instruction should be a byte or two longer than it really is. So unless this is fixed I can’t look at it. There are still plenty old codecs I’ve not looked at.

A look at ACT-L2

Saturday, January 9th, 2021

This is yet another video codec from the 90s used for streaming and completely forgotten now. But since I had nothing better to do I decided to look at it as well.

Essentially it is another H.263 rip-off with a twist. From H.263 it took overall codec design (I/P-frames, 8×8 DCT, DC prediction, OBMC) but the data coding is special. For starters, they don’t use any codebooks but rather rely on fixed-width bitfields. And those bit values are not written as they occur but rather packed together into separate arrays. There’s a way to improve compression though: those chunks can be further compressed using binary arithmetic coder with an adaptive model to code bytes (i.e. you have 256 states and you select state depending on which bits you have already decoded).

Additionally it has somewhat different method of coding block coefficients. Instead of usual (zero-run, level, end-of-block) triplets assigned to a single code it uses bit flags to signal that certain block areas (coefficients 0-3, 4-7, 8-11 and 12-63) are coded and for the first three areas it also transmits bit flags to signal that the coefficient is coded. And only the last area uses zero-run + level coding (using explicit bitfields for each).

Overall it’s an interesting idea and reminds me of TM2 or TM2X since those codecs also used data partitioning (and in case of TM2 data compression as well).

Bink going from RAD to Epic

Friday, January 8th, 2021

So in the recent news I read about Epic Games acquiring RAD Game Tools and I guess I have to say something about that.

RAD was an example of a good niche company: it has been providing game makers with essential tools for more than quarter of century and offering new things too. Some people might remember their Miles Sound System and Smacker format from DOS times, some have heard about Bink video, others care about their other tools or recent general data compression library that even got hardware decoding support on PS5. And one of the under-appreciated things is that the developers are free to publish their research so you can actually read how their things were developed and improved. If that does not convince you of their competence I don’t know what can. (Side note: considering that you usually get useless whitepapers that evade describing how things work, the posts from Charles or Fabian are especially outstanding).

Since I’m no expert in business matters and lack inside knowledge I don’t know if it’s a good or bad step for a company and its products. Nevertheless I wish them good luck and prosperous and interesting future even if we have Electronic Arts to show us an example of what happens when a small company gets bought by a large game developer and publisher.

P.S. I would be grateful if they fill in missing details about Bink2 video but this is unlikely to happen and probably somebody caring enough about it should finish the reverse engineering.

A look on weird audio codec

Thursday, January 7th, 2021

Since I still have nothing better to do I decided to look at ALF2CD audio codec. And it turned out to be weird.

The codec is remarkable since while it seems to be simple transform+coefficient coding it does that in its own unique way: transform is some kind of integer FFT approximation and coefficient coding is done with CABAC-like approach. Let’s review all details for the decoder as much as I understood them (so not much).

Framing. Audio is split into sub-frames for middle and side channels with 4096 samples per sub-frame. Sub-frame sizes are fixed for each bitrate: for 512kbps it’s 2972 bytes each, for 384kbps it’s 2230 bytes each, for 320kbps it’s 2230/1486 bytes, for 256kbps it’s 1858/1114 bytes. Each sub-frame has the following data coded in it: first and last 16 raw samples, DC value, transform coefficients.

Coding. All values except for transform coefficients are coded in this sequence: non-zero flag, sign, absolute value coded using Elias gamma code. Transform coefficient are coded in bit-slicing mode: you transmit the lengths of region that may have 0x100000 set in their values plus bit flags to tell which entries in that actually have it set, then the additional length of region that may have 0x80000 set etc etc. The rationale is that larger coefficients come first so only first N coefficients may be that large, then N+M coefficients may have another bit set down below to bit 0. Plus this way you can have coarse or fine approximation of the coefficients to fit the fixed frame size without special tricks to change the size.

Speaking of the coder itself, it is context-adaptive binary range coder but not exactly CABAC you see in ITU H.26x codecs. It has some changes, especially in the model which is actually a combination of several smaller models in the same space and in the beginning of each sub-model you have to flip MPS value and maybe transition to some other sub-model. I.e. a single model is a collection of fixed probabilities of one/zero appearing and depending on what bit we decoded we move to another probability that more suits it (more zeroes to expect or more ones to expect). In H.26x there’s a single model for it, in ALF2CD there are several such models so when you hit the edge state aka “expect all ones or all zeroes” you don’t simply remain in the state but may transition to another sub-model with a different probabilities for expected ones-zeroes. A nice trick I’d say.

Coder also maintains around 30 bit states: state 0 is for coding non-zero flags, state 1 is for coding value sign, states 2-25 are for coding value exponent and state 26 is for coding value mantissa (or it’s states 2-17 for exponent and state 18 for mantissa bits when we code lengths of transform coefficient regions).

Reconstruction. This is done by performing inverse integer transform (which looks like FFT approximation but I’ve not looked at it that close), replacing first and last 16 coefficients with previously decoded ones (probably to deal with effects of windowing or imperfect reconstruction), and finally undoing mid/stereo for both sub-frames.

Overall it’s an interesting codec since you don’t often see arithmetic coding employed in lossy audio codecs unless they’re very recent ones of BSAC. And even then I can’t remember any audio codec using binary arithmetic coder instead of multi-symbol models. Who knows, maybe this approach will be used once again as something new. Most of those new ideas in various codecs have been implemented before after all (e.g. spatial prediction in H.264 is just a simplified version of spatial prediction in WMV2 X8-frames and quadtrees were used quite often in the 90s before reappearing in H.265; the same way Opus is not so modern if you know about ITU G.722.1 and heard that WMA Voice could have WMA Pro-coded frames in its stream).