Archive for the ‘Game Video’ Category

A side project for NihAV

Sunday, July 7th, 2024

Since I still have nothing much to do (and messing with browsers is a bad idea unless the situation is desperate), I decided to make a NihAV-lite project. So announcing na_game_tool.

This is going to be a simple tool to convert various game and image formats (and related) into image sequence, WAV or raw AVI (which then can be played or processed with anything conventional). I’ve begun work on it already but the release will happen when at least when I implement all planned features (which is writing image sequence in BMP format, AVI output and porting two dozen of half-baked decoders I wrote to test if I understood the format).

Why a new project? Because I have nothing better to do, it still may be marginally useful for somebody (e.g. me) and I can do some stuff not fitting into NihAV (for example, decode 3DO version of TrueMotion video split into four files) and I don’t have to bother about other stuff that fits demuxer-decoder paradigm poorly and requires inventing ways to convey format-specific information from the demuxer to the decoder. In my case I simply feed the input name to the input plugin and it returns frames of decoded audio or video data. Some hypothetical Antons might ask a question how to deal with the formats that use variable delay in milliseconds between frames instead (and I’ve implemented one such format already). To which I can reply that one can fit a square peg in a round hole in two ways—by taking a larger hole or by applying more force. The former corresponds to having, say, fixed 1000fps rate and send mostly the same frames just to have constant rate; the latter is forcing a constant framerate and hoping for the best. I chose the latter.

The design is rather simple: there’s a list of input plugins and output plugins. Input plugin takes input name, opens whatever files it needs, outputs information about the streams and then decoded data. Output plugin takes input name, creates whatever files it needs, accepts stream information and then receives and writes frames.

Probably there’s a better alternative with librempeg but you’d better go read about it on Paul’s blog.

REing non-Duck VP X1

Thursday, June 13th, 2024

While I’m still looking for a solution on encoding video files with large differences with TrueMotion, I distract myself with other things.

Occasionally I look at dexvert unsupported formats to see if there’s any new discovery documented there in video formats. This time it was something called VPX1.

I managed to locate the sample files (multi-megabytes ones starting with “VPX1 video interflow packing exalter video/audio codec written by…” so there’s no doubt about it) and an accompanying program for playing them (fittingly named encode.exe). The executable turned out to be rather unusable since it invokes DPMI to switch to 32-bit mode and I could not make Ghidra decompile parts of the file in 386 assembly instead of 16-bit one (and I did not want to bother to decompile it as a raw binary either). Luckily the format was easy to figure out even without the binary specification.

Essentially the format is plain chunk format complicated by the fact that half of the chunks do not have size field (for palette chunk it’s always 768 bytes, for tile type chunk it’s width*height/128 bytes). The header seems to contain video dimensions (always 320×240?), FPS and audio sampling rate. Then various chunks follow: COLS (palette), SOUN (PCM audio), CODE (tile types) and VIDE (tile colours). Since CODE is always followed by VIDE chunk and there seem to be a correlation between the number of non-zero entries in the former and the size of the latter, I decided that it’s most likely a tile map and colours for it—and it turned out to be so.

Initially I thought it was a simple bit map (600 bytes for 320×240 image can describe a bit map for 4×4 tiles) but there was no correlation between the number of bits set and bytes in tile colours chunk. I looked harder at the tile types and noticed that it forms a sane 20×30 picture so it must be 16×8 tiles. After some more studying the data I noticed that nibbles make more sense, and indeed only nibbles 0, 1, 2 and 4 were encountered in the tile types. So it’s most likely 8×8 tiles. After gathering statistics on nibbles and comparing it to tile colours chunk size I concluded that type 2 corresponds to 32 colours, type 4 corresponds to 1 colour and type 1 corresponds to 16 colours. Then it was easy to presume that type 4 is single-colour tile, type 1 is downscaled tile and type 2 is a tile type with doubling in one dimension. It turned out that type 2 tile repeats each pixel twice and also uses interlacing (probably so video can be decoded downscaled on really slow machines). And that was it.

Overall, it is a simple format but it’s somewhat curious too.

P.S. There’s also DLT format in the same game which has similarly lengthy text header, some table (probably with line offsets for the next image start) and paletted data in copy/skip format (palette is not present in the file). It’s 16-bit number of 32-bit words to skip/zero followed by 16-bit number of 32-bit words to copy followed by the 32-bits to be copied, repeat until the end. Width is presumed to be 640 pixels.

P.P.S. I wonder if it deserves a support via stand-alone library named libvpx1 or libvpx and if this name is acceptable for Linux distributions.

Some words on IBM PhotoMotion

Thursday, June 6th, 2024

After a recent rant about search systems I decided to try to find any information about the format (I just happened to recollect that it’s supposed to exist). I don’t know if anybody was lucky but for me the search results were mentions in the list of FOURCCs, some passing references in two papers and that’s all. Now it will probably start returning more results from domain though 😉

So what should we do when a generic search engines fail? Resort to the specialised ones of course. Thanks to the content search feature of I was finally able to locate a CD which uses PhotoMotion technology with both video files and the official player (aptly named P7.EXE, I couldn’t have given it a better name myself). Even better, video files were encoded as both AVI and MM so I could check what output to expect.

Of course Peter’s decoder can’t handle them properly because of the larger header (26 bytes instead of usual 22 or 24 bytes) and uncompressed intra frames. But it was simple to write a simple stand-alone decoder for it to validate that both PhotoMotion and game samples are decoded fine.

This is no major achievement of course but at least it answers a question what that format is all about. So even if there’s still no information about an alleged VfW decoder, now we know what to expect from it.

All of legendary animation formats

Friday, May 31st, 2024

Since I’m still not in the mood to do something serious, I decided to play some adventure games from Legend Entertainment (you know, parser-based, mostly with EGA graphics). And some of their VGA games like Companions of Xanth or Eric the Unready contain full-screen cutscenes in unknown format. The earlier released Gateway II used the standard FLIC for many of those, SVGA games switched to .Q format—but nothing obvious for these ones. Blackstone Chronicles used QuickTime and thus is not interesting.

So I decided to look what ScummVM source code has to say about it (unrelated fun fact: it’s the only open-source project I donated some money to). Of course there’s a fork with some halfway done support of later Legend Entertainment adventure games, including its picture format support (no such luck for EGA-only games, it seems).

Apparently .PIC files are more a collection of sprites and even full backgrounds. Depending on the game one file may contain all room backgrounds, or just single area backgrounds plus related animations (e.g. flowing water or burning torches), or it may be character sprites, or—as one should expect—a cutscene.

Frames in .PIC may be full frames or delta frames. In the later case they only update a part of the screen (either by replacing an area or XORing it). The more interesting thing is how frame data is compressed. The reference code is reverse-engineered and not so informative, so it took some time to understand it. First apparently there are tables and code used to generate new tables, which turned out to be exactly what I suspected them to be—Huffman codebooks. After a bit more messing with the algorithm, it turned out to be yet another LZ77-based compressor with static codebooks and rather complicated coding that reminds me of deflate… Of course I got suspicious at this stage (it looked a bit more complex than in-house developed compression schemes for game engines usually are) and indeed, it turned out to be Pkware Data Compression Library. And I could’ve found that out simply by looking into strings in one of the overlay files. Oh well…

At least it’s yet another puzzle with formats solved. Also it’s the second time I recently encounter animation format using DCL for compression (previously it was Gold Disk animation). Which makes me wonder what other common LZ77 flavours were used in the animation and video formats. deflate (along with newer contenders like LZO, FLZ and such) is very common in screen-recording codecs. LZS was used in Sierra games .RBT, I vaguely remember RNC ProPack being used by some video format. Nightlong on Amiga used PowerPacker. Did anything use LZX (it came from Amiga before being bought by M$ so maybe it had a chance there)? LZRW? Homebrew schemes are dime a dozen (like all those LZSS variations), I wonder more about the (de facto) standard compression libraries being used in video compression. Anyway, I’ll document them as I encounter them 😉

Duck Control 1

Saturday, May 18th, 2024

Back in the day there was no Bob but he had toys nevertheless. And of those toys for Bob was Star Control II. It is not a game for me to play (since my reaction time is not good enough for space battles), its 3DO version mentions “TrueMotion “S” Video Compression by The Duck Corporation” in the credits. Now that’s more my thing! I looked at it once, found it to be a lot like TrueMotion 1 but split into different files and looked no further.

But recently I got a curious request: apparently some people want to reuse the player code out of the game for playing videos on the console. And since the console is too old to support even micro-transactions (despite coming from the EA founder) let alone FullHD AV1, they look for something more fitting, which is mostly TrueMotion and Cinepak. As I remember, this format offers a bit more flexibility than the usual TM1 in some aspects (since the tables are transmitted in the video data instead of being stored in the decoder), so writing an encoder for it may be more interesting than it was for plain TM1.

Anyway, here I’d like to talk about this technology and how it differs from the conventional TrueMotion 1.

A look at compressed game video formats

Friday, March 22nd, 2024

In order to distract myself from the thoughts why most politicians have their heads so deep in their asses that French ones out of all people seem to be the bravest, and what can I do to help destroying russia beside regular donations, here’s a post about completely unrelated REing work.

Since I had nothing better to do, I looked at the format use in Azrael’s Tear game and re-visited Talisman game. And while one of them is a 3D game from a British developer and another one is a 2D game from a German one, the cutscene formats used there have more in common than one would suspect (and their design distinguishes them from the majority of the formats).

It is common for various game formats to represent frames as small blocks with motion compensation or raw data, but I can’t remember any other such format that would use data compression on container level instead of it being a part of e.g. video data compression (of course there’s Ogg Matroska that implements such feature, but beside it I can’t think of any format doing that). In both of these formats (with not so creative extensions ANI and MOV) static Huffman compression is employed to compress several chunks of different data type. In ANI (used in Talisman game) data is split into groups of frames that start with a chunk defining which one of the predefined Huffman trees should be used to pack them all and to what symbols the codes should be assigned. In MOV (used in the other game, naturally) audio and video frame blocks are grouped into larger chunks and those chunks may be optionally compressed using static Huffman tree transmitted in the beginning of chunk payload.

ANI format features another peculiarity: there are other codecs that use motion compensation plus rotation (like formats from Cryo Interactive IIRC) but I can’t think about any other format that performs motion compensation and replaces one of the pixels in a block with a new value. Of course it is not a revolutionary idea but I haven’t seen it implemented like that before.

And that’s why I like those old game formats: they may be not the most effective ones but they contain more originality than the modern formats. The main problem is finding such a format—there are too many games released (which are also sometimes too hard to find) and the majority of them uses FLIC or Smacker anyway (or Cinepak in AVI or MOV for newer ones). But sometimes I encounter a mention of a game in some review and get lucky. I hope such finds happen more often though…

Looking at Motion Pixels

Tuesday, October 24th, 2023

There is this very Sirius (or Sirius Publishing, more precisely) family of video codecs (plus one container format) apparently developed by two guys (who like to spam their name even in junk sections of AVI files). Also initially it had its own container format but later they’ve started to target AVI.

Another peculiarity of this format is that initially it targeted games but later was also used as a crappy Video CD alternative.

Back in the day Gregory Montoir REd the original game format for one of the game engine re-implementations he’s famous for and donated the code to FFmpeg as well. Since that time I was curious whether that code can be adapted to play MVI1 and MVI2 as well but the codec itself turned me off.

The codec itself is perverted, both in code and interface. Also it’s inherently interlaced. Normally video codecs in AVI can be recognized by their FOURCC and pass additional configuration parameters in the additional header data. Here they decided to use half of FOURCC to pass configuration flags to the codec and use stream handler FOURCC (that most apps ignore) to tell their decoder should be used to handle it. This alone would make me want to not support it ever, but the binary specification is worse.

Looks like the code consists mostly of handwritten assembly because I don’t know which compiler may generate this madness. There are many versions of the codecs, most of them are 16-bit and the 32-bit version is no better. For starters, it uses segments.

Not so many people remember DOS times and its memory models, even fewer remember them fondly. And almost nobody remembers that in 32-bit mode you can also use FS and GS registers to have custom addressing modes. Well, this codec uses them: it sets FS to the context pointer so context fields are accessed as mov EAX, dword ptr[1A8h] while global variables are accessed as mov EAX, dword ptr GS:[SYM] and of course no decompiler likes that. I was able to work around it in Ghidra by creating a new segment starting from zero but it’s still annoying.

Another thing is (ab)using registers to the full extent. Functions pass their parameters implicitly in the registers, using stack only to save those values before a loop or form a list of rectangles to process. And of course it uses this annoying (for the decompiler) feature as using the same register for two loop counter (e.g. top byte for the outer loop and low byte for the inner loop). As the result Ghidra can’t decompile it properly or even ignores whole blocks of the code because to its belief they can’t be invoked—and it’s still better than decompiling 16-bit version of MVI1 which made decompiler commit suicide. As the result some functions are easier to hand-translate from the assembly.

In either case looks like despite all the improvements it remains about the same as the initial version: data is coded as 5-bit YUV internally and stored using Huffman codes, quantisation and change maps (rectangles that tell which areas to update/fill). MVI2 can use ten different frame decoding modes that differ in how the deltas are coded but essentially it remains the same. They have not even gotten to introducing a proper motion compensation it seems.

So, now I’ve had a good long look at the codec, found nothing interesting there that was not known before and can forget about it. If only there was something more interesting to look at…

Encoding Bink Audio

Wednesday, October 11th, 2023

As I mentioned in the introduction post, Bink Audio is rather simple: you have audio frames overlapped by 1/16th of its size with the previous and the following frame, data is transformed either with RDFT (stereo mode) or DCT-II (per-channel mode), quantised and written out.

From what I can tell, there are about three revisions of the codec: version 'b' (and maybe 'd') was RDFT-only and first two coefficients were written as 32-bit floats. Later versions shaved three bits off exponents as the range for those coefficients is rather limited. Also while the initial version grouped output values by sixteen, later versions use grouping by eight values with possibility to code a run for the groups with the same bit width.

The coding is rather simple, just quantise bands (that more or less correspond to the critical bands for human ear), select bitwidth of the coefficients groups (that are fixed-width are not related to the band widths) and code them without any special tricks. The only trick is how to quantise the bands.

Since my previous attempts to write a proper psychoacoustic model for an encoder failed, I decided to keep it simple: the encoder simply tries all possible quantisers and selects the one with the lowest value of A log2 dist+λ bits. This may be slow but it works fast enough for my (un)practical purposes and the quality is not that bad either (as much as I can be trusted on judging it). And of course it allows to control bitrate in rather natural way.

There’s one other caveat though: Bink Audio frames are tied to Bink Video frames (unless it’s newer Bink Audio only container) and thus the codec should know the video framerate in order to match it. I worked around it by introducing yet another nihav-encoder hack to set audio timebase from the video so I don’t have to provide it by hand.

So that’s it. It was a nice experiment and I hope (but not expect) to think of something equally fun to do next.

Bink encoder: coefficients coding

Tuesday, October 10th, 2023

Somewhat unrelated update: I’ve managed to verify that the output of my decoder works in the Heroes of Might and Magic III properly even with sound after I fiddled with the container flags. The only annoyance is that because of DCT discrepancies sometimes there are artefacts in the form of white or black dots but who would care about that?

At last let’s talk about the one of the most original things in the Bink Video format (and considering the rest of the things it has, that’s saying something).

Bink encoder: doing DCT

Monday, October 9th, 2023

Again, this should be laughably trivial for anybody familiar with that area but I since I lack mathematical skills to do it properly, here’s how I wrote forward DCT for inverse one.