Archive for the ‘Game Video’ Category

Bink encoder: doing DCT

Monday, October 9th, 2023

Again, this should be laughably trivial for anybody familiar with that area but I since I lack mathematical skills to do it properly, here’s how I wrote forward DCT for inverse one.
(more…)

Bink video encoder tricks

Sunday, October 8th, 2023

As I mentioned in the introductory post, there are nine block coding modes and my encoder tries them all to see which is good (or good enough) to be used. What I have not mentioned is that some of those blocks have sixteen variations (quantisers for DCT-based blocks and scan patterns for RLE blocks), which makes search even longer.

First of all, I use the rather obvious approach to trying blocks: order them in a sequence, motion blocks types first, then simple ones (fill, two-colour pattern, RLE) with intra DCT and raw being the last. And if the block metric is low enough then the block is good enough and the rest of the modes should not be tried.

Second, there are two encoding modes: quality-based and bitrate-based. For bitrate-based mode I simply manipulate lambda ratio between block distortion and bits and that’s it. For quality mode I actually collected statistics on different files at different quality settings to see what block types are used more or less. I.e. on the highest quality setting intra blocks are not used at all while on low quality settings you don’t see lossless residue or raw blocks.

So I simply used the option to disable different block coding modes (introduced to make debugging other coding modes easier) and modified the list depending on quality setting.

Then I went even further and observed the statistics of the DCT block quantisers used depending on quality settings. As one could reasonably expect, low quality setting resulting in quantisers 12-15 (and rarely 11) while high quality setting used quantisers 0-10 the most. So limiting the quantisers depending on quality was the next logical step.

And here are some other tricks:

  • on lower quality levels I use non-zero threshold for RLE block so that e.g. a sequence 4, 3, 5, 4 will be treated as a run of fours;
  • in the previous version of the encoder I used Block Truncation Coding (probably the only possible application of it even if a bit unwise), now I’m simply calculating averages of the values above/below block mean value;
  • in rate-distortion metric I scale the bits value, doing otherwise often leads to the essentially free skip block being selected almost every time and blocky moving pictures are better than slightly less blocky still one.

Of course it is easy to come with more but it’s been enough for me.

Bink bitstream bundling

Saturday, October 7th, 2023

Bink Video organises frame data somewhat on per-row basis and separated into individual streams. That means that data is sent in portions in interleaved fashion: first it is block types, then it’s colour values for e.g. fill or pattern blocks, then it’s motion values and so on. Before each row decoding can start, decoder should check if it still has some data from that stream or read more. For example, in the beginning you may get just enough block types transmitted for one row, zero motion vectors and one DC value for a block somewhere in the middle of the stream; so for the second row you refill only block types, and DC values stream will be queried only after that single value in it has been used.

And to complicate the things further, there may be yet another kind of data stored in-between: RLE block information (scan index, copy/run bit and run lengths), DCT coefficients and lossless residue data.

It’s also worth noting that while version 'b' simply stored stream data in fixed-length bit-fields, the latter versions employed static codebooks to improve compression.

Anyway, how to write such streams correctly? My original encoder supported only several block types and was able to bundle all stream data together to transmit it once for the whole plane. But since I wanted to support all possible block types, I had to devise more flexible system. So I ended up with a structure that holds all data in per-row entries so when the plane data is gathered I can easily decide how many rows of data to send and when more data needs to be sent. Additional bitstream data is also stored there (in form of (code, length) pairs) and can be written for each row that has it.

Beside that I’ve introduced a structure to all stream data for one block. This makes it easy to calculate how many bits will be used to code it, output the data (just append its contents to the corresponding entries in the plane data) and even reconstruct the block from it (except for DCT-based ones). Actually I use three instances of it (one for the best block coding candidate, one for the current block coding and one for trying block coding variants if the mode permits it) but that’s a minor detail.

In either case, even if this approach to coding is not unique (TrueMotion codecs employed it as well, to give one example), it’s distinct enough from the other variants I’ve seen and was still quite fun to implement.

Bink encoding: format and encoder designs

Friday, October 6th, 2023

Here I’m going to give a brief review of how Bink formats (container, video and audio codecs) are designed and how it affected the overall encoder design in NihAV.
(more…)

Documenting failure(s) of a Bink encoder

Thursday, October 5th, 2023

Since I really had nothing better to do, I decided to try writing Bink-b encoder. Again. The previous version was a simple program to compress image sequence into a somewhat working video file, the new version supports all possible features: motion compensation, all block types including DCT-based ones, quality- and bitrate-oriented compression modes and even audio.

I worked on the encoders not because I have any real need but rather in order to both understand better how Bink codecs work and what tricks can be employed in the decoder. Bink video supports different block coding modes, from rather ordinary DCT to RLE using a custom scan order and special mode for coding motion residues. Bink audio is a simple perceptual codec that should still give me some possibilities for experimentation (of course I remember how I failed to write AAC encoder, but maybe with such simple format the outcome will be not too horrible?).

In the following posts I’ll try to describe how some things in Bink format work, what difficulties I had to overcome and what the results are.

Eventually I want to test my implementation against the original decoder (by creating a custom heroes3.vid or video.vid for HoMM3). Also it is rather easy to enhance my work to produce latter-version Bink videos but I’m not sure if it’s worth doing that as there’s a good free encoder for it already (probably a much better one than I can ever create) and Bink format is not a good candidate for normal videos as it was created for game cutscenes with the only options to keep or end playing, no option for rewinding back at random place to repeat scene you missed (or skip some boring part).

In either case, it’s still a nice experience. After all, NihAV is about experimenting and learning how stuff works and Bink offered a lot of new things to try.

NihAV: adding SGA support

Saturday, September 2nd, 2023

Since I had nothing better to do this week I decided to finally add Digital Pictures SGA decoding support to NihAV. While there are many different formats described in The Wiki, I’ve decided to play only those not described there (namely $81/$8A, $85, $86 and $89).

In my previous post on this matter I mentioned that the formats I took interest in are using 8×8 tiles that may be subdivided into 8×4 or 4×4 parts and filled with several colours using a predefined pattern (or an arbitrary one for 8×8 tile if requested) plus some bits to select one of two possible colours for each tile pixel. The main difference between $81/$8A scheme and the others is that it codes all data in the same bitstream while the later versions split colours and opcode+pattern bits into two separate partitions (maybe they had plans for compressing it?) plus they store audio data inside the frame.

And here are some notes on the games (I think most of those are PC or Macintosh ports but it’s possible the same files were used in console versions of some of those games as well):

  • Double Switch—this one uses $81 compression (in still images, cutscenes embed them along with $A2 audio in $F1 chunks);
  • Quarterback Attack—this one uses $8A compression in $F9 chunks;
  • Night Trap$85 compression and megafiles (i.e. almost all cutscenes are stored in single NTMOVIE file that require some external index to access them). Also the PC release had a short documentary about the moral panic around that game (in the same format of course; in two resolutions even);
  • Corpse Killer$86 compression and one megafile for all cutscenes;
  • Supreme Warrior$89 compression, one megafile and no frame dimensions given. For most of the cutscenes it’s 256×160 but at the end (logo and maybe something else) it’s different. Additionally there are two audio tracks: some audio chunks contain twice as much data (and have high bit of size set), in that case the first half corresponds to English speech and the second half is Chinese; otherwise it’s the same for both versions (e.g. background music, fighters grunting, sound effects and so on).

Overall, it was an interesting experience even if I don’t care about the games themselves.

Looking at Tex Murphy video formats

Wednesday, August 30th, 2023

Since I have nothing better to do, I decided to look at the formats used in various Tex Murphy games.

So here they are (the game from this millennium does not count):

  • Mean Streets—it uses raw images in ILBM-inspired format (so that even some chunk names are still the same) grouped with e.g. palette files and packed in ZIP using ancient compression methods (shrink and reduce, I wonder if anything but ancient PKZIP.EXE can unpack them all);
  • Martian Memorandum—the only VID file present looks like mostly raw image data interleaved with some metadata to tell which regions to update (though judging from ScummVM sources it may also employ RLE for sprites);
  • Under a Killing Moon—this one has new format starting with 32-bit size followed by signature “PTF” and after 64-byte header I see the suspiciously familiar chunk structure: 32-bit size, 16-bit signature 0xF1FA that contain some sub-chunks of its own… I refer you to The Multimedia Mike, he should be able to recognize the format;
  • The Pandora Directive—this one uses format that starts with “H2O” and seems to be Huffman-compressed RLE;
  • Tex Murphy: Overseer—this one decided not to be creative and used Smacker files (along with some FLICs).

So, despite its nine-year history (again, only the games from last millennium) those games have changed the format used for video segments yet they have not moved further than RLE.

The original CfL codec

Thursday, August 17th, 2023

As most of you don’t know and don’t care, modern advanced video codecs may use the special prediction mode called “chroma from luma” where, as it’s obvious from the name, the chroma components are reconstructed from the luma using some coefficients. And what do you know, I’ve found a codec that used this approach back in 1997.

So there’s a French company called Kalisto Entertainment and back in the day it developed a codec for the cutscenes in some of its games (at least Dark Earth and Nightmare Creatures). 15-bit RGB video is split into three components and each is coded separately using simple LZ77-like method (i.e. it’s either RLE mode, or copy with an offset from the current or previous frame). The twist comes from the fact that those components are split into tiles (usually 20×20 ones) and each tile has coding mode and two sets of scale/offset coefficients, so for each tile one of RGB components is selected as the base one and two others are coded as the differences from the scaled (and offset) base value.

So one component plane contains the base components for each tile (which may be different for each tile) and the other two contain the differences for the predicted non-base components (which, at least in theory, should be mostly zeroes and thus better compressible). So when some people wonder if it’s time for video codecs to perform optimal component decorrelation on per-frame basis, here’s a practical codec from the last century that did it per-tile.

Looking at Digital Pictures video format(s)

Monday, June 26th, 2023

Since OSQ format requires obtaining a copy of some expensive software of unknown old version, I’ll leave it to somebody else. Meanwhile I’ve looked closer at the AVC format mentioned in the previous post and its relatives.

For those of you who don’t recognize the name immediately, Digital Pictures is the company responsible for some FMV-based action games, including the infamous Night Trap. As I rediscovered previously, the AVC files they use are really SGA files with compression method 0x81 but what about the other formats?

About half of the games I could look at contain an archive occupying most of the CD space, inside that archive are the same AVC files. Other half of the games usually has one or two AVC files with a company logo and one megamovie in various formats. And after some research it turned out to be the same 0x81 compression format but with audio data and varying headers.

And since nobody bothered to document it for The Wiki, I’ll explain it here.
(more…)

Looking at even more game formats

Friday, June 23rd, 2023

Since I have nothing better to do as usual, I decided to look at some game formats.

For instance, there’s a game called The Fuel Run promoting a product from a Swiss company supporting russian war crimes. This game has animations in VDO format. VDO file header starts with the string “Artform” and it employs RLE compression, each line of the frame prefixed with its size. What’s funny is that its RLE uses only bottom 6 bits for the run length and top bit is completely unused. For obvious reasons the format is not worthy documenting further.

Or there’s a game called Double Switch, this one has AVC videos. It uses RGB555 palette and 8×8 tiles that depending on opcode may be either coded with one of 25 predefined patterns and 1-7 colours or split into tiles of the smaller size. And only afterwards I decided to look into The Wiki and it seems to match SGA format description (except that this particular format variant is not documented). I don’t know if I should bother writing a decoder for it but with the lack of PC codecs to RE I might try my hoof at console ones instead.