QfG5: room image formats

November 23rd, 2023

Each room has several formats associated with it. Some of them are binary formats describing various objects, but here I’ll describe the following formats:

  • IMG
  • NOD
  • ZZZ
  • FTR
  • ROM
  • GRA

GRA format is a more general format used for animated sprites, some window backgrounds and GUI elements as well as talking character portraits.

NOD

This has nothing to do with C&C; it is a palette format for the room backgrounds. It starts with a QFG5 magic and a 32-bit file size, but in reality the game engine simply reads a 1024-byte RGBA palette starting at offset 0xA8 and that’s it.
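
Reading the palette the way the engine apparently does can be sketched in a few lines of Python. The function name is made up for illustration, and the channel order inside each 4-byte entry is simply taken to be R, G, B, A as the description says.

```python
NOD_PALETTE_OFFSET = 0xA8
NOD_PALETTE_SIZE = 1024  # 256 entries, 4 bytes each


def read_nod_palette(data: bytes) -> list[tuple[int, ...]]:
    """Return 256 four-byte palette entries from raw NOD file contents.

    The QFG5 magic and the size field are ignored, just like the
    engine seems to ignore them.
    """
    end = NOD_PALETTE_OFFSET + NOD_PALETTE_SIZE
    if len(data) < end:
        raise ValueError("NOD file is too short to hold the palette")
    raw = data[NOD_PALETTE_OFFSET:end]
    # Assumption: the bytes of each entry are stored as R, G, B, A.
    return [tuple(raw[i:i + 4]) for i in range(0, NOD_PALETTE_SIZE, 4)]
```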

IMG

This is actual room background data. It starts with a header consisting of two 32-bit big-endian words and two 64-bit big-endian floating-point numbers and then the same header repeated in little-endian format. The parameters are height, width, two parameters probably related to the room positioning (probably an offset and full intended room perimeter—the rooms can be circular like the Silmaria main square) and two unknown floating-point numbers.

Then the RLE-compressed image follows, coded in columns starting from the right edge and with its height doubled (so e.g. a 4000×400 image will be decoded as an 800×4000 image with the left edge of the image at the bottom). The lower part of the image is not used (maybe a leftover from when it was used for a depth buffer). The actual data is 8-bit indices into the NOD palette with the same resource number.

RLE works in the following way: for each decoded line (or rather column) a signed 8-bit value is read; zero signals the end of line, negative values signal raw data (e.g. 0xFE or -2 means copy two following bytes to the output), positive values mean repeating the next byte the specified number of times.
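
A minimal Python sketch of this scheme for a single line (or column) is below. The function name and the decision to return the new read position are my own choices for illustration, not something taken from the game.

```python
def decode_rle_line(data: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Decode one RLE line starting at pos.

    Returns the decoded bytes and the position right after the
    end-of-line marker.
    """
    out = bytearray()
    while True:
        if pos >= len(data):
            raise ValueError("RLE data ended before the end-of-line marker")
        ctrl = data[pos]
        pos += 1
        if ctrl == 0:
            # Zero signals the end of the line.
            return bytes(out), pos
        if ctrl >= 0x80:
            # Negative signed byte: copy -ctrl raw bytes (0xFE means two).
            count = 0x100 - ctrl
            chunk = data[pos:pos + count]
            if len(chunk) != count:
                raise ValueError("truncated raw run")
            out += chunk
            pos += count
        else:
            # Positive value: repeat the next byte ctrl times.
            if pos >= len(data):
                raise ValueError("truncated repeat run")
            out += bytes([data[pos]]) * ctrl
            pos += 1
```

For example, the bytes FE 01 02 03 07 00 decode to 01 02 07 07 07: two raw bytes, then the value 7 repeated three times, then the end-of-line marker.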

ZZZ

This is a room depth map. This file contains RLE-compressed depth values (0 = closest to the screen, 255 = farthest) without any header. The data is compressed in the same way as a single line of the IMG format and has the same dimensions, but it is stored line by line, right edge first (i.e. mirrored compared to the background image).

FTR

This is a format defining room regions. It consists of signed 16-bit numbers. First number declares the number of regions in the file, then region data follows consisting of 4-word header (region depth maybe, always zero, an exit portal flag, and the number of points) and the region points (two 16-bit words each).
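
The layout above can be walked with a short Python sketch. Two things here are assumptions and not stated above: the words are read as little-endian, and the two words of a point are treated as a coordinate pair. The function and field names are made up for illustration.

```python
import struct


def parse_ftr(data: bytes) -> list[dict]:
    """Parse FTR region data into a list of region dictionaries."""
    # Assumption: all 16-bit words are little-endian.
    (region_count,) = struct.unpack_from("<h", data, 0)
    pos = 2
    regions = []
    for _ in range(region_count):
        # 4-word region header: depth (maybe), always zero,
        # exit portal flag, number of points.
        depth, zero, exit_flag, npoints = struct.unpack_from("<4h", data, pos)
        pos += 8
        if npoints < 0:
            raise ValueError("negative point count")
        points = []
        for _ in range(npoints):
            # Assumption: each point is a pair of coordinates.
            points.append(struct.unpack_from("<2h", data, pos))
            pos += 4
        regions.append({
            "depth": depth,
            "reserved": zero,
            "exit_portal": exit_flag,
            "points": points,
        })
    return regions
```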

And here is an example of decoded room images.

Arcane Island background


Arcane Island depth map (and an Easter egg)


Arcane Island region map

ROM

This is a room properties file consisting of two integers and two floats: the first field (32-bit integer) seems to be a big-endian version of the third field, the second field (32-bit float) is unknown, the third field (32-bit integer) declares the number of additional resources that should be loaded for the room (e.g. the battle arena has three alternative views), the fourth field (32-bit float) seems to specify an angle increase (for circular rooms, I suppose). In either case the floating-point numbers seem to be non-zero only for room 200 (Silmaria main square).
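
Since the file is just four fixed fields, reading it is a pair of struct calls in Python. The names below are hypothetical, and the byte order of the floats and of the third field is assumed to be little-endian (only the first field is described as big-endian).

```python
import struct


def parse_rom(data: bytes) -> dict:
    """Split a 16-byte ROM file into its four fields."""
    if len(data) < 16:
        raise ValueError("ROM file is too short")
    # First field: apparently the third field stored big-endian.
    (count_be,) = struct.unpack_from(">i", data, 0)
    # Assumption: the remaining three fields are little-endian.
    unknown, extra_resources, angle_step = struct.unpack_from("<fif", data, 4)
    return {
        "count_be": count_be,
        "unknown": unknown,
        "extra_resources": extra_resources,
        "angle_step": angle_step,
    }
```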

GRA

This is a format used for animated sprites and static images in the various parts of the game.

The file starts with a 32-bit image coding mode and a 32-bit number of sprite collections, then a 512-byte palette (in RGB555 format) followed by the 32-bit offsets to the sprite collection data.
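
A rough Python sketch of reading this header follows. It assumes little-endian fields, one offset per sprite collection, and the usual RGB555 bit layout with red in the top five bits; none of that is confirmed above, and the function names are made up.

```python
import struct


def rgb555_to_rgb888(value: int) -> tuple[int, int, int]:
    """Expand a 15-bit colour to 8 bits per channel."""
    # Assumption: red occupies bits 10-14, green 5-9, blue 0-4.
    r = (value >> 10) & 0x1F
    g = (value >> 5) & 0x1F
    b = value & 0x1F
    # Replicate the top bits so that 31 maps to 255.
    return tuple((c << 3) | (c >> 2) for c in (r, g, b))


def parse_gra_header(data: bytes) -> dict:
    """Read the coding mode, the palette and the collection offsets."""
    mode, collection_count = struct.unpack_from("<2I", data, 0)
    raw_palette = struct.unpack_from("<256H", data, 8)
    offsets = struct.unpack_from(f"<{collection_count}I", data, 8 + 512)
    return {
        "mode": mode,
        "palette": [rgb555_to_rgb888(v) for v in raw_palette],
        "collection_offsets": list(offsets),
    }
```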

Each sprite collection starts with the header containing the following 32-bit values: horizontal sprite position, vertical sprite position, width, height, number of sprites in the collection, delay between frames and flags. It is followed by the offsets (from the sprite collection data start) to the individual frames. Depending on image coding mode frames can be stored in the following way:

  1. mode 0—unpacked image data;
  2. mode 1—unpacked image data interleaved with depth values (i.e. in the following order: palette index byte, depth byte, palette index byte, depth byte and so on);
  3. mode 2—the same RLE compression as in IMG format;
  4. mode 3—the same as the previous mode but index 0xFF is used for the transparent pixel;
  5. mode 4—similar to the previous mode but value 0xFF signals that the original background pixel should be restored instead.
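
To make the collection layout a bit more concrete, here is a Python sketch that reads one collection header and splits a mode 1 frame into its two planes. It assumes little-endian signed fields and one frame offset per sprite; the names are hypothetical.

```python
import struct


def parse_collection(data: bytes, start: int) -> dict:
    """Read a sprite collection header located at offset start."""
    # Seven 32-bit values; assumed little-endian and signed.
    x, y, width, height, count, delay, flags = struct.unpack_from("<7i", data, start)
    if count < 0:
        raise ValueError("negative sprite count")
    rel = struct.unpack_from(f"<{count}I", data, start + 28)
    return {
        "x": x, "y": y, "width": width, "height": height,
        "delay": delay, "flags": flags,
        # Frame offsets are relative to the collection start.
        "frame_offsets": [start + off for off in rel],
    }


def split_mode1(frame: bytes) -> tuple[bytes, bytes]:
    """Separate a mode 1 frame into palette indices and depth values."""
    # Even positions hold palette indices, odd positions hold depth.
    return frame[0::2], frame[1::2]
```

Modes 2 to 4 would feed the frame data through the same RLE line decoding as IMG before treating 0xFF specially.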

And here’s an example of a sprite from the same scene (in the same orientation as actual scene background):

I’ll probably try to cover the messages and speech next (font formats, message files and lipsync) but it may take more time. And then only 3D data will be left for figuring out. It’s a pity I don’t know much about 3D though…

QfG5: resource formats

November 23rd, 2023

There are sixteen formats known:

  • 0 – MDL (model format);
  • 1 – ANM (model animation);
  • 2 – ROM (room parameters, always two integers and two floats);
  • 3 – NOD (room background palette);
  • 4 – IMG (room background image);
  • 5 – ZZZ (depth buffer for room background);
  • 6 – GRA (sprites);
  • 7 – QGM (message format);
  • 8 – FTR (some room data);
  • 9 – WAV (effects and music);
  • 10 – RGD (region data);
  • 11 – MOV (intro and cutscenes);
  • 12 – QGF (font files);
  • 13 – STR (some room data);
  • 14 – AUD (speech audio, still in WAV format);
  • 15 – SNC (lipsync data to accompany speech).

I’ll try to document all these formats (except for AUD, MOV and WAV).

QfG: SPK format

November 22nd, 2023

SPK is the archive format used for storing most of the game resources. There are four known archives: CDA.SPK which seems to contain speech (and lip synchronisation data), CDN.SPK probably with scene-specific data, HDN.SPK contains game music plus world map and some other files, HDNW.SPK contains mostly 3D models and animations (plus some QGF files).


QfG5 engine: a brief overview

November 21st, 2023

So while I have no real breakthroughs, here is some information about the engine in general.

Quest for Glory V seems to be a 2.5D engine (some 3D objects interacting essentially in circular rooms using static background images) rendering output in 15-bit RGB (while using floats internally and paletted assets). The engine logic is hardcoded, partly inside the executable, partly inside dynamically loaded room modules (in Windows or Mac native format). From what I’ve already seen, there are several global objects responsible for various parts of game engine, including the main engine class with around 160 callable methods; room class takes a pointer to it and provides around ten methods that can be invoked by the engine (and which in their turn may invoke engine methods for various actions). So it’s about a megabyte of engine code and over four megabytes of code in room modules. At least I don’t need to decompile them all at once.

The game data is organised in stand-alone files and SPK archives. Cutscenes are Cinepak in MOV (though IIRC my official pirate copy re-compressed them to use SVQ1 instead), audio is MS ADPCM in WAV. Overall there are fifteen resource types known (audio, lipsync data, text messages, panorama background and its palette, 3D models and their textures, GUI decals and so on). Most of the files are contained in SPK archives, which are essentially slightly hacked ZIP archives: the initial header is replaced with a custom index per resource type and each PK entry has its numbers changed so it won’t be detected as a ZIP archive without the header (at least there’s no compression employed as far as I can tell).

So overall I just need to discover how the main engine loop works, what are all those file formats and reconstruct the room logic. That should take a long time (supposing I don’t give up earlier). Either way I’m doing it mostly to pass time and to find out how far I can get REing something more complex than a codec.

A new project

November 21st, 2023

As I mentioned previously, NihAV is feature complete so while I may still return to improve it, it is not likely that there will be much work to be done there. That is why I decided to try my hoof at something different: trying to reverse engineer (and maybe re-create) a game engine. I’ve chosen Quest for Glory V for this task as I still have interest in that game and fine folks from ScummVM are not likely to work on it (as it’s a hardcoded engine, more about it later). I fully understand that I may fail as my knowledge about game engines is about zero and having about five megabytes of code to decompile may be too much so I may give up out of boredom as well. In either case it’s not a big loss.

I’m not sure if I’ll ever release the final code (if there is any final code) but at least I will try to document the formats and inner engine workings for posterity. Otherwise it will be about as bad as with the modding community: I’ve seen people making better 3D models for the game and even patching the game logic, but there is almost no public information about the formats (except for the trivial ones) let alone about something more serious. I know Mike suggested Xentax Wiki once as a better place for such work but I can’t find it so I’ll just keep blogging here.

Money and Multimedia

November 14th, 2023

Inspired by recent events.

It is no secret that sometimes (or rather often, I’d say) political and business considerations prevail over technical ones. The persistent rumour said that MP3 format was not so bad originally but during the standardisation phase it had been changed to contain QMF in addition to MDCT because a certain company still held a patent on it. We have a couple of video codec standards developed not for any technical merit but rather for trying to create patent-free formats (and failing at that). We see how many modern formats (not just audio or video, but streaming protocols as well) are essentially “one of everything” because each company tries to put its own technology there (probably for patent considerations), and then even more companies appear with a claim to own a patent on the same technology (some of them form a patent pool, some act on their own). And of course we see Nokia (not the dead phone company and not the tyre producing one either) trying to become the SCO of this decade.

You know, the modern patent system was formed with the intent of sustaining development of new inventions: an inventor brings benefit to society with new inventions, society repays by granting that inventor a protection on exclusive rights for those inventions allowing to get profit from them. In theory a mutually beneficial scheme but people always find a way to game the system and here we are. IMO the best patch to the legal system would be to strip those abusing their rights of that right, be it copyright (material part), industrial property rights or anything else. But as an optimist I expect the legions of lawyers to find a workaround for it rather fast.

Anyway, I wanted to demonstrate how political and financial interests spoiled an already undead (I’ll elaborate below why I think so) project. And how a certain Frenchman paved a road with good intentions there. Of course I’m talking about FFmpeg (or jbmpeg as I name it after the current most influential person).

NihAV: nothing left to do

November 11th, 2023

If anybody read my previous posts, he might’ve picked a notion about me complaining that there’s nothing left to do for NihAV and it is really a problem I have.

Since the (re)start of the project in 2017 it grew from a small package that could only read bits and bytes to a collection of crates supporting various multimedia formats and a set of tools to use them. I had two principal goals: to experiment with the framework design and learn how various multimedia concepts are implemented and also (ideally) make an independent converter and player so I don’t have to rely on the external projects for most of my multimedia needs.

A look at Winnov WINX

November 3rd, 2023

It is really a coincidence that about a week after I looked at their Pyramid codec I got reminded that another codec of theirs exists, probably related to the WNV1 codec I REd back in 2005.

So apparently the codec codes YUY2 in 8×8 blocks. Each block is prefixed with a bit telling whether it’s a coded or skipped block. Coded blocks have an additional 4-bit mode that seems to determine which quantisation they’ll use. The data is packed as deltas to the previously decoded value (per-component) using a static codebook with values in the -7..7 range (plus scaling by shifting left). There’s also an escape value in case a raw value should be read instead. Overall it feels like Winnov Video 1 coding.

In other words, nothing remarkable but still a bit more advanced than usual DPCM-based intermediate codecs.

A look at Winnov Pyramid codec

October 27th, 2023

Since I still have nothing better to do, I decided to take a look at some old codec. Apparently I tried looking at it before and abandoned it because Ghidra cannot disassemble its code properly let alone decompile. I think this is a recurring theme with the old 16-bit code, especially the one reading data using non-standard segments.

So I located Sourcer, the best disassembler of the era (that seems to be abandonware nowadays but I cannot swear on that) and used it to disassemble the binary, referring to Ghidra database to locate the functions I should care about. It is not that much fun to translate assembly by hand but at least there was not that much of it.

The codec itself turned out to be a moderately complex DPCM codec compressing 7-bit YUV 4:1:1 data using per-frame codebook and not so trivial delta compression. Codebooks contain pair of delta values calculated depending on number of bits per delta. The data is coded per plane with prediction running continuously for all pixels in the plane:

  // before decoding data
  (delta0, delta1) = get_code();
  pprev = 64;
  prev = 64 + delta0;
  pdelta = delta1;
  for each pixel pair {
    (delta0, delta1) = get_code();
    delta = ((prev + delta0 - pprev) >> 3) + pdelta;
    pix0 = clip_uint8((prev + delta) * 2);
    pix1 = clip_uint8((prev - delta) * 2);
    pprev = prev;
    prev += delta0;
    pdelta = delta1;
  }
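
For those who prefer something runnable, the same loop can be written in Python roughly as follows. The codebook lookup is replaced by an iterable of already decoded (delta0, delta1) pairs, which is my simplification; the function names are made up as well.

```python
def clip_uint8(value: int) -> int:
    """Clamp a value to the 0..255 range."""
    return max(0, min(255, value))


def decode_plane(codes, pair_count: int) -> list[int]:
    """Reconstruct pair_count pixel pairs from (delta0, delta1) codes.

    codes must yield pair_count + 1 pairs: one to prime the predictor
    and one per pixel pair.
    """
    it = iter(codes)
    delta0, delta1 = next(it)
    pprev = 64
    prev = 64 + delta0
    pdelta = delta1
    pixels = []
    for _ in range(pair_count):
        delta0, delta1 = next(it)
        # Python's >> floors negative values like an arithmetic shift.
        delta = ((prev + delta0 - pprev) >> 3) + pdelta
        # Samples are 7-bit internally, hence the doubling on output.
        pixels.append(clip_uint8((prev + delta) * 2))
        pixels.append(clip_uint8((prev - delta) * 2))
        pprev = prev
        prev += delta0
        pdelta = delta1
    return pixels
```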

Normally such codecs would not bother to generate a codebook for the specific delta size or use something more complex than pix = prev + delta; so this was a rather interesting codec to look at. Hopefully there will be more of interesting formats to study even if sometimes I get the feeling that all undiscovered formats are either trivial or rip-offs of some standard.

Looking at Motion Pixels

October 24th, 2023

There is this very Sirius (or Sirius Publishing, more precisely) family of video codecs (plus one container format) apparently developed by two guys (who like to spam their name even in junk sections of AVI files). Also initially it had its own container format but later they’ve started to target AVI.

Another peculiarity of this format is that initially it targeted games but later was also used as a crappy Video CD alternative.

Back in the day Gregory Montoir REd the original game format for one of the game engine re-implementations he’s famous for and donated the code to FFmpeg as well. Since that time I was curious whether that code can be adapted to play MVI1 and MVI2 as well but the codec itself turned me off.

The codec itself is perverted, both in code and interface. Also it’s inherently interlaced. Normally video codecs in AVI can be recognized by their FOURCC and pass additional configuration parameters in the additional header data. Here they decided to use half of FOURCC to pass configuration flags to the codec and use stream handler FOURCC (that most apps ignore) to tell their decoder should be used to handle it. This alone would make me want to not support it ever, but the binary specification is worse.

Looks like the code consists mostly of handwritten assembly because I don’t know which compiler may generate this madness. There are many versions of the codecs, most of them are 16-bit and the 32-bit version is no better. For starters, it uses segments.

Not so many people remember DOS times and its memory models, even fewer remember them fondly. And almost nobody remembers that in 32-bit mode you can also use FS and GS registers to have custom addressing modes. Well, this codec uses them: it sets FS to the context pointer so context fields are accessed as mov EAX, dword ptr[1A8h] while global variables are accessed as mov EAX, dword ptr GS:[SYM] and of course no decompiler likes that. I was able to work around it in Ghidra by creating a new segment starting from zero but it’s still annoying.

Another thing is (ab)using registers to the full extent. Functions pass their parameters implicitly in the registers, using the stack only to save those values before a loop or to form a list of rectangles to process. And of course it uses this annoying (for the decompiler) feature of using the same register for two loop counters (e.g. top byte for the outer loop and low byte for the inner loop). As a result Ghidra can’t decompile it properly or even ignores whole blocks of the code because to its belief they can’t be invoked (and it’s still better than decompiling the 16-bit version of MVI1 which made the decompiler commit suicide). So some functions are easier to hand-translate from the assembly.

In either case looks like despite all the improvements it remains about the same as the initial version: data is coded as 5-bit YUV internally and stored using Huffman codes, quantisation and change maps (rectangles that tell which areas to update/fill). MVI2 can use ten different frame decoding modes that differ in how the deltas are coded but essentially it remains the same. They have not even gotten to introducing a proper motion compensation it seems.

So, now I’ve had a good long look at the codec, found nothing interesting there that was not known before and can forget about it. If only there was something more interesting to look at…