A look at more formats

April 16th, 2024

As I mentioned in a recent post, I’ve tried using discmaster.textfiles.com to search for more exotic multimedia formats. Here’s a short report on the more interesting formats I found.

I mostly looked at the formats listed as video that could not be decoded, plus audio-only AVIs: some of them are really audio-only, others feature a video stream that was not recognized.

So, what I’ve found:

  • DK Animation: this turned out to be a simple RLE-based animation+sound format used in some interactive encyclopedias. It was rather easy to figure out the format from the samples, while the executables were rather useless (due to the program design it’s next to impossible to locate the code responsible for animation handling without decompiling all of it);
  • PI-Video (used in a different set of interactive encyclopedias) turned out to be a simple quadtree-based codec (the frame is divided into square tiles, each tile can be skipped, filled with one colour, subdivided further or, in the case of a 4×4 tile, filled with raw image data). Additionally pixel values may be further compressed with LZW. That proved to be the most interesting format out of the bunch;
  • there were a bunch of RIFF and IFF-based formats, often without a known decoder. Maybe I’ll look at them one day when I feel really desperate, but not today;
  • ESCP codec is a variation of Escape 130. After I changed FOURCC to the recognized E130 the file was somewhat decoded: there were countless decoding errors, visual garbage yet it produced almost perfect complex parts of the frame as well. I suspect it may have e.g. an additional field or two or some small bitstream tweaks;
  • and a special mention to tmot FOURCC which of course turned out to be TrueMotion 1 video.
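
The quadtree scheme described for PI-Video can be sketched roughly as below. This is only an illustration of the idea: the opcode values, the token stream and the function name are all made up, and the real bitstream layout (as well as the optional LZW stage) is not covered.

```python
# Hypothetical opcodes; the real PI-Video bitstream encoding is not described here.
SKIP, FILL, SPLIT, RAW = range(4)


def decode_tile(ops, frame, x, y, size):
    """Decode one square tile from an iterator of tokens into frame (a list of rows)."""
    op = next(ops)
    if op == SKIP:
        # Tile is unchanged: keep whatever the previous frame left there.
        return
    if op == FILL:
        colour = next(ops)
        for row in range(y, y + size):
            for col in range(x, x + size):
                frame[row][col] = colour
    elif op == SPLIT:
        # Recurse into the four quadrants in raster order.
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                decode_tile(ops, frame, x + dx, y + dy, half)
    elif op == RAW:
        # Raw pixels are allowed only for the smallest 4x4 tiles.
        if size != 4:
            raise ValueError("raw data is only valid for 4x4 tiles")
        for row in range(y, y + 4):
            for col in range(x, x + 4):
                frame[row][col] = next(ops)
    else:
        raise ValueError(f"unknown opcode {op}")
```

For example, an 8×8 tile could be split once, with its four 4×4 quadrants being filled, skipped, stored raw and filled again.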

It’s random finds like this that make life a bit less dull.

A subjective look on game industry

April 12th, 2024

It is hard to say something about russia that I haven’t said before, especially without resorting to expletives. Similarly it’s hard to say something new about the countries that care about the terrorist state’s well-being more than about their own reputation and safety. So here’s a random rant on a completely different topic which I’ve been wanting to write for a long time.

Those who know me are also aware of the fact that I prefer playing adventure games (and sometimes strategies) from 1990s but not something from 2000s or newer. That does not mean though that I’m not aware of more recent developments (my other hobby is knowing random information I have no use for, after all). So here I’d like to present my rather obvious views on the gaming industry and why Sturgeon’s law is too optimistic for this domain.

A quick look at Gold Disk Animation

April 10th, 2024

Since I’m still looking for a thing to reverse engineer, I decided to see if this file service at discmaster.textfiles.com could offer some exotic formats. And indeed it can.

So there’s this AWI or AWM file format (it’s called AWI in the decoder libraries but the files I could find have extension .awm).

So this is more of a presentation format with a nested structure. Chunks with names in capital letters contain other chunks: everything is contained inside the GDAW chunk, actual assets like PALT or BKGD are stored inside the RSRC chunk, and the presentation scenario is probably stored in the SEEN chunk. Chunks with lowercase names have various specific data attached to them (e.g. psnm is followed by a Pascal-style string with the asset name, tzim contains compressed image data and nndn marks the end of object data).

I have not looked too deep into it (no idea how the scenario works or what are the various object parameters) but here’s some information about resource types:

  • RLE4—a 16-colour RLE-compressed BMP, I presume;
  • RLE8—ditto but with 256 colours;
  • PALT—some global palette (but images still have their own);
  • BKGD—DCL-compressed background BMP;
  • ACTR—DCL-compressed BMP used as sprite;
  • WIPE—transition effect definition;
  • SWND—DCL-compressed WAV.

The most curious thing for me is that it uses the PKWARE Data Compression Library to compress data. And while WAV files are compressed in one piece, BMPs are compressed as separate chunks: 14-byte BMP header, 40-byte DIB header, palette, and image data. I think this was a conscious decision by the format and tool designers (in order to improve the compression ratio a bit).
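
To make the four-part split concrete, here is a minimal sketch that cuts an ordinary BMP file into the same pieces. It assumes a plain BMP with a 40-byte BITMAPINFOHEADER; the function name is made up, and the DCL compression step itself is left out since the Python standard library has no implementation of it.

```python
import struct


def split_bmp(data: bytes):
    """Split a BMP into (file header, DIB header, palette, pixel data)."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    file_header = data[:14]
    dib_header = data[14:54]
    # The first field of the DIB header is its own size.
    (dib_size,) = struct.unpack_from("<I", dib_header, 0)
    if dib_size != 40:
        raise ValueError("only the 40-byte BITMAPINFOHEADER is handled")
    # Offset 10 of the file header holds the offset of the pixel data.
    (pixel_offset,) = struct.unpack_from("<I", file_header, 10)
    palette = data[54:pixel_offset]
    pixels = data[pixel_offset:]
    return file_header, dib_header, palette, pixels
```

Each returned part could then be fed to a compressor on its own, which mirrors the layout described above.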

I’ll probably try to dig some more details and document it but the most interesting part for me (i.e. figuring out its outstanding design features) is done already.

A look at compressed game video formats

March 22nd, 2024

In order to distract myself from the thoughts about why most politicians have their heads so deep in their asses that French ones out of all people seem to be the bravest, and what I can do to help destroy russia besides regular donations, here’s a post about completely unrelated REing work.

Since I had nothing better to do, I looked at the format used in the game Azrael’s Tear and re-visited the game Talisman. And while one of them is a 3D game from a British developer and the other one is a 2D game from a German one, the cutscene formats used there have more in common than one would suspect (and their design distinguishes them from the majority of the formats).

It is common for various game formats to represent frames as small blocks with motion compensation or raw data, but I can’t remember any other such format that would use data compression on the container level instead of it being a part of e.g. video data compression (of course there’s Matroska that implements such a feature, but besides it I can’t think of any format doing that). In both of these formats (with the not so creative extensions ANI and MOV) static Huffman compression is employed to compress several chunks of different data types. In ANI (used in Talisman) data is split into groups of frames that start with a chunk defining which one of the predefined Huffman trees should be used to pack them all and to what symbols the codes should be assigned. In MOV (used in the other game, naturally) audio and video frame blocks are grouped into larger chunks and those chunks may be optionally compressed using a static Huffman tree transmitted at the beginning of the chunk payload.

The ANI format features another peculiarity: there are other codecs that use motion compensation plus rotation (like formats from Cryo Interactive IIRC) but I can’t think of any other format that performs motion compensation and replaces one of the pixels in a block with a new value. Of course it is not a revolutionary idea but I haven’t seen it implemented like that before.
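
The "motion compensation plus one replaced pixel" idea can be illustrated with a short sketch. Everything here is hypothetical (function name, argument layout, block size); it shows only the general technique, not the actual ANI bitstream or block format.

```python
def mc_block_with_patch(prev, cur, bx, by, size, mv, patch=None):
    """Copy a block from the previous frame and optionally overwrite one pixel.

    prev, cur: frames as lists of rows
    bx, by:    top-left corner of the block in the current frame
    mv:        (dx, dy) motion vector pointing into the previous frame
    patch:     optional (px, py, value) with coordinates relative to the block
    """
    dx, dy = mv
    for row in range(size):
        for col in range(size):
            cur[by + row][bx + col] = prev[by + dy + row][bx + dx + col]
    if patch is not None:
        px, py, value = patch
        # The single-pixel correction applied on top of the copied block.
        cur[by + py][bx + px] = value
```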

And that’s why I like those old game formats: they may be not the most effective ones but they contain more originality than the modern formats. The main problem is finding such a format—there are too many games released (which are also sometimes too hard to find) and the majority of them uses FLIC or Smacker anyway (or Cinepak in AVI or MOV for newer ones). But sometimes I encounter a mention of a game in some review and get lucky. I hope such finds happen more often though…

QfG5: the end

March 16th, 2024

As I hinted in my previous posts, I’ve decided to stop working on it. This post should serve as a conclusion, explaining my reasons behind it and mentioning some things about the engine I have not mentioned earlier.

The main factor is the diminishing returns with the rapidly increasing efforts required to get them. I.e. locating the code for loading and parsing different resource files was not so bad (even if a good deal of resources were not parsed immediately after loading but rather treated as some structure data and accessed during use, e.g. model or animation data at each render call). Figuring out the overall engine workflow was not so bad either even if it took more time. But things related to 3D are hard because of their non-standard nature (more about it below) and Ghidra not decompiling x87 code correctly in all cases. And the in-world object interactions are even worse as they are done in the conventional C++ object-oriented fashion (and some bits in less conventional Smalltalk object-oriented fashion) so figuring out what objects are implemented as which classes is not fun, let alone all those variable-length messages that may be sent by them (and handled by a different class).

So I looked at the amount of work before me (implementing GUI, which is easy; implementing 3D rendering stuff, which is too complicated; implementing hardcoded logic, which is too tedious—and that’s before you remember that rooms may also implement their own custom logic in DLL files) and decided that I can stop here as I’m not going to re-implement the engine in any usable form anyway.

Of course I could’ve advanced farther if not for my inability to make 3D rendering work. I mentioned in the post about room backgrounds how I did not get it exactly right because Ghidra sometimes fails to decompile x87 code, introducing variables like extraout_ST1 whose value you have to guess yourself, and sometimes outright lies by e.g. using multiplication instead of division (and the x87 code is too annoying for me to translate it by hoof). Additionally I upgraded Ghidra to 11.0 in a foolish hope that it would improve things, but that was a mistake: not only did it not produce any better decompilations for the code in question, it also made things worse by forcing alignment on function arguments which was not done before (and considering how many functions mixed integers, pointers and doubles as their arguments, I had to correct annoyingly many function prototypes to put them in order again); additionally it changed some interfaces so LX loaders do not work with it yet (it is unrelated but still complicates things for me; and that is why I’m not so eager to move to the latest version of the software I actively use). In theory by spending a bit more time on maths I could get it right but model rendering proved to be even worse.

I have next to no experience with 3D renderers, especially triangle-based (the books tend to describe ray-tracing approaches instead) yet it is not that hard even for me to understand: surface is split into triangles, each one is defined by its vertices, rotation is performed by multiplying those coordinates by rotation matrix (which is also easy to derive), then you do projection to plane and fill. I’ve managed to make a wireframe model of the object render and after messing with barycentric coordinates I could even make textures appear on some of them.

The problem is that QfG5 engine uses a different approach: normally projected coordinates would be calculated as x/(z/k+1) where k is the distance from the viewer to the projection plane. Inside the engine this +1 bit is missing (yes, I’ve checked the assembly to be sure) which gives unexpected results. What’s worse, even when I export model data into the standard Wavefront .obj format and use a third-party 3D viewer, it fails to apply the textures properly (and sometimes the polygons themselves look very wrong). So it looks like you have to use the engine code—and it hits the same decompilation problems as above (not as bad in this case but it’s still a mix of floating- and fixed-point code with many opportunities for the decompiler to lie).
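
The difference between the two projection formulas can be sketched as follows. The function names are made up for illustration, and the second function only mirrors the description above (the same division without the +1 term); it is not taken from the engine code.

```python
def project_standard(x, y, z, k):
    """Textbook perspective projection: divide by z/k + 1.

    k is the distance from the viewer to the projection plane,
    so a point lying on the plane (z == 0) is left unchanged.
    """
    d = z / k + 1.0
    return x / d, y / d


def project_without_offset(x, y, z, k):
    """The same division with the +1 term dropped, as described for the engine."""
    d = z / k  # note: undefined for z == 0
    return x / d, y / d
```

Mathematically, dropping the +1 gives the same result as the standard formula with z reduced by k, so the two variants differ in where the depth origin sits. Whether that is what the engine intends is not something this sketch can confirm.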

As for the in-world logic, it does not help that almost everything is hardcoded in the engine. For instance, item IDs are hardcoded and there’s a special table with item properties which contains e.g. sound IDs for viewing or equipping/using the item and message ID3 which tells you which message you should load from 101.QGM e.g. message ID3=18 means that if you load message 1,18,1,1 you’ll get short item description “Basket” while loading message 1,18,5,1 will return “This simple reed basket looks old, but well-cared for.”. Messages with just one different ID are used in many places, e.g. to tell you that your action will fail because of enemy presence instead of e.g. hunger. But the worst of them is a per-character (hero, NPCs or enemies alike) collection of tables, both integer and floating-point ones, that are used for affecting some actions. I did not even bother to find out what they affect exactly.

Well, that’s it. I don’t know what I’ll do next, maybe some small things for NihAV, maybe I’ll look at other Sierra engines to see if they’re more accessible, maybe I’ll do nothing for a while (there are some strategy games that make me waste a lot of time like Battle for Wesnoth, OpenTTD or Settlers II). In either case I had some fun REing the game and learned some things too, I can only hope the next thing I do will be similarly entertaining (and somewhat useful).

QfG5: RGD revisited

March 14th, 2024

This was the last format I hadn’t looked at too deeply. But as I’m tying up loose ends, I decided to revisit it and see what I can get out of it.

As expected, RGD defines the 3D data for the room. I have not figured out all the additional values stored there (and I’m not sure that I got the proper regions marked either) but at least the basic understanding proved to be correct.

So, here’s what I got:

Arcane Island background

Arcane Island RGD

As you can see (especially if you click on the picture to see it in full glory), it is a flat map for the room (or rather like in Doom and Doom 2, a 3D surface consisting of differently raised polygons, pity I haven’t figured out which property is responsible for the heights). Essentially each region consists of a set of edges of different types (I marked them with different colours, you should be able to spot the blue edges around the area and some green polygons around the walkway and columns outlines). There is also a special list of marked regions (yellow ones on the picture but I’m not so sure about that) that are supposed to be walkable and there are even two connectivity matrices for them (for the record, on this particular map there are about nine hundred regions in total and only fifteen are marked as walkable).

From what I saw in the code, this map is used in the game to translate mouse coordinates into destination region, plot path to it, check the enemy line of sight (and projectiles) and so on.

The fun thing is that the format is not as compact as the others (as I described it before, it’s a mess of various lists and references to other lists and so on) and so some data is left unread. In some cases it is merely 4-8 bytes of padding between different blocks, in other cases it’s way more. For example, in 251.RGD you can see a piece of text “ctile Only. Region 448 contains entry point(s)” plus some other words here and there.

And as I hinted in my previous posts, that’s about it. I intend to write the final post explaining my problems with 3D rendering and why I’m giving up on it. At least I learned a lot more about my favourite game and had some fun while doing it, so it is not time completely wasted.

QfG: formulae

March 12th, 2024

Since there is not much progress with REing the engine, here I’d like to at least document some formulae from the gameplay mechanics. I realize that most of them are probably well-known already but maybe I’ll cover at least a couple of new ones.

QfG5: some notes about the design

February 21st, 2024

Since I have not progressed far in the recent weeks (the current events put me out of mood) I can at least document some bits before I forget them completely. In this post I’d like to outline the overall game engine design.

FFhistory: ProRes

February 5th, 2024

Apparently there’s been some work done recently on the ProRes encoders in jbmpeg with the intent not merely to fix the bugs but also to leave just one. So why not talk about the format support, it’s about as entertaining as the recent story in the project demonstrating once again that most of the problems in open-source development can’t be solved by throwing donations, even moderately large ones.

So, ProRes, the format whose history is a rather good demonstration of the project issues (CEmpeg and libav back then, and FFmpeg throughout its history in general). This tale is full of whimsy (depending on your sense of humour of course) and contains a moral as well.

QfG5: panorama projection

February 3rd, 2024

While I’m slowly, very slowly, approaching model rendering, here’s something that is between 2D and 3D.

As I mentioned in my previous post, room backgrounds are supposed to be used as a texture on a virtual cylinder. After some investigations I figured out more details and have somewhat working code.

If you forgot geometry as much as I did, here’s a reminder: cylinder is an extruded circle and an image painted on it will be seen in a distorted way with more of the picture seen in the middle (because the distance varies) and sides of the picture being squished (because cylinder curve is at larger angle to the projection plane).

Computing how the pixels should map to the projection is a costly operation so the engine pre-computes tables for the columns to take, the actual amount of that column to scale and the scaling step. As a result, during the rendering stage all that is needed is to look up the column offsets for each output column, the offset inside the column and the scaling coefficient (both stored as 16-bit fixed-point for faster calculations). This also explains why background images are stored in the transposed format, it’s definitely easier to manipulate them this way.
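
A table-driven renderer of this kind can be sketched roughly as below. The table names (col_index, col_start, col_step) and the function names are hypothetical, and the sketch assumes fixed-point values with 16 fractional bits; it shows the general lookup-and-step technique rather than the engine’s actual code.

```python
FP_SHIFT = 16            # assumed number of fractional bits
FP_ONE = 1 << FP_SHIFT


def to_fixed(value):
    """Convert a float to fixed-point with FP_SHIFT fractional bits."""
    return int(value * FP_ONE)


def render_column(src_column, start_fp, step_fp, out_height):
    """Scale one source column using only integer additions and shifts."""
    out = []
    pos = start_fp
    for _ in range(out_height):
        out.append(src_column[pos >> FP_SHIFT])
        pos += step_fp
    return out


def render(background, col_index, col_start, col_step, out_height):
    """Render every output column from pre-computed per-column tables.

    background is stored transposed, so background[c] is a whole column
    and picking a source column is a single list lookup.
    """
    return [
        render_column(background[col_index[i]], col_start[i], col_step[i], out_height)
        for i in range(len(col_index))
    ]
```

With the image stored column by column, each output column touches one contiguous run of source pixels, which is presumably what makes the transposed layout convenient.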

Also while fiddling with all this I understood at last what the floating-point numbers in the header mean—they denote start and end angle for the panorama. Those angles are also used in limiting the distance from which the panorama may be seen.

In case of underwater panorama (with its wavy distortion) there’s yet another table in play but I’ve not touched it yet.

The main problem with figuring out the code is that Ghidra has problems dealing with x87 code (and it’s hard to blame it, x87 is even worse in some aspects than x86) and I’m not eager to do it by hoof. In the code calculating those projection coefficients step -= delta * 2.0 / width was decompiled as step -= width / (delta * 2.0). I could understand that the called function is arccosine only from the context as Ghidra refused to decompile it. It also failed to recognize the case when a common sub-expression was used in two subsequent calculations, i.e. instead of y_offset[i] = (int)(offset * 65536.0); scale[i] = (int)((height * 0.5 - offset) * 2.0 / disp_h) - 1; it had scale[i] = (int)((height * 0.5 - extraout_ST0) * 2.0 / disp_h) - 1; where offset=(1.0 - angle * scale) * height * 0.5 but it’s not stored anywhere except in an x87 register. As I said, I understand why decompiling it is hard but such mistakes require either trying to reconstruct the x87 code by hoof or resorting to geometry to derive the proper formulae and I don’t know which one is worse.

In either case here is an example to conclude the post:
Read the rest of this entry »