QfG5: some words about rendering

January 4th, 2024

It is a well-known fact that Sierra implemented its 3D adventure games in parallel (and that’s not counting the Dynamix RPG titles), each using its own engine with not much in common.

Mask of Eternity used Direct3D/Glide (and Indeo 5 cutscenes), Gabriel Knight 3 used its custom engine with a possible Direct3D backend (and Bink cutscenes as well as MP3 audio), and Quest for Glory V used a portable software-only 2.5D engine combining 3D models with a projected background (with a custom depth map) and 2D sprites for e.g. animating water (plus Cinepak cutscenes and MS ADPCM-compressed audio).

So let’s look closer at how 3D rendering was done in QfG5.

Each 3D model consists of one or several meshes that form an unchangeable part of the whole model and may be manipulated independently (e.g. feet/head/torso of a hero). Rendering is done in a straightforward way: calculate the orientation and position of the mesh triangles, cull the back-facing triangles, then render each triangle into a dedicated buffer minding the depth (a separate Z-buffer is maintained for that purpose) and using the texture data to decide each pixel value.

But of course it’s not that straightforward. For starters, there are essentially two rendering modes (with several flavours of each): one draws an opaque 3D model, the other blends it with the already rendered data. Also, while the renderer uses a pre-calculated LUT for paletted texture pixels in order to have fast shading, it is possible to assign special LUTs to specific meshes (which is done in cases hardcoded in the engine). As a result, glowing objects (e.g. enchanted armour or a weapon) are rendered in three passes, using the same model with special LUTs applied to some meshes in certain rendering passes, while some other meshes may be hidden during a pass.
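
To make this a bit more concrete, here is a minimal sketch (in Rust, with made-up names; this is not the engine’s actual code) of the per-pixel work: a depth test against the separate Z-buffer followed by a palette remap through the shading LUT. The LUT layout (256 entries per shade level) and the direction of the depth comparison are my assumptions.

    struct RenderTarget {
        width: usize,
        pixels: Vec<u8>, // palette indices
        zbuf: Vec<u16>,  // the separate depth buffer
    }

    // lut is a pre-calculated remapping table, 256 entries per shade level;
    // for glowing meshes a special LUT would be passed here instead.
    fn put_textured_pixel(dst: &mut RenderTarget, x: usize, y: usize,
                          z: u16, texel: u8, shade: usize, lut: &[u8]) {
        let ofs = y * dst.width + x;
        if z < dst.zbuf[ofs] {           // assumed: smaller value = closer
            dst.zbuf[ofs] = z;
            dst.pixels[ofs] = lut[shade * 256 + texel as usize];
        }
    }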

Room background blitting seems to rely on pre-calculated coefficients that define how to warp the image.

Overall, nothing especially hard, so maybe I’ll get to re-implementing it eventually.

QfG5: savegame format

January 1st, 2024

So, while I’m still figuring out 3D object rendering details (it seems that various objects may have their own rendering functions—up to six variants usually—so figuring it all out will take some time), here’s another random piece of format documentation instead.

QfG5: leftover formats

December 23rd, 2023

Looks like I’ve not said much about the three formats used by the game yet: ANM, RGD and STR. So I guess it’s time to rectify that.

ANM

Apparently there is only one animation format despite the file magic being either 8XOV or MIRT. It starts with the 4-byte magic, a 32-bit header size (always 36 bytes), a 16-byte animation name, a 32-bit number of animations in the file, a 32-bit number of animation blocks in each animation and a 32-bit animation delay (usually 33 or 66 milliseconds).

Each animation block consists of two 32-bit integers (seemingly always 1 and 0 respectively), a translation vector (three 32-bit floats) and a rotation matrix (nine 32-bit floats).
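
Presumably each block positions a mesh for one frame of the animation. Here is a hedged sketch of applying such a block to a vertex; the row-major matrix layout and the rotate-then-translate order are my guesses.

    struct AnimBlock {
        unk0: u32,            // seems to be always 1
        unk1: u32,            // seems to be always 0
        translation: [f32; 3],
        rotation: [f32; 9],   // 3x3 matrix, assumed row-major
    }

    fn transform_vertex(blk: &AnimBlock, v: [f32; 3]) -> [f32; 3] {
        let m = &blk.rotation;
        let t = &blk.translation;
        [
            m[0] * v[0] + m[1] * v[1] + m[2] * v[2] + t[0],
            m[3] * v[0] + m[4] * v[1] + m[5] * v[2] + t[1],
            m[6] * v[0] + m[7] * v[1] + m[8] * v[2] + t[2],
        ]
    }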

I suppose the animation sequences are applied to the corresponding meshes in the model (each mesh has an animation ID in its header); maybe I’ll see it eventually if I ever get to the rendering stage.

RGD

This is probably the most annoying format. It describes the region data for the room (in 3D, I suppose) and packs a lot of different structures that reference other parts of the data.

Here’s a brief header description (all items are 32-bit integers; a reading sketch follows the list):

  • always 0?
  • always 2?
  • total number of regions;
  • region data offset (each region includes, among other things, a 3-D vector index and an offset to a list of segment IDs);
  • offset to a list of offsets, each of those pointing to a list of vector indices;
  • some ignored offset;
  • offset to an array of some region positioning information;
  • total number of regions;
  • offset to a full list of region IDs;
  • total number of region IDs;
  • data start offset (seems to be always 0x5C)?
  • number of segments;
  • offset to segment data (each segment includes two point indices and an offset to a region ID list);
  • number of points;
  • offset to point data (two doubles per point);
  • number of vectors;
  • offset to vector data (three doubles per vector);
  • flag signalling that the following fields are meaningful;
  • number of special (walkable?) regions;
  • connectivity matrix offset (the matrix is that number of regions squared; -1 and -2 mean there’s no connection);
  • offset to another connectivity matrix (in the same format);
  • offset to the list of special region IDs.
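
For reference, here is how reading that header could look. Field names are mine, little-endian byte order is assumed (it’s the PC version), and the semantics follow my current (possibly wrong) understanding.

    use std::io::{self, Read};

    fn ru32<R: Read>(r: &mut R) -> io::Result<u32> {
        let mut b = [0u8; 4];
        r.read_exact(&mut b)?;
        Ok(u32::from_le_bytes(b))
    }

    struct RgdHeader {
        zero: u32,             // always 0?
        two: u32,              // always 2?
        num_regions: u32,
        region_data_ofs: u32,
        vec_lists_ofs: u32,    // list of offsets to vector index lists
        ignored_ofs: u32,
        region_pos_ofs: u32,   // region positioning information
        num_regions2: u32,     // the region count appears twice
        region_ids_ofs: u32,
        num_region_ids: u32,
        data_start: u32,       // seems to be always 0x5C
        num_segments: u32,
        segment_ofs: u32,      // two point indices + offset to a region ID list each
        num_points: u32,
        point_ofs: u32,        // two doubles per point
        num_vectors: u32,
        vector_ofs: u32,       // three doubles per vector
        has_walk_data: u32,    // flag: the following fields are meaningful
        num_special: u32,      // special (walkable?) regions
        conn_matrix_ofs: u32,  // num_special squared entries, -1/-2 = no connection
        conn_matrix2_ofs: u32, // second matrix in the same format
        special_ids_ofs: u32,
    }

    fn read_rgd_header<R: Read>(r: &mut R) -> io::Result<RgdHeader> {
        Ok(RgdHeader {
            zero: ru32(r)?, two: ru32(r)?,
            num_regions: ru32(r)?, region_data_ofs: ru32(r)?,
            vec_lists_ofs: ru32(r)?, ignored_ofs: ru32(r)?,
            region_pos_ofs: ru32(r)?, num_regions2: ru32(r)?,
            region_ids_ofs: ru32(r)?, num_region_ids: ru32(r)?,
            data_start: ru32(r)?, num_segments: ru32(r)?,
            segment_ofs: ru32(r)?, num_points: ru32(r)?,
            point_ofs: ru32(r)?, num_vectors: ru32(r)?,
            vector_ofs: ru32(r)?, has_walk_data: ru32(r)?,
            num_special: ru32(r)?, conn_matrix_ofs: ru32(r)?,
            conn_matrix2_ofs: ru32(r)?, special_ids_ofs: ru32(r)?,
        })
    }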

As you can see, the indirection can get rather deep. At least I don’t have to think about it until I get my engine reimplementation to the point where I have to worry about pathfinding and hero interaction with things (which is likely never).

STR

This is a special room format that occurs only in 30 room (sub)variants. The format itself is simple: a 32-bit number of entries followed by pairs of 32-bit integers defining points. And as expected from the name, those are used to describe stars (i.e. shiny points in the sky of some locations).
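
A reader for it is correspondingly trivial; a sketch assuming little-endian data and signed coordinates (the signedness is a guess):

    use std::io::{self, Read};

    fn ru32<R: Read>(r: &mut R) -> io::Result<u32> {
        let mut b = [0u8; 4];
        r.read_exact(&mut b)?;
        Ok(u32::from_le_bytes(b))
    }

    /// Returns the star positions as (x, y) pairs.
    fn read_str<R: Read>(r: &mut R) -> io::Result<Vec<(i32, i32)>> {
        let count = ru32(r)? as usize;
        let mut stars = Vec::with_capacity(count);
        for _ in 0..count {
            let x = ru32(r)? as i32;
            let y = ru32(r)? as i32;
            stars.push((x, y));
        }
        Ok(stars)
    }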

QfG5: (some) cut content

December 19th, 2023

As I keep studying the engine code, I find hints at some planned things that were cut or not fully implemented (because game development).

I’m aware that many things like the censored Nawar lines, the disabled multiplayer mode, the removed Glide spell and some others have been documented already, so I’ll try to mention lesser-known things.

For example, the game recognizes about thirty spells but in reality there were about ten more. One was probably the deleted Glide spell, and there’s something that looks like a Dragon Frost spell (which acts like the Dragon Fire spell but with a blue dragon head and probably frost damage). I can’t say much about the rest, but considering the hints in the code there was supposed to be a completely different category of spells (normal spells have identifiers starting from 101500, paladin abilities have identifiers starting from 101600, those unknown spell IDs start from 101700 and seem to be attack spells only—maybe it was for the multiplayer mode?). For most of those there are no graphics, except for one unknown spell:

Or there’s model 98 called Sparkles which looks like a naked stone woman; I don’t remember seeing that in the game (nor the roaches). Or a blue version of the dragon (which gives only 20-50 drachmas as loot). If I ever get to the rendering stage one day I should provide pictures.


RISC-V: still not ready for multimedia?

December 15th, 2023

A year ago I wrote why I’d like to be excited by RISC-V but can’t. I viewed it (and still view it) as a slightly freshened-up MIPS (with the same problems MIPS had). But the real question is: what does it offer me as a developer of a multimedia framework?

Not that much, as it turns out. RISC-V is often lauded as a free ISA any vendor can implement, but what does that give me as an end user? It’s not like I can build a powerful enough chip even if the hardware specifications are available (flashing an FPGA is an option, but see “powerful enough” above), so I’m still dependent on what various vendors can offer me, and from that point of view almost all CPU architectures are the same. The notable exceptions are the russian Elbrus-2000, where the instruction set documentation is under NDA (because its primary application is russian military systems), and some Chinese chips they refuse to sell abroad (was it the Loongson family?).

Anyway, as a multimedia framework developer I care about the SIMD capabilities offered by a CPU—modern codecs are too complex to write entirely in assembly, and with a medium- or high-level language compiler you don’t care much about CPU details—except for the SIMD-accelerated blocks that make sense to write in assembly (or intrinsics, for POWER). And that’s where RISC-V sucks.

In theory RISC-V has the V extension (for variable-length SIMD processing); in practice hardly any CPUs support it. Essentially there is only one core on the market that supports the RISC-V V extension (or RVV for short)—the C920 from T-Head, and it’s v0.7.1 only (here’s a link to Rémi’s report on what’s changed between RVVv0.7.1 and RVVv1.0). Of course there’s a newer revision of that core that features RVVv1.0 support, but currently there’s only one (rather underpowered) board using it and it’s not possible to buy anyway. Also I heard about SiFive designing a CPU with RVVv1.0 support but I don’t remember seeing any products built on it.

And before you suggest using an emulator—emulation is skewed and proper simulation is too slow for practical purposes. Here’s a real-world example: when Macs migrated from PowerPC to x86, developers discovered that a vector shuffle instruction that was fast on PowerPC was much slower under Rosetta emulation (unlike the rest of the code). Similarly, there’s a story about NEON optimisations not giving any speed-up when tested in QEMU but providing a significant performance boost on real hardware. That’s why I’d rather have a small development board (something like the original BeagleBoard) to test the optimisations on if I can’t get a proper box to develop stuff on directly.

This also raises a question not only of when CPUs with RVV support will become more accessible but also of why they are so rare. I can understand the problems with designing a modern performant CPU in general, let alone one with a vector extension and on a rather short schedule, but since some have accomplished it already, why is it not more common? Particularly SiFive: if you have one chip with RVV, what prevents adding it to your other chips which are supposedly desktop- and server-oriented? I have only one answer and I’d like to be proven wrong (as usual): while the chip designers can implement RVV, they were unable to make it performant without hurting the rest of the CPU (either too large a transistor budget or power requirements; or maybe its design interferes with the rest of the core too much), so we have it mostly in underwhelming Chinese cores and some SiFive CPU not aimed at general users. Hopefully in the future these problems will be solved and we’ll see more mainstream RISC-V CPUs with RVV. Hopefully.

So far, though, it reminds me of the story about Nv*dia and its first Tegra SoCs. From what I heard, the company managed to convince various vendors to use it in their infotainment systems, and those who did discovered that its hardware H.264 decoder worked only for files with certain resolutions, while the CPU somehow lacked SIMD (IIRC the first Tegra lacked even an FPU), so you could not even attempt to decode stuff there with a software decoder. As a result those vendors were disappointed and passed on the following SoCs (resulting in a rather funny Tegra-powered microwave oven). I fear that RISC-V might lose the interest of multimedia developers the same way, with both the need to rewrite code from RVVv0.7.1 to RVVv1.0 and the lack of appealing hardware supporting RVVv1.0 anyway—so when it’s ready nobody will be interested any longer. And don’t repeat the same words about an open and royalty-free ISA again. We had the free Theora format that sucked and was kept alive because “it’s free”—when it was finally improved to be about as good as MPEG-4 ASP there was a much better open and free VP8 codec available. Maybe somebody will design a less fragmented ISA targeting more specific use cases than “anything from simple microcontrollers to server CPUs” and RISC-V will join OpenRISC and the others (…and nothing of value will be lost).

P.S. Of course multimedia is far from the most important use case, but it involves a good deal of technologies used in other places. And remember, SSE was marketed as something that speeds up working with the Internet (I like to end posts on a baffling note).

QfG5: model format

December 12th, 2023

Even if it looks like I’m not doing anything, that is not completely true. I’m still figuring out various bits of the engine, trying to build a whole picture of its workings. In any case, here’s a bit more documentation.

Models seem to come in two flavours, distinguished by the first 4 bytes: MIRT (the common format) and 8XOV (used in about a dozen files; the format is almost the same but the texture-related fields are absent). Animation files also use the same IDs but with a different format, and there 8XOV files are more common than MIRT ones.

Each MDL file begins with the following header (a reading sketch follows the list):

  • 4 bytes—ID (MIRT/8XOV);
  • 32-bit int—header size;
  • 32-bit int—model ID;
  • 16 bytes—ASCIIZ model name;
  • 32-bit int—number of meshes in the file;
  • 32-bit int—number of textures in the file (MIRT only);
  • 32-bit int—unknown;
  • 1024 bytes—palette in ABGR format;
  • 32-bit int—offset to texture data (MIRT only);
  • N×32-bit ints—offsets to each mesh’s data.
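
Here is a sketch of walking such a file, assuming little-endian data (again, the PC version); names are mine and error handling is minimal.

    use std::io::{self, Read, Seek, SeekFrom};

    fn ru32<R: Read>(r: &mut R) -> io::Result<u32> {
        let mut b = [0u8; 4];
        r.read_exact(&mut b)?;
        Ok(u32::from_le_bytes(b))
    }

    fn read_mdl<R: Read + Seek>(r: &mut R) -> io::Result<()> {
        let mut magic = [0u8; 4];
        r.read_exact(&mut magic)?;
        let is_mirt = &magic == b"MIRT"; // 8XOV files lack the texture-related fields
        let _header_size = ru32(r)?;
        let _model_id = ru32(r)?;
        let mut name = [0u8; 16];        // ASCIIZ model name
        r.read_exact(&mut name)?;
        let num_meshes = ru32(r)? as usize;
        let _num_textures = if is_mirt { ru32(r)? } else { 0 };
        let _unknown = ru32(r)?;
        let mut palette = [0u8; 1024];   // ABGR palette
        r.read_exact(&mut palette)?;
        let _tex_data_ofs = if is_mirt { ru32(r)? } else { 0 };
        let mut mesh_ofs = Vec::with_capacity(num_meshes);
        for _ in 0..num_meshes {
            mesh_ofs.push(ru32(r)?);
        }
        for &ofs in &mesh_ofs {
            r.seek(SeekFrom::Start(ofs as u64))?;
            // ...parse the mesh header and data as described below...
        }
        Ok(())
    }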

Mesh data has the following header:

  • 16-byte ASCIIZ part name;
  • 20×32-bit floats—unknown;
  • 32-bit int—number of vertices;
  • 32-bit int—number of texture vertices;
  • 32-bit int—number of triangles;
  • 32-bit int—offset to the vertex data (from the mesh data start, should be 0x7C);
  • 32-bit int—offset to the texture vertex data;
  • 32-bit int—offset to the face (triangle) data;
  • 32-bit int—offset to some unknown data.

Vertex data is triplets of 32-bit floats; texture vertices are pairs of 32-bit floats. Face data consists of a triplet of 32-bit ints with vertex indices, a triplet of 32-bit texture vertex indices, a 32-bit int texture number and a triplet of 32-bit floats with unknown values. The unknown data consists of quads of signed 32-bit integers (one quad per vertex) where the last element seems to be always zero.

Texture data is stored in the following format: first there is an array of 32-bit ints with an offset to the start of each texture (e.g. for a single texture this offset will be 4), followed by a texture header (32-bit float texture width, 32-bit float texture height, 32-bit int width power-of-two factor, 32-bit int height power-of-two factor, 32-bit int width mask and 32-bit int height mask) and the texture data itself (palette indices for each pixel) for each of the textures.
E.g. for a 128×64 texture the header values will be 128.0, 64.0, 7, 6, 127, 63.
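
My guess is that the power-of-two factors and masks are stored to make texel addressing cheap: wrap-around via masking and row indexing via shifting instead of multiplying. A small illustrative sketch:

    struct Texture {
        w_shift: u32,  // e.g. 7 for a 128-pixel-wide texture
        h_shift: u32,  // e.g. 6 for a 64-pixel-tall one
        w_mask: u32,   // e.g. 127
        h_mask: u32,   // e.g. 63
        data: Vec<u8>, // palette indices, width * height of them
    }

    fn sample(tex: &Texture, u: u32, v: u32) -> u8 {
        let x = u & tex.w_mask; // wrap the coordinates
        let y = v & tex.h_mask;
        tex.data[((y << tex.w_shift) | x) as usize]
    }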

While I’m yet to render it myself, I’ve written a small tool to convert the data into the Wavefront .obj format that various 3D viewers understand. Here’s a rendered part of 063.MDL as an example:

Besides the head, the model file also contains meshes for: body, tail, upper and lower arms (both right and left), hands, shins, feet—13 objects in total. All use parts of the same 512×256 texture.

Another fun FFork

December 8th, 2023

So apparently Paul B. Mahol had enough and finally split off (considering how his previous claims were usually not followed by immediate actions, I decided to wait for a day to make sure it’s different this time—and it is).

The most curious thing is his friend Nicolas not calling him a Libav spy despite his background and recent commits in librempeg resembling what Libav did back in the day. Maybe he really has a soft spot for Paul.

In any case, I wish Paul success in achieving his goals unhindered and hope more developers follow his example. You don’t have to stick around somewhere suffering just because you see no alternative—sometimes it is easy enough to create your own (the statement does not apply to masochists and people doing paid work).

QfG5: GUI

November 30th, 2023

While I’m still looking at various bits of the engine, here’s yet another post about the part I more or less understand.

Overall, the engine was designed to be portable, as it was supposed to work both on little-endian PCs and big-endian Macs. That is why the PC version essentially uses just the minimal OS and DirectX interfaces to perform the necessary stuff (event tracking, drawing and audio playback; a multi-player game would also require a library for the network stuff, but since that was cut we’re not talking about it).

The engine essentially draws the 3D world by default and maintains a stack of windows (e.g. a death screen, the main menu or various dialogues) to be shown over it. A window is a rectangle on the screen with an optional background (loaded from a GRA file) and a collection of widgets on it. Each widget is essentially a rectangle inside the window with custom code to draw it and maybe handle some custom events (for example, to report when it was clicked). The main screen class tracks keyboard and mouse events and passes them to the current window. It also handles redraw events (by calling a corresponding function from the window interface), and when a widget is clicked it invokes a function to tell the window which widget was clicked so it can act on it. Also, if a widget provides tooltip/hint text via the corresponding widget interface function, that text is drawn when the mouse stays over one point long enough. It sounds rather simple but seems to work well enough for the game, even for drag-and-drop combining of items in the inventory.

Widgets come in different types: buttons (a background from GRA plus a caption loaded from QGM and rendered with one of the QGF fonts), mere images or text labels, radio buttons, input boxes (a black rectangle plus rendered text) and even model rendering (for the character window).
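
To summarise that interface in code form, here is a speculative Rust-flavoured sketch; the real engine is C++ and its actual method set is certainly different.

    struct Rect { x: i32, y: i32, w: i32, h: i32 }

    trait Widget {
        fn bounds(&self) -> Rect;
        fn draw(&self, screen: &mut [u8]);
        fn on_click(&mut self) -> Option<u32>;   // report an action ID when clicked
        fn hint(&self) -> Option<&str> { None }  // optional tooltip text
    }

    trait Window {
        fn draw(&self, screen: &mut [u8]);           // background plus all widgets
        fn handle_key(&mut self, key: u16);
        fn handle_click(&mut self, widget_id: u32);  // act on the clicked widget
    }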

Windows are mostly hard-coded in the engine with all their creation and handling logic, including room-specific ones like the Adventurers Guild bulletin board, the crane controls on the Scientific Island, the Wheel of Fortune minigame in the Dead Parrot Inn and such. Nevertheless the engine provides an interface to create a dialogue window with up to four buttons, so some rooms should be able to construct their own (maybe a quiz for the Scientific Island?).

Slowly (very slowly!) more and more details about the engine become known to me, but the core of the game (3D world rendering and interaction) remains terra incognita. Let’s see if I ever get there…

QfG5: messages

November 24th, 2023

Here I’d like to review two formats related to the messages in the game: font files and message files. There’s still the question of how lipsync files work (so far it looks like a series of 16-bit variables probably telling which sprite from the corresponding GRA file to show; I may get to that eventually). Update: as one should expect, the lipsync format is a lot like it was in SCI—a series of 16-bit time positions (in tertias aka 1/60th of a second) and sprite IDs. Additionally there’s another table of equal size right after the end of that data, its meaning still unknown.

QGF

This is a rather simple bitmap font format. It starts with a header that has the following 32-bit values: maximum character width, character height, space between characters, an unknown value, a flag telling whether it is a complex pseudo-3D font or a simple line font, and another flag (probably for a shadow).

Then there is an array of 512 bytes containing the character widths (only the 16-256 range is populated though), followed by an array of 512 32-bit words giving the offset to each character’s data.

Character data consists of pairs of bytes. If the first byte has a negative value, it tells how many pixels to skip; otherwise it is the current pixel opacity (for fonts with the corresponding flag set, 31 means a fully opaque pixel; for simple fonts any non-negative value is an opaque pixel). The second byte seems to be completely ignored.
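
Here is a sketch of decoding one character with those rules; decoding into a fixed width×height block and using 255 as “opaque” for simple fonts are my simplifications.

    fn decode_glyph(data: &[u8], width: usize, height: usize, complex: bool) -> Vec<u8> {
        let mut out = vec![0u8; width * height]; // 0 = fully transparent
        let mut pos = 0;
        for pair in data.chunks_exact(2) {
            if pos >= out.len() { break; }
            let ctrl = pair[0] as i8; // pair[1] seems to be ignored
            if ctrl < 0 {
                pos += (-(ctrl as i32)) as usize; // skip that many pixels
            } else {
                // opacity 0..31 for complex fonts, plain opaque otherwise
                out[pos] = if complex { ctrl as u8 } else { 255 };
                pos += 1;
            }
        }
        out
    }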

QGM

These files contain text messages of various kinds: text spoken by characters, narrator’s text, dialogue options and even all the text in user interface.

Message files start with " MGQ" magic, a 32-bit file version (only versions 3 and 4 are supported), a 32-bit number of message blocks in the file, some unknown 16-bit value and a 16-bit file ID (e.g. 160 for 160.QGM).

Then message blocks with a fixed-size header and variable-size payload follow. The message block header starts with four 16-bit variables that are used as the message identifier (and also to generate the name for speech/lipsync files, more on that below). They are followed by four unknown 16-bit fields (the first of them is probably the ID of the character saying the lines), a 16-bit number of dialogue options, 16-bit flags (flag 4 means the message contents are obfuscated), another unknown 16-bit field, a 16-bit internal message number (unordered), a 16-bit message length and a 16-bit flag for string message label presence.

The header is followed by an optional 13-byte message label (which looks like a file name), dialogue options in the same format (if present) and optional message text. There’s an additional 32-bit number at the end of each message block with unclear meaning (it may be related to the message ID).

Before going on to the message obfuscation scheme and name generation I’d like to talk about the structure. There are essentially two kinds of messages: normal text and dialogue trees that mostly contain links to other message texts.

For example, in the same Arcane Island location message block 2 looks like this:

  • option 1 = A0BJ020S.021
  • option 2 = A0BJ0208.0E1
  • option 3 = A0BJ0208.0F1
  • option 4 = A0BJ0208.0G1
  • option 5 = A0BJ0208.0H1
  • message = “He that wishes to pass through me,
    First must answer questions three.”

And those IDs point at the other message blocks:

  • message block 0 with text ‘Say “%s.”‘ (i.e. hero’s name)
  • message block 13 with text ‘Say “King Arthur of Pendragon.”‘
  • message block 15 with text ‘Say “Putentane.”‘
  • message block 16 with text ‘Say “Sir Robin-the-Not-So-Brave.”‘
  • message block 17 with text ‘Say “Oh, no, not again.”‘

(Also those message blocks have optional message label set to the ID of the message block 2, probably for the easier return. In other files a character’s reply may have the same label set as well.)

Those familiar with the game may remember it as the first question the cloud gargoyle asks the hero and possible replies to it.

Now about the message IDs. Those are generated from the QGM ID and four integers using the following format: A(3-character QGM ID)(2-character ID1)(2-character ID2).(2-character ID3)(1-character ID4). The integers are converted using base 36, i.e. digits and uppercase letters; e.g. 415 gets converted to 0 11 19 and coded as 0BJ. If an audio part is present, it has the same name. Lipsync data is named the same way but uses 'S' as the first letter of the file name instead.
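
A sketch of that name generation (whether the engine really zero-pads the fields this way is my assumption, but it matches the examples above):

    fn base36(mut val: u32, width: usize) -> String {
        const DIGITS: &[u8] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        let mut out = vec![b'0'; width];
        for ch in out.iter_mut().rev() {
            *ch = DIGITS[(val % 36) as usize];
            val /= 36;
        }
        String::from_utf8(out).unwrap()
    }

    fn message_file_name(qgm_id: u32, id1: u32, id2: u32, id3: u32, id4: u32,
                         lipsync: bool) -> String {
        format!("{}{}{}{}.{}{}",
                if lipsync { 'S' } else { 'A' },
                base36(qgm_id, 3), base36(id1, 2), base36(id2, 2),
                base36(id3, 2), base36(id4, 1))
    }

E.g. message_file_name(415, 2, 28, 2, 1, false) should produce A0BJ020S.021 from the dialogue example above.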

And as for the message text obfuscation: if the corresponding flags field has bit 2 set (as in the majority of the message files), then the text should be de-obfuscated using the following algorithm (see the sketch after the list):

  1. split data into 4-byte chunks and tail 0-3 bytes long;
  2. for each 4-byte chunk repeat steps 3-6:
  3. read those bytes as 32-bit little-endian number;
  4. exclusive-or the value with constant 0xf1acc1d;
  5. rotate value cyclically left by 15 bits;
  6. store 32-bit number back as four bytes;
  7. invert bits in all bytes of the tail.
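
And the same algorithm as a code sketch:

    fn deobfuscate(data: &mut [u8]) {
        let tail_start = data.len() & !3;
        // 4-byte chunks: XOR with the constant, then rotate left by 15 bits
        for chunk in data[..tail_start].chunks_exact_mut(4) {
            let mut val = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            val ^= 0xf1acc1d;
            val = val.rotate_left(15);
            chunk.copy_from_slice(&val.to_le_bytes());
        }
        // the 0-3 tail bytes just get their bits inverted
        for b in data[tail_start..].iter_mut() {
            *b = !*b;
        }
    }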

I’d call this scheme lame but the constant speaks for itself.


And as a bonus for those who care, here are the extracted font files (under the cut):

QfG5: room image formats

November 23rd, 2023

Each room has several formats associated with it; some of them are binary files describing various objects, but here I’ll describe the following formats:

  • IMG
  • NOD
  • ZZZ
  • FTR
  • ROM
  • GRA

The GRA format is a more general one, used for animated sprites, some window backgrounds and GUI elements as well as talking character portraits.

NOD

This has nothing to do with C&C: it is a palette format for the room backgrounds. It starts with a QFG5 magic and a 32-bit file size, but in reality the game engine simply reads the 1024-byte RGBA palette starting at offset 0xA8 and that’s it.

IMG

This is the actual room background data. It starts with a header consisting of two 32-bit big-endian words and two 64-bit big-endian floating-point numbers, followed by the same header repeated in little-endian format. The parameters are height, width, two parameters probably related to room positioning (probably an offset and the full intended room perimeter—rooms can be circular like the Silmaria main square) and two unknown floating-point numbers.

Then an RLE-compressed image follows, coded in columns starting from the right edge and with its height doubled (so e.g. a 4000×400 image will be decoded as an 800×4000 image with the original left edge at the bottom). The lower part of the image is not used (maybe a leftover from when it was used for a depth buffer). The actual data is 8-bit indices into the NOD palette with the same resource number.

The RLE works in the following way: for each decoded line (or rather column) a signed 8-bit value is read; zero signals the end of the line, negative values signal raw data (e.g. 0xFE or -2 means copy the two following bytes to the output), and positive values mean repeating the next byte the specified number of times.
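
Here is that RLE decoder for a single line (column) as a sketch, with some basic bounds checking added; it returns the number of input bytes consumed or None on malformed data.

    fn decode_rle_line(src: &[u8], dst: &mut [u8]) -> Option<usize> {
        let mut in_pos = 0;
        let mut out_pos = 0;
        loop {
            let op = *src.get(in_pos)? as i8;
            in_pos += 1;
            if op == 0 {
                return Some(in_pos); // zero marks the end of the line
            }
            let len = op.unsigned_abs() as usize;
            if out_pos + len > dst.len() {
                return None;
            }
            if op < 0 {
                // negative: copy that many bytes verbatim
                dst[out_pos..out_pos + len].copy_from_slice(src.get(in_pos..in_pos + len)?);
                in_pos += len;
            } else {
                // positive: repeat the next byte that many times
                dst[out_pos..out_pos + len].fill(*src.get(in_pos)?);
                in_pos += 1;
            }
            out_pos += len;
        }
    }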

ZZZ

This is the room depth map. The file contains RLE-compressed depth values (0 = closest to the screen, 255 = farthest) without any header. The data is compressed in the same way as a single line of the IMG format and has the same dimensions, but it is stored line by line, right edge first (i.e. mirrored compared to the background image).

FTR

This is a format defining room regions. It consists of signed 16-bit numbers. The first number declares the number of regions in the file, then region data follows, consisting of a 4-word header (region depth maybe, always zero, an exit portal flag, and the number of points) and the region points (two 16-bit words each).

And here are some examples of decoded room images.

Arcane Island background


Arcane Island depth map (and an Easter egg)


Arcane Island region map

ROM

This is a room properties file consisting of two integers and two floats: the first field (32-bit integer) seems to be a big-endian version of the third field, the second field (32-bit float) is unknown, the third field (32-bit integer) declares the number of additional resources that should be loaded for the room (e.g. the battle arena has three alternative views), and the fourth field (32-bit float) seems to specify an angle increment (for circular rooms, I suppose). In any case, the floating-point values seem to be non-zero only for room 200 (Silmaria main square).

GRA

This is a format used for animated sprites and static images in various parts of the game.

The file starts with a 32-bit image coding mode, a 32-bit number of sprite collections and a 512-byte palette (in RGB555 format), followed by 32-bit offsets to the sprite collection data.

Each sprite collection starts with a header containing the following 32-bit values: horizontal sprite position, vertical sprite position, width, height, number of sprites in the collection, delay between frames and flags. It is followed by the offsets (from the sprite collection data start) to the individual frames. Depending on the image coding mode, frames can be stored in the following ways (see the sketch after the list):

  1. mode 0—unpacked image data;
  2. mode 1—unpacked image data interleaved with depth values (i.e. in the following order: palette index byte, depth byte, palette index byte, depth byte and so on);
  3. mode 2—the same RLE compression as in IMG format;
  4. mode 3—the same as the previous mode but index 0xFF is used for the transparent pixel;
  5. mode 4—similar to the previous mode but value 0xFF signals that the original background pixel should be restored instead.
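
And here is a hedged sketch of how a mode 3 or 4 frame might be composited after decompression; the handling of index 0xFF follows the description above, the rest is my assumption about how the buffers are organised.

    // dst is the current screen contents, background is the saved room image.
    fn blit_sprite(dst: &mut [u8], background: &[u8], sprite: &[u8], mode: u32) {
        for ((d, &b), &s) in dst.iter_mut().zip(background).zip(sprite) {
            match (mode, s) {
                (3, 0xFF) => {}       // transparent: leave the destination alone
                (4, 0xFF) => *d = b,  // restore the original background pixel
                _ => *d = s,          // otherwise just copy the sprite pixel
            }
        }
    }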

And here’s an example of a sprite from the same scene (in the same orientation as the actual scene background):

I’ll probably try to cover the messages and speech next (font formats, message files and lipsync) but it may take more time. And then only 3D data will be left for figuring out. It’s a pity I don’t know much about 3D though…