Author Archive

Why codecs are designed like this and why they are not very interchangeable

Monday, August 2nd, 2021

Sometimes I have to explain the role of various codecs and why it’s pointless in most cases to adapt compression tricks from image codecs to audio codecs (and vice versa) and even from lossy to lossless codecs in the same content. If you understand that already then you’ll find no new information here.

Yours truly
Captain Obvious

Looking at Tsunami games

Sunday, August 1st, 2021

You may remember Tsunami Media as a company founded by ex-Sierra people that released a couple of games and ceased its existence.

MobyGames lists the following titles (characteristic is mine):

  • Ringworld: Revenge of the Patriarch—an adventure game based on Ringworld novel by Larry Niven. ScummVM supports its but I played it long before that.
  • Wacky Funsters! The Geekwad’s Guide to Gaming—a collection of arcade games with wacky design (I’ve only played its sequel though).
  • Protostar: War on the Frontier—a reportedly good strategy game inspired by the same source as Star Control II so they share a lot of game design (and yes, they have a common ancestor so it’s not a rip-off. I’ve never played it myself but what I saw looks interesting.
  • Blue ForcePolice Quest in anything but name. ScummVM supports it but again, I played it long before that.
  • The Geekwad: Games of the Galaxy—another collection of standard arcade games but with wacky design. I especially liked quizzes there.
  • Flash Traffic: City of Angels—one of the first interactive movies. The Mike has blogged about it.
  • Return to Ringworld—a sequel to the Ringworld obviously. ScummVM supports it (so I can re-play it and check whether that empty platform where you seek for the details is really that horrible).
  • Man Enough—FMV dating game.
  • Silent Steel—yet another interactive movie.
  • Free Enterprise—some business simulator.

As you can see, some of the adventure games are supported by ScummVM already but FMV-based ones are not which makes me wonder why.

Man Enough uses the same engine as the previous games (I checked personally. FMV sections there turned out to be animations in the same format as in Ringworld II (I hacked a quick RLB extractor and animation decoder to check that). Side note: whoever added a support for the engine was tired (for a very good reason given below) and hadn’t recognized that it uses LZW compression (so the decompressor is still slightly beautified REd code). Also in this game you have RLBs with elements following each other and those aligned to 16-byte boundaries. So maybe it’d be better just to check if you have first entry right after the library header or not (they all have TMI- header so you can’t mistake it for anything else).

Flash Traffic is special since while it uses the same engine it has all resources stored separately instead of a single library archiving them all. Additionally while TMI format they use is the same, the compression is different. Here you have LZ77-like method which can copy verbatim, copy 32-bit words from already decoded area, fill region with repeating pattern of two bytes, or simply zero region. I have a working implementation for it so I can unpack various resources from the game just fine (but beside BFI or MPEG files that have been supported since long time why should I bother?).

Silent Steel is their newer FMV game that’s mostly just one large MPEG video file and some logic around it (that’s why there was an interactive DVD re-release of the game later).

And now here’s the most important reason why Flash Traffic and Man Enough are not supported up to this day: Tsunami Media hardcoded game logic into the binaries, so while for SCI or SCUMM games you can write a virtual machine and interpret original bytecodes, here you need to decompile game logic from the DOS executable (it’s not an easy task even today) and re-implement it yourself. That’s why tsage engine in ScummVM is full of ${game_name}/${game_name}_scenesN.cpp files for each of the three games it supports. I’m pretty sure the developers won’t refuse somebody else’s contribution for the other tsage games support so you’re welcome to try.

Final news about H4M and Nightlong FMV

Thursday, July 29th, 2021

In my previous post I said I was looking at them and while some details were unknown, I got some understanding.

But it turns out The Multimedia Mike despite what he claims still hasn’t lost the knack for REing codecs—first you need to look if somebody else has done it already. And indeed, somebody else has spent two years on reverse engineering the codec. So all that was left to me was to improve its description it (which I did).

The video compression there is quite curious. Huffman-coded data chunks are not that rare (Truemotion 2 had them, for example). Huffman-coded data chunks where one chunk has tree description coded in the beginning that is used for several other chunks is more interesting. But the transforms was the juiciest part there. It turns out you have several modes of operation: fill block, raw block, smoothed DC block (that one uses DCs from neighbour blocks on all sides to create a smooth transition between them all) and the actual Adaptive Orthogonal Transform that represents block as 1-5 patterns (scaled), selected from single matrix. It reminds me a bit about SVQ1 where you had a block constructed from 2D patterns of varying size. But even more it reminds me of Matching Pursuit experimental video codec from Berkley University that also performed partial transform and selected only some bases from a large set (does anybody remember this codec at all?).

Nightlong FMV is a bit different. The format is simple but FMV2 has packed frames and there was no code for handling it in 68k version of the binary. Nevertheless I could unpack it simply by dumping frame data and extracting it with unar (PowerPacker is a strange compression format BTW, it starts reading data from the end of buffer and outputs data starting from the end as well). The unpacked data turned out to use the same format except that the frame size is now full 640×360 instead of 320×180 trying to look like twice as large. I documented it as well.

P.S. I still have some stuff to look at but more suggestions are always welcome.

REing codecs from exotic platforms with Ghidra

Wednesday, July 28th, 2021

In a recent post I asked for some stuff to RE and I got some (but more pointers are welcome).

Among those two there was H4M format for Gamecube (kudos to The Mike for providing me with the samples) and FMV format (yes, that’s the extension as well) for Nightlong game on Amiga. This finally gave me a chance to look at how Ghidra 10.0.1 fares with such exotic executables.

First I had to install special loaders for the executable formats (and hack their manifests to claim it’s for 10.0.1 and not 10.0) and also ISA support for special PPC extensions for Gamecube PowerPC CPU. After that it was just loading binaries and analysing them. And there’s nothing else to write about it, it just works.

Now some words about the formats. I’ve not fully REd them (and I don’t know how much time it’ll take either, it all depends on weather and mood) but there are some things I can say about them already.

FMV is a fixed 320×180 paletted video with 4×4 tiles. The compression is very simple: you can either have a raw 4×4 tile, sub-divided 4×4 tile where you can have only some pixels in 2×2 block updated, or skip command (for a single block or for rows). There is also fmv2 where each frame seems to be further (or instead?) compressed with PowerPacker (at least that’s how I interpret frame data starting with PP20).

H4M seems to be a rather advanced format. From what I see it has small frames combined in large blocks with each block having a previous block size stored as well (for seeking backwards like FLV, I suppose). Audio codec looks like a typical speech codec on console. Video codec looks interesting: it has B-frames for starters and it’s definitely not a H.264 rip-off. From what I could gather, it codes several data sources in the frame (i.e. each kind of data is stored separately and not interleaved with the rest of data), some of those data sources contain Huffman-packed data, others are used just to read plain bits and bytes (yes, I saw only functions that read single bit or eight bits, nothing else). Then the data from those sources is used to code coefficients and zero runs for the whole planes (I’m not sure yet if it has blocks at all).

I’ve looked just at I-frame decoding, no reconstruction or P/B-frame decoding yet, but what I’ve seen already makes it a very interesting codec to examine. I was asked to look at it many years ago but I declined that request since I had little experience with both PowerPC assembly and the executable format Gamecube uses (the only time when I looked at PowerPC assembly was Fruity Intermediate Codec and even then IIRC I finished REing it only after x86 binary became available). Now thanks to Ghidra with its decompiler and enthusiasts providing support for such fringe formats I can look at such formats too.

Rust needs proper stand-alone assembler support

Tuesday, July 27th, 2021

Back when I gave my arguments I why don’t consider Rust a mature language, one of those arguments was that is lacks proper assembler support and systems programming language requires it since some of the tasks you need to perform (including optimisation) require as low level access as you can get. Here I would like to argue why asm!{} may be enough for most cases it’s definitely not for mine.

Looking for formats to look at

Wednesday, July 21st, 2021

As I mentioned in one of the previous posts, I’ve achieved all the goals for NihAV that I initially set except for trying to write a proper encoder (and no, world domination has never been on my list). Unfortunately this will be not so easy thing to do and I’d like to have a distraction time from time.

Usually I distracted myself with reverse-engineering some format and maybe implementing decoding support for it in NihAV but recently I realized that I ran out of low-hanging fruit. There should be interesting codecs and game containers out there still waiting for their chance but I could not remember anything. I even went through ScummVM code and documented video formats from there in The Wiki, that’s how bored I was.

So I’d be grateful if somebody can point me out to a thing to RE. Last time when Peter drew my attention to VGM/XVD it turned out to be a very fulfilling experience.

On FMV games video compression

Tuesday, July 13th, 2021

Back in the day I’ve stumbled on a post called Archaic Video Compression Algorithms of FMV Games Era (in Russian) giving a concise review of VGA era video. For a long time I wanted to discuss it but I lost the link to it and managed to locate it just recently.

For those unable to read Russian here’s a gist of it: all the variety of those formats used just three methods to compress video, those methods being lossless redundancy removal via RLE or LZ77-based method (Huffman codes not employed), vector quantisation, and block truncation coding. Inter-frame compression was rather primitive, often without any motion predictions; fade-ins/outs were not handled either. And later, when computers became powerful enough, everybody switched to DCT coding and later to in-engine animation.

It is a very good first approximation but as a person who looked into several formats myself and even documented a couple of them, I have something to add or correct.

First of all, ‘archaic’ means something not merely old but also not in common use—and that does not apply neither to LZ77 (which is still the base of most of the practical compressors and archivers; and there’s WebP using its variant in lossless mode) nor vector quantisation (even if it’s mostly employed in texture compression nowadays).

Second, I consider block truncation coding to be a specialised form of vector quantisation for a case of two colours per block. And considering that the input data could be palettised already it’s more likely the encoder searched for the most fitting two colours in the palette instead of generating two RGB triplets and handing it all to the palettisation process.

Third, I’d argue that back in the day codecs did not need sub-pixel motion compensation and motion vectors were often not needed since you could simply code an offset in LZ77 compression. And some of the codecs allowed copying data from either already decoded parts of the current frame or some not yet touched part (or any part in some cases) of the previous frame. This archaic method of copying from already decoded part of the frame is present in AV1 for example (under IntraBC name).

Fourth, I think that recursive block division deserves to be mentioned there.

A minor nit would be that even if very few codecs supported explicit fading, a lot of them had support for palette change event (partial or full). Yet it mostly depended on people developing the format and the encoder for it how well it was handled.

So in my opinion the final list of the techniques most FMV games of the VGA period used should look like this:

  • lossless data compression with RLE or LZ77-based method
  • vector quantisation—both whole tiles and inside single tile;
  • recursive block splitting (either horizontal/vertical or as a quadtree);
  • and fullpel motion compensation usually done by skipping pixels, blocks, or reusing some already decoded region.

And typical video codec of that era should look like one of those two:

  1. line-based codec employing RLE, probably with skips for inter-frames and optional LZ77 compressing of whole frame data afterwards (or extending LZ77 approach to work with the forward data in the buffer as well);
  2. tile-based codec usually working with 4×4 tiles, employing vector quantisation to paint the tile with 1-8 colours, sometimes sub-dividing tile for better detail preservation, sometimes having a dictionary of tile configurations, sometimes even employing block-based motion compensation; those codecs usually employ 1-4 byte opcodes to code the tile type of whole sequences of tiles of the same type.

Of course there are deviations from those two schemes but the majority of game codecs of that era should follow either of those two designs.

Later, as the original article said, those codecs were supplanted by DCT-based ones (I’ve seen the ones based on MJPEG and MPEG-1/2; and there’s hybrid Bink). On the other hand, DCT-based Bink2 is still in use.

Overall, if you look at game formats section in the wiki you’ll see a lot of various formats and many of them are built on a handful of the idea described above while still being quite different from each other. Originality lies not only in the methods employed but also in how you combine them together. And the modern codecs are the example of how different organisations build essentially the same codec using slightly different methods. MPEG-5 LCEVC was a nice change though.

Here’s a ship being loaded

Thursday, July 8th, 2021

Since I admitted myself that I feel old, I guess it’s my solemn duty to yell at clouds time from time.

I consider satire to be the most realistic depiction of the world (unless it’s a pasquinade) and as Shakespeare put a phrase in a mouth of one of the King Lear characters, jesters do oft prove prophets.

For example, I still like to re-read various satirical pieces by Ilf and Petrov (probably the best Soviet satirists in the first half of XXth century) and one of those pieces gave a title to this post because of its similarity to what I want to talk about.

That short story mentions a game “loading a ship” (here’s the title) where a group of bored people tries to “load a ship” by calling things starting with the same pre-defined letter—lamps, Lilliputians, locomotives, liquors etc etc—until people can’t recall any more things starting with that letter or somebody suggest a name and people start to argue if that should be allowed. The story itself is about one man who metaphorically loaded a ship by inventing new and new social activities until at some point it was discovered that nobody remembers what are his direct duties should be and he was fired.

And here we transition to the Linux Foundation. Previously I thought it should promote Linux adoption, make some Linux-related standards and pay money to main Linux developers and maintainers so they can work on it full-time. But news in last couple of months showed me I was wrong.

First there was a piece of news about AgStack (Linux for Agriculture). Then there was a piece of news about Linux Foundation organising some unrelated initiative for Microsoft (while not participating in it itself). And finally there’s a piece of news about Open Voice Network, an initiative for ethical standards of voice assistants.

This made me wonder not just why I encounter such news but also what does the foundation really do. The answer did not make me more optimistic.

So it seems that Linux Foundation is transitioning from a foundation that does what I expected it to do to an organisation that does IaaS (initiative-as-a-service): you pay them and they prepare all required infrastructure to have it all running, just invite some organisations to participate. Beside that what are those member companies are paying for? I don’t know the official explanation but to me it looks like three major categories to put it very bluntly and impolitely: bribes, extortion money and indulgences. Bribes is what you pay to affect the politics of the foundation in a way favourable to you (maybe it’s called lobbying, maybe such things do not happen at all—give me facts and I’ll amend this post). Extortion money is what you have to pay to participate in various standardising activity (aka membership fees, no different from any other standardising activity). Indulgences are money you pay to avert attention from your GPL violations regarding the kernel sources (just search for GPL violators and Linux Foundation, you’ll find not just Chinese but American companies mentioned there; at least in one case the known GPL violator hasn’t published its modified kernel sources after becoming a member).

I could not find any reports about the foundation beside their 2020 annual report so I can only speculate how it has changed in the past regarding income, spending and stuff. Yet I can foresee three scenarios on how it may develop in the future:

  1. Shifting to a different activity. Even now actual Linux-related things seem to be less than a half of what the foundation does nowadays, so maybe in the future it will realise that Linux is a legacy they don’t care any longer and maybe even rename themselves to reflect their new main occupation;
  2. Fossilising. Linux Foundation may exist in the future but both its activity and being a member will become a tradition that nobody understands but they’ll keep doing it because it’s a tradition;
  3. Withering. While Linux is relatively popular kernel, nobody can guarantee it will remain popular in the future. Linus himself is not immortal and his successor may be not as skilled, we have IBM trying to replace Linux kernel with systemd while keeping the name—or maybe some other reason will hurt Linux popularity. Or some other kernel will replace it in its major niches (I can easily imagine Android being rebased on Fuchsia; the same may happen to servers as well). In result Linux and Linux-related things will become not interesting to most companies and they’ll drop their financial support.

You may say that there might be a scenario where Linux Foundation will concentrate on Linux-only stuff but I’m sceptical. AgStack is a clear signal they run out of ideas where to promote Linux but in accordance with Parkinson’s law they’ll still keep growing by inventing new stuff as long as there’s enough income to employ more people, organise more conferences and such.

As usual, I’ll be happy to be proved wrong.

NihAV: feature complete

Sunday, June 13th, 2021

A year ago I wrote a post about NihAV being conceptually done, now it’s even closer to perfection since I’ve finished writing a useful video player.

Writing it (and especially debugging it) was surprisingly hard so in most cases after looking at it I thought that I’d rather do something else—RE some game format, work on deflate support, just anything but that. But at last it’s done and working adequately. The previous player could just play video until it’s over (or deadlock occasionally), this one supports pausing and seeking plus it seems not to deadlock.

Conceptually it’s very simple as any other player concept, it’s just various features and limitations complicate the design. You simply open input, read packets, discard them or decode with a corresponding decoder and present the result (draw the frame or send audio to the sound card). Now what to do in order to make it offer all the features I wanted from it?

First, interactivity and low(er) response latency. The latter is easy to achieve, you just put audio and video decoding into separate threads so the main loop just does the demuxing and checking for user commands. So now you have user commands that might affect decoding threads (for example, seeking or going to play the next file). I implemented it by setting a special flag that decoding thread checks and discards all queued input packets until some command is received.

Another fun thing was volume control. In my audio player I simply queue samples and let SDL deal with them, here I switched to callback and fill the requested buffer with samples adjusted by current volume. This way you have volume change applied almost immediately instead of it taking effect in a second or more depending in audio queue fill.

And speaking about queues, that was also a fun thing to manage. In a default implementation (demux-send-repeat) you either end up sending all packets before the decoding thread processes them all (and this will consume a lot of memory) or you block on sending via a limited message channel (which either requires a separate thread with more complicated interaction between them all or you can forget about interactivity). The answer is obviously to keep check of how full the channel is and do not demux new packets unless you’re sure you can send them. I keep a queue for the packets or events that should be sent but there’s no space in communication channel for that yet. Additionally I make audio part report the current buffer fill so I know there’s no reason to send more packets yet. Similarly for video I keep a queue of ready to display frames (in SDL textures, so drawing them is just one SDL blit call).

Overall, nothing particularly tricky but debugging it was still not fun. I actually ended up adding a compile-time debugging feature that will dump a lot of internal playback information into debug.log so I can figure out what actually happened there (i.e. NIHing a logger but you should not be surprised by that).

Of course it still lacks a lot of features for a serious player like proper synchronisation (with automatic framedropping and audio underrun corrections), more features (like playback at different speed, taking screenshots and switching between different input streams) and GUI. But I’m fine with the current state of my player and maybe enhance it later if the need arises.

Fun thing is that my player seemed to stall on MP4s. As it turned out, the problem was in MP4 demuxer producing packets in interleaved order (one for audio stream, one for video stream, one for audio stream again, one for video stream…) while it should’ve output packets for various streams for more or less the same time position (which means usually two AAC frames between video frames instead of just one). After the change my player works as expected on MP4s as well. And if I ever get to fixing and optimising H.264 decoder it should be good enough to serve as an everyday video player (I use nihav-sndplay to listen to my music collection already).

I guess there’s just one of the original ideas (of what I wanted to try at the time of public NihAV release) left that I haven’t tried yet, namely to experiment on writing some DCT-based video encoder with rate control and such (maybe even for VP6). This should keep me occupied for a couple of years. Or at least inspire me to do something else instead.

On the Origin of Bloatware

Friday, June 11th, 2021

This is inspired by both a private discussion on why modern computing is so complex and my migration from Ubuntu 12.04LTS to systemd 20.04LTS.

Since I’ve finally changed from my less than ten years old operating system to something more modern I’ve noticed that it became noticeably slower (not irritatingly slower though but slower nevertheless) except for Firefox (which is probably not because of JS engine improvements but rather because of native execution of now supported APIs instead of polyfills). And trying various desktop environments before settling on Cinnamon I’m horrified by how bloated and unusable (to me) they are. My friends complain about modern technology demanding more effort to maintain because of complexity and weird interdependencies—while it’s supposed to make your life easier. So why it is like that?

For a keen reader the title of this post contains the answer. For the rest I’ll elaborate it below.