Archive for October, 2020

MOV — Matroska of its time

Saturday, October 24th, 2020

Disclaimer: all container formats suck, either by being too simple and tied to certain (types of) codecs, too ineffective (by wasting too many bytes of frame metadata and headers compared to other formats), or too flexible and complicated to implement in full. And there’s Ogg.

Let’s start our story with old times. Back in the day Electronic Arts made probably the only two good things it’s ever made. I’m talking of course about Deluxe Paint and IFF container format (and that happened 35 years ago; it’s a pity the company still exists). The chunked approach to storage was reused by certain other companies. Remember RIFF that was used for storing audio (RMI or WAV), video (AVI) or pictures (WebP) among other things? Remember RealMedia? Remember QuickTime MOV? That’s the thing we’re going to talk about.

As you can guess all these containers are just a series of chunks, some chunks being a container for another list of chunks. MOV and MP4 are the exception because they have atoms and boxes correspondingly. Anyway, this structure allows you to put virtually anything and while some formats used that in moderation (WAV essentially in top-level RIFF chunk with header and data chunks, there may be user metadata present and that’s about it) some others abused that possibility as much as possible (no points for guessing).

Let’s quickly review MOV structure: essentially you have moov atom with the description of the container data (which includes atoms for each individual track description of course) and mdat chunk with actual data if you’re lucky. Sounds simple? The devil hides in the deeply-nested atoms.

First, MOV has a lot of features for specific groups of things like:

  • streaming—tracks can be joined into groups to signal that only one of them should be played depending on language/quality/decoding capabilities (kinda like what RMVB was famous for);
  • mastering—all those atoms for matte, kropping! clipping and notorious edit lists;
  • DVD-like playback—the less said about track referencing feature the better.

But what I said that you have mdat chunk with actual data “if you’re lucky”? Because of the wonderful feature called data reference which tells you where the actual data is stored: it may be the same file, the same file but different resource fork (for classic MacOS), different file; you can even get some URL to the resource with the data. And of course different tracks may be stored in different data locations. Flexibility!

Then you get to actual frames (or “samples” as they’re called). For certain reasons frames of the same track can be clustered together in a block (or “chunk” as the specification calls it). And to make things better frames can have different duration stored in a different table in run-length form i.e. “first N frames have duration X, then next M frames have duration Y, then next K frames…”. So in order to extract proper frame with a correct timestamp you just need to find out in which block it resides and with which offset, sum durations for the previous frames, apply information from track metadata (don’t forget the edit list!) and you’re done. Except when you have audio in PCM format or compressed with some standard fixed compression scheme (A-law/μ-law, IMA ADPCM, MACE 3:1/6:1). Then you need to calculate duration from size. And that’s one of the things I really hate in containers: they should be codec-agnostic, in other words you should be able to seek to a certain time point (with a given precision of course), extract frame data and feed it to a decoder that does not know which container it came from either. And of course Matroska is the worst violator as it employs codec-specific ways to shave off some of the frame payload (so if you don’t know how to reconstruct the frame data the decoder won’t recognize it either).

Anyway, having such nice format with so many features not needed by anybody 99% of the time it was a perfect fit for MPEG-4 container format. So MOV was adopted, some of its terminology was changed (as mentioned before atoms->boxes) and lots of new features were added as well both in metadata and file structure (like those segments for streaming).

And if by this point the title is still not clear to you, QuickTime MOV format appeared long before Matroska but conceptually it had many of the features present in the latter as well. To put it crooked, if you replace atoms with EBML entities and add ways to reduce payload (e.g. by generating missing frame headers) you can rename MOV to MKV and nobody will notice a difference.

P.S. It’s just I’m writing a second iteration of NihAV-based video player which should not just play single file continuously and maybe deadlock in process of doing that because of some SDL audio issue but a proper player that can play multiple files, pause and seek. And while testing seeking in MOV files I’ve discovered some of those wonderful issues that inspired me to write this post.

Hacking to solve adventure game puzzles…

Tuesday, October 20th, 2020

I love adventure games (or simply quests as they’re known where I came from) but sometimes the best way to pass some moment there is to cheat.

One of such instances is The Legend of Kyrandia – Book One which is a nice game I play again sometime but there’s the infamous maze there that it not fun. And in the old times I could work around it by hex-editing a savegame to give me the ever-glowing fireberry, stones and some other quest items so I did not have to explore the full maze.

And I thought this was a thing of the past until I tried to play Galador also known as The Prince and the Coward as it’s been supported by ScummVM (or should it be called CabalVM now?) since couple of years ago and I haven’t played it yet. Mostly it’s fine but there are three moments which I could not pass at all: in two instances you must quickly pick up an object and in the third one you need to throw a stone at a certain place three times while the cursor jitters (and repeat that three times).

My reflexes never were that great to begin with but playing the game with a touchpad instead of mouse made it impossible: when a menu with actions appears after the game reacts on your click it’s already too late to select an action and click it. So one solution would be to connect mouse and try until you pass. But I got lazy during those years (back in the day I could do all those timed sequences in Space Quest II and IV while SQ3 required an utility to slow down computer for the escape from pirates sequence). So I did something different: hacked the source code to show what action happens when I actually select that “pick up” object action and added a handler so when I press 'p' key it’d do the action without bothering with menus (or crash if you don’t move cursor to the proper object). And similarly for the stone-throwing scene I removed jitter and mapped 'p' to pick up a new stone.

It’s not something I’m proud of but it should be a good demonstration how you can work around certain game limitations if you have an access to its engine source code—even if it’s not something trivial like maximum ammo/thousands of resources/infinite health.

NihAV: audio player done

Wednesday, October 7th, 2020

As I wrote in my previous post, I had functioning audio player nearing completion. And now I’ve finally added all features I wanted to add and can call it done.

While previously I mostly ranted on the bloat introduced by the components authors, here I’d like to describe the design and the reasoning behind it.
(more…)

NihAV: towards an audio player

Sunday, October 4th, 2020

So after weeks of doing nothing and looking at lossless audio codecs (in no particular order) I’m going back to developing NihAV and more particularly an audio player.
(more…)