TwilightMotion Saga: Random pre-VP3 Bits

April 16th, 2016

TrueMotion 1 was licensed and has several variants outside the usual TM1. There’s allegedly Horizons PowerEZ, but only j-b would know anything about it (because it’s vintage and was used to code content he’s interested in, of course). The other version was used for intro and victory cutscenes in the 3DO version of Star Control II: Ur-Quan Masters; the source code is available so any Mike Melanson out there can have a look at it. To me it looked like the same coding algorithm but with custom delta tables and codebooks provided. Oh, and the data is split between several files (global header, codebook, frame data and offsets to individual frames).

TrueMotion 2 Realtime seems to be really TrueMotion 1.2 Realtime Edition. It has a header format quite similar to TrueMotion 1 (even the same obfuscation) but with some values that would make a TM1 decoder bail out on error, and it was released before the actual TrueMotion 2.

TrueMotion 2X seems to return to the coding method from TM1 as well, since there’s a suspicious similarity between its inverse Huffman coding method (they call it “string encoder”, which sounds even more confusing) and the codebook used in TM1, except that TM2X uses 0x80 as the end-of-data flag instead of 0x01.

P.S. I should really move to VP4 and then away from this codec family altogether.

A Quick Look at IMM4

April 10th, 2016

So I’ve spent an hour or so looking at IMM4.

What do you know, it’s a very simple IDCT codec with interframes. Intraframes have only DCT with the usual run-level VLC coding; interframes have a skip flag per macroblock, and a macroblock that is not skipped is coded either as a difference to the previous frame or as an intra block. See, no motion vectors, quantisation is a single value per block (except for DC in intra blocks), and there seems to be no zigzagging either. You cannot get much simpler than that.
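The three interframe cases can be sketched in a few lines of Python. This is only an illustration of the reconstruction logic described above, not the actual bitstream handling: the mode names, the flat pixel lists and the clipping helper are all assumptions made for the example.

```python
def clip8(v):
    """Clamp a value to the 0..255 pixel range."""
    return max(0, min(255, v))


def reconstruct_block(mode, prev, decoded=None):
    """Rebuild one block of an interframe.

    mode    -- "skip", "diff" or "intra" (hypothetical names)
    prev    -- co-located pixels from the previous frame (no motion vectors)
    decoded -- inverse DCT output for this block, unused for "skip"
    """
    if mode == "skip":
        return list(prev)  # copy the previous frame as is
    if mode == "diff":
        # add the decoded difference to the previous frame
        return [clip8(p + d) for p, d in zip(prev, decoded)]
    if mode == "intra":
        return [clip8(v) for v in decoded]  # previous frame is ignored
    raise ValueError(f"unknown block mode: {mode}")


prev = [10, 250, 128, 0]
skipped = reconstruct_block("skip", prev)
updated = reconstruct_block("diff", prev, [5, 10, -8, -3])
fresh = reconstruct_block("intra", prev, [300, 20, -1, 64])
```

Since there are no motion vectors, the previous-frame block is always the co-located one, which is why a single `prev` argument is enough here.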

TwilightMotion Saga 2X

April 9th, 2016

Okay, now it should be the last post about TM2X.

It’s hard to believe, but it looks like there were at least five versions of this codec that can be distinguished by the chunk ID where frame information is stored (I have a decoder for versions 1-5 and all known samples are version 4). So in version 5 they’ve added coding of motion vectors for 8×8 blocks in various forms including quadtree (and that’s what confused me). It looks like there are tile dimensions stored in the configuration chunk (0xA0000109) and the codec operates on those.

Again, it looks like the decoder first calls a function to determine what to do with a row of blocks and then the corresponding functions for decoding (sub)block data. And I was confused by those too: some of the functions read luma and chroma, some functions read only chroma and some read luma, chroma and two other unidentified values of different types (so it’s not a motion vector). They always have 2 luma samples (if present) and 1/2/4/8 chroma samples. Or is it the other way round with two chroma samples and 1-8 luma samples?

What the Duck, On2, couldn’t you opensource TR20 and TM2X/TM2A along with TM1, TM2 and TM VP3 (and they were all in the same package, mind you)?

In any case I’ll try to forget it again, there’s still VP4 (aka AOM codec -5).

How the codecs should emerge (hint: without .ebuilds)

April 6th, 2016

So it has come to this, some events and discussions made me write this post.

How do I imagine the perfect process for new codecs? It’s a rather simple model: you have some places where ideas and enthusiasts swarm, and from their work and from selecting the best ideas a new candidate codec is born.

There are such places for all codec types: audio enthusiasts can find testers at Hydrogenaudio, video enthusiasts can talk at Doom9, general and image compression people seem to be present at encode.ru. In first approximation it works as expected—people propose ideas, test new compression programs and report benchmarks, suggest improvements. What can be wrong there? Just one thing: people making software incompatible with anything else (custom containers/archive formats) and trying to push it on everybody. After you invent some format make sure it works in some standard environment (for compressors it’s usually single file compression mode, .tar.xz seems to be more popular than .7z even if they use the same LZMA algorithm; for codecs it should be the standard container—even Matroska would do). And document the format too—properly instead of usual “bug off” level.

There are standardised codecs that undergo similar process: various companies or researchers submit their work, a base for a new standard is chosen, new proposals try to improve it. And then companies start to push their patented shit there and that’s where the system goes wrong (QMF in MPEG Audio Layer III anyone?). It’s not better when some company tries to push its product as a standard without any evaluation (and thus we get wonderful line of SMPTE VC-x codecs for instance).

And there’s OggXiph. This is again a community that designs codecs mostly because they can and pushes them mostly because they’re Free™ and OpenSource™ and they mostly suck otherwise: Ogg format is for streaming not good for anything, most people still don’t know that it’s Ogg/FLAC because it was developed outside (and has horrible raw stream format), Speex has no readable specification and is easier to understand by disassembling the library than by reading the source code, Theora is an outdated enterprise grade code, Opus has its issues (but it’s rather good, one cannot deny that), Daala will probably never happen.

And what do I see in recent news? Alliance for Open Media plans to release first draft of their codec soon and it is:

  • hosted on baidusource.com;
  • for now just libvpx with some names changes;
  • everything else about it screams Baidu too.

If it looks like Duck, produces codecs like Duck and has the same source code as Duck, then it probably is DuckOn2Baidu.

At least in the old times there was some competition of ideas in codecs so one could choose between different codecs giving good results—and in some cases they were available for various ecosystems too (e.g. Indeo was present in AVI and MOV, ClearVideo managed to get into AVI, MOV and RM). Now it’s just foam of lossless codecs that even their authors forget about next year and one or two companies pushing their stuff on everybody. And that makes me sad.

TM2X Woes

April 3rd, 2016

I don’t know what I should write about this codec.

TM2X (or TM2A, they are really identical) differs in design from TM2 Vanilla. The main principle seems to stay the same for TM2, TM2X and TM2RT — they all operate on delta coding from the previous delta and top neighbour. But while for TM2 it’s always 4×4 blocks, for TM2RT it’s the whole plane, for TM2X it seems to be variable block size (i.e. it can be 8×8 block or even larger). TM2 uses classical Huffman coded data (with tree description and such) one per each block type, TM2RT uses fixed size deltas (2-, 3- or 4-bit), TM2X uses inverse Huffman lists (i.e. each byte codes a list of values which you’re supposed to read sequentially). And for TM2 there was source code (horrible C but source code nevertheless), TM2RT had compact and rather sane binary specification, TM2X has only an insane binary specification. How insane? For starters, it uses obfuscation for some chunks that’s tedious to undo by hand (unlike TM2RT), it has internal design relying on calling on array of virtual functions and those seem to treat esp as “Eh, Structure Pointer” which will confuse any decompiler.

Thanks to that I was unable to reconstruct all the decoding logic but at least some facts seem to be more or less clear:

  • decoding seems to vary greatly depending on decoder configuration provided in corresponding chunks (since those values are used to build function pointer arrays);
  • there are lots and lots of block decoding functions that read different numbers of deltas per 8 or 16 pixels, e.g. there can be 3 or 5 deltas per 8 pixels;
  • all decoding functions use the same inverse Huffman list but there are different ways to remap its output: there are delta value mapping tables for luma and chroma, generic value decoding uses special escape value to signal that its decoding is not done yet etc;
  • motion compensation indeed uses halfpel precision.

So I’ll probably just forget about this codec and move to VP4 and then forget about all these turkeyduck codecs. I fear that ClearVideo will be abandoned at a similar level too. Well, at least there are a lot of speech codecs to talk about.

TrueMotion 2 RealTime

March 30th, 2016

I’ve been reminded that this variant of TrueMotion exists too. What do you know, it’s actually somewhat like TrueMotion 2 NoModifiers.

Essentially it’s just another fixed packing scheme like Creative YUV, Cirrus Logic CLJR or Aura. You have left prediction, deltas coded with nibbles, the usual stuff (at least blocks in TM2 were coded similarly). The only peculiar thing is that it codes data by planes with chroma planes being coded first.
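To show what such a fixed packing scheme amounts to, here is a minimal Python sketch of left prediction with nibble-coded deltas. Everything specific in it is an assumption made for illustration: the 16-entry delta table is made up, and so are the nibble order (high nibble first), the starting predictor of 0x80 and the wrapping to 8 bits.

```python
# Hypothetical 16-entry delta table; the codec's real tables are not reproduced here.
DELTAS = [0, 1, -1, 2, -2, 4, -4, 8, -8, 12, -12, 16, -16, 24, -24, 32]


def decode_row(packed, width, start=0x80):
    """Decode `width` samples from nibble-packed delta indices using left prediction."""
    row = []
    pred = start  # assumed initial predictor for the first sample in a row
    for i in range(width):
        byte = packed[i >> 1]
        # assumed nibble order: high nibble first, then low nibble
        idx = (byte >> 4) if (i & 1) == 0 else (byte & 0x0F)
        pred = (pred + DELTAS[idx]) & 0xFF  # keep samples in the 8-bit range
        row.append(pred)
    return row


row = decode_row(bytes([0x13, 0x20]), 4)
```

Each decoded sample becomes the predictor for the next one, so a whole plane can be decoded row by row with no other state.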

I hope to add detailed description of this codec to Multimedia Wiki by the end of this week and then forget about it again.

OptimFROG

March 26th, 2016

You know, the greatest reverse engineer I know is Derek B. He’s managed to RE such codecs as Canopus HQX and Cineform HD in the most efficient manner ever—saying he’ll do it and patiently waiting until somebody else does it.

So here are some words about his favourite lossless audio codec. The most interesting thing about it is that it was actively developed in 2001-2006 and then it was suddenly resurrected in 2015. Also it’s one of few non-standard codecs (i.e. not made into standard) that has several articles written about it.

The codec actually consists of two different formats, seemingly an old one and a newer one (that looks like it supports the whole range of sample types). The former is notable for having a signal reconstruction stage using floating point math (a thing you don’t see in codecs every day), the latter seems to employ various parameter reading and reconstruction methods. Coding is done using a low precision range coder (large values are decoded using chunks of 8 or 12 bits). So nothing really interesting there.

P.S. I’m definitely not going to write a decoder for it. There are too many lossless audio codecs already, let all proprietary ones (in custom containers too) die in peace.

TM2X: some more technical details

March 20th, 2016

So, while I still have no idea how this codec functions I can describe some technical details from it.

First, codec data consists of chunks with tag like 0xA00001xx and chunk size in the beginning. Some chunks are unique, some may repeat, some are alternative to each other (e.g. there are four different chunk IDs for Huffman tree description, two of them differ only in header before tree data).
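Walking such a chunk sequence could look roughly like the Python sketch below. The exact header layout is not given here, so the field widths (32-bit tag followed by 32-bit size), the little-endian byte order and the size counting only the payload are all assumptions, and `iter_chunks` is a hypothetical helper name.

```python
import struct


def iter_chunks(buf):
    """Yield (chunk_id, payload) pairs from a buffer of back-to-back chunks."""
    pos = 0
    while pos + 8 <= len(buf):
        # assumed header: 32-bit tag, then 32-bit payload size, little-endian
        tag, size = struct.unpack_from("<II", buf, pos)
        if (tag & 0xFFFFFF00) != 0xA0000100:
            raise ValueError(f"unexpected chunk tag {tag:#010x} at offset {pos}")
        pos += 8
        if pos + size > len(buf):
            raise ValueError("chunk payload runs past the end of the buffer")
        yield tag & 0xFF, buf[pos:pos + size]  # low byte is the short chunk ID
        pos += size


frame = (struct.pack("<II", 0xA0000109, 3) + b"\x01\x02\x03"
         + struct.pack("<II", 0xA000010A, 0))
chunks = list(iter_chunks(frame))
```

Returning only the low byte of the tag matches the short chunk names used in this post (0x09, 0x0A and so on).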

Second, some smaller chunks (like 0x09 with 3-byte payload containing some decoding parameters) are obfuscated by XORing with the key derived from main chunk data. Annoying and not adding much protection really.

Third, unlike plain TrueMotion 2, TrueMotion 2X11R6 has 8×8 blocks (and not 4×4), only 3 block types (instead of 7) and single Huffman tree descriptor (instead of one per non-null block types plus one for block types itself). And it’s in a rather curious format too.

Typical TM2X (or TM2A) frame usually (i.e. for both known samples) consists of 0x06 chunk with compressed block data, some small chunks like 0x15, 0x09, two 0x02 chunks, about a dozen of 0x0B chunks and 0x0A chunk with Huffman code description.

Motion vector coding is represented in several variations: simple signed 8-bit values, MVs of fixed bit size with a bias (both are coded before the MV data), some recursive MV coding for large frame areas and even coding with Huffman codes.

And finally some notes about Huffman coding itself. I’ve not understood it properly yet but here are some notes:

  • Huffman code descriptor is actually a 2D table of 8×256 size (it’s stored in compact way in the corresponding chunk), i.e. every byte has a list of up to 8 elements corresponding to it;
  • decoding is performed by moving along the 8-element list until an escape value is seen, then a byte is read from the input and a new 8-element list is selected, and after decoding the current position is saved for later (e.g. first you read a byte like 0x2A and it corresponds to the list 0, 1, 2, 0x83; that means on subsequent decoding calls you should get 0, 1, 2, 3 and then move to reading a new byte from the input). Disclaimer: at least that’s how I understood it, it seems to be a reverse coding to me, i.e. assigning a variable amount of tokens to a single byte of input instead of the conventional assigning of a variable amount of bits to a single token;
  • in some cases an additional value may be read using both the descriptor and some additional table (it’s added to the result in those cases).
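The list walking from the notes above can be modelled with a small Python class. It is a toy model under stated assumptions, not the real decoder: the class name and the table contents are hypothetical, the table is held as a plain dictionary instead of the compact stored form, and the top bit (0x80) is assumed to mark the last element of a list, as the 0x2A example suggests. The additional table from the last note is left out.

```python
END_FLAG = 0x80  # assumed: top bit marks the last element of a list


class ListDecoder:
    """Toy model of inverse Huffman decoding: one input byte gives several tokens."""

    def __init__(self, table, data):
        self.table = table      # maps a byte value to its list of up to 8 elements
        self.data = iter(data)  # input byte stream
        self.cur = []           # list selected by the last byte read
        self.pos = 0            # position inside that list, kept between calls

    def next_token(self):
        if self.pos >= len(self.cur):
            # previous list is exhausted: read a byte and select a new list
            self.cur = self.table[next(self.data)]
            self.pos = 0
        elem = self.cur[self.pos]
        if elem & END_FLAG:
            # flagged element is the last one, force a new byte on the next call
            self.pos = len(self.cur)
        else:
            self.pos += 1
        return elem & 0x7F


# hypothetical table holding only the entry from the example in the text
table = {0x2A: [0, 1, 2, 0x83]}
dec = ListDecoder(table, bytes([0x2A, 0x2A]))
tokens = [dec.next_token() for _ in range(8)]
```

Keeping the list position inside the decoder object is what lets different block decoding functions pull tokens one at a time from the same stream.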

TM2X: some details

March 19th, 2016

Funny how I started this blog more than 10 years ago mostly to talk about TrueMotion 2 and now it’s TrueMotion 2X time.

First of all, the existing binary specification (feel free to ask Baidu for some other materials for this codec, I’m pretty sure you’ll receive a lot) is weird and half of it does not decompile well. It looks like the compiler did something inverse to inlining and split out some parts of the code into separate functions that have no usual prologue and simply access variables somewhere deep on the stack.

The End?

March 19th, 2016

It’s time to say to myself: you are no longer relevant. All remotely useful codecs have been reverse engineered already and most of them are obsolete. Everybody uses either H.26[45]+AAC or VP{3,8,9}+Opus and nothing else is required (even by VLC). And I grew tired too if it wasn’t obvious from my previous posts (I don’t even blame lu_zero anymore).

And thus my plans are to document Duck TrueMotion 2X and VP4, ClearVideo and hopefully VX.

In NihAV (yeah, like it’ll ever happen) I decided to implement just fringe codecs like those and no new modern codecs maybe except Bink2 and WMV3 (I know what should be done to support beta P-frames there but the libavcodec decoder is so unwieldy that my brain switches off trying to analyse how to do that change there; before you say anything it’s a part of the code written before I took over it and failed to make it into full-featured decoder during GSoC 2006).