Archive for March, 2016

TrueMotion 2 RealTime

Wednesday, March 30th, 2016

I’ve been reminded that this variant of TrueMotion exists too. What do you know, it’s actually somewhat like TrueMotion 2 NoModifiers.

Essentially it’s just another fixed packing scheme like Creative YUV, Cirrus Logic CLJR or Aura. You have left prediction, deltas coded with nibbles, the usual stuff (at least blocks in TM2 were coded similarly). The only peculiar thing is that it codes data by planes with chroma planes being coded first.
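To make "left prediction, deltas coded with nibbles" a bit more concrete, here is a minimal sketch of how one row of such a plane could be unpacked. It is not the actual TrueMotion 2 RealTime decoder: the `DELTAS` table, the high-nibble-first order and the initial predictor of 0 are all assumptions made up for the illustration.

```python
# Hypothetical 16-entry delta table; the real codec's table is not given here.
DELTAS = [0, 1, -1, 2, -2, 4, -4, 8, -8, 16, -16, 32, -32, 64, -64, 128]


def decode_row(packed: bytes, width: int, deltas=DELTAS) -> list[int]:
    """Unpack one row: each nibble selects a delta added to the left neighbour."""
    out = []
    pred = 0  # assumed starting predictor for the first pixel in the row
    for byte in packed:
        # Assumed order: high nibble first, then low nibble.
        for nib in (byte >> 4, byte & 0x0F):
            if len(out) == width:
                break
            pred = (pred + deltas[nib]) & 0xFF  # wrap to 8 bits
            out.append(pred)
    return out
```

Decoding by planes with chroma first would then simply mean calling something like this for every row of the U and V planes before the Y plane.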

I hope to add a detailed description of this codec to Multimedia Wiki by the end of this week and then forget about it again.

OptimFROG

Saturday, March 26th, 2016

You know, the greatest reverse engineer I know is Derek B. He’s managed to RE such codecs as Canopus HQX and Cineform HD in the most efficient manner ever—saying he’ll do it and patiently waiting until somebody else does it.

So here are some words about his favourite lossless audio codec. The most interesting thing about it is that it was actively developed in 2001-2006 and then it was suddenly resurrected in 2015. Also it’s one of the few non-standard codecs (i.e. not made into a standard) that has several articles written about it.

The codec actually consists of two different formats, seemingly an old one and a newer one (that looks like it supports the whole range of sample types). The former is notable for having a signal reconstruction stage that uses floating point math (a thing you don’t see in codecs every day), the latter seems to employ various parameter reading and reconstruction methods. Coding is done using a low-precision range coder (large values are decoded in chunks of 8 or 12 bits). So nothing really interesting there.

P.S. I’m definitely not going to write a decoder for it. There are too many lossless audio codecs already, let all proprietary ones (in custom containers too) die in peace.

TM2X: some more technical details

Sunday, March 20th, 2016

So, while I still have no idea how this codec functions I can describe some technical details about it.

First, codec data consists of chunks with a tag like 0xA00001xx and the chunk size at the beginning. Some chunks are unique, some may repeat, some are alternatives to each other (e.g. there are four different chunk IDs for the Huffman tree description, two of them differ only in the header before the tree data).
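A chunk walker for such a layout can be sketched in a few lines. The post only says that a chunk starts with a tag like 0xA00001xx and a size, so the field widths (two 32-bit values), the little-endian byte order and the size counting only the payload are assumptions here, and `iter_chunks` is a hypothetical name.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from a buffer of tagged chunks."""
    pos = 0
    while pos + 8 <= len(data):
        # Assumed header: 32-bit tag followed by 32-bit payload size, little-endian.
        tag, size = struct.unpack_from("<II", data, pos)
        pos += 8
        if tag & 0xFFFFFF00 != 0xA0000100:
            raise ValueError(f"unexpected tag 0x{tag:08X}")
        if pos + size > len(data):
            raise ValueError("chunk runs past the end of the data")
        # The low byte of the tag is the chunk ID (0x06, 0x09, 0x0A, ...).
        yield tag & 0xFF, data[pos:pos + size]
        pos += size
```

Repeating and alternative chunks are left to the caller, which can collect the pairs into lists keyed by chunk ID.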

Second, some smaller chunks (like 0x09 with a 3-byte payload containing some decoding parameters) are obfuscated by XORing with a key derived from the main chunk data. Annoying and not adding much protection really.

Third, unlike plain TrueMotion 2, TrueMotion 2X11R6 has 8×8 blocks (and not 4×4), only 3 block types (instead of 7) and a single Huffman tree descriptor (instead of one per non-null block type plus one for the block types themselves). And it’s in a rather curious format too.

A typical TM2X (or TM2A) frame usually (i.e. for both known samples) consists of a 0x06 chunk with compressed block data, some small chunks like 0x15 and 0x09, two 0x02 chunks, about a dozen 0x0B chunks and a 0x0A chunk with the Huffman code description.

Motion vector coding comes in several variations: simple signed 8-bit values, MVs of fixed bit size with a bias (both are coded before the MV data), some recursive MV coding for large frame areas and even coding with Huffman codes.

And finally some notes about Huffman coding itself. I’ve not understood it properly yet but here are some notes:

  • Huffman code descriptor is actually a 2D table of 8×256 size (it’s stored in a compact way in the corresponding chunk), i.e. every byte has a list of up to 8 elements corresponding to it;
  • decoding is performed by moving along the 8-element list until an escape value is seen, then a byte is read from the input and a new 8-element list is selected, and after decoding the current position is saved for later (e.g. first you read a byte like 0x2A and it corresponds to the list 0, 1, 2, 0x83; that means on subsequent decoding calls you should get 0, 1, 2, 3 and then move to reading a new byte from the input). Disclaimer: at least that’s how I understood it, it seems to be a reverse coding to me, i.e. assigning a variable amount of tokens to a single byte of input instead of the conventional assigning of a variable amount of bits to a single token;
  • in some cases an additional value may be read using both the descriptor and some additional table (it’s added to the result in those cases).
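The list-walking scheme from the second note can be sketched like this. It follows only the 0x2A example above, so treat the details as guesses: the high bit (0x80) is assumed to mark the last token of a list, the table is passed in already unpacked, and `ListDecoder` is a hypothetical name. The additional-value case from the last note is not modelled.

```python
class ListDecoder:
    """Returns one token per call, reading a new input byte only when needed."""

    def __init__(self, table, data: bytes):
        self.table = table      # maps an input byte to its list of up to 8 elements
        self.data = iter(data)  # compressed input bytes
        self.cur = None         # list currently being walked
        self.pos = 0            # saved position inside that list

    def next_token(self) -> int:
        if self.cur is None:
            # No list in progress: read a byte and select its list.
            self.cur = self.table[next(self.data)]
            self.pos = 0
        elem = self.cur[self.pos]
        self.pos += 1
        if elem & 0x80:
            # Assumed escape flag: this was the last token of the list.
            self.cur = None
        return elem & 0x7F
```

With a table where 0x2A maps to the list 0, 1, 2, 0x83, four consecutive calls return 0, 1, 2, 3 and the fifth call reads the next input byte.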

TM2X: some details

Saturday, March 19th, 2016

Funny how I started this blog more than 10 years ago mostly to talk about TrueMotion 2 and now it’s TrueMotion 2X time.

First of all, the existing binary specification (feel free to ask Baidu for some other materials for this codec, I’m pretty sure you’ll receive a lot) is weird and half of it does not decompile well. It looks like the compiler did something inverse to inlining and split out some parts of the code into separate functions that lack the usual prologue and simply access variables somewhere deep on the stack:

(more…)

The End?

Saturday, March 19th, 2016

It’s time to say to myself: you are no longer relevant. All remotely useful codecs have been reverse engineered already and most of them are obsolete. Everybody uses either H.26[45]+AAC or VP{3,8,9}+Opus and nothing else is required (even by VLC). And I grew tired too if it wasn’t obvious from my previous posts (I don’t even blame lu_zero anymore).

And thus my plans are to document Duck TrueMotion 2X and VP4, ClearVideo and hopefully VX.

In NihAV (yeah, like it’ll ever happen) I decided to implement just fringe codecs like those and no new modern codecs maybe except Bink2 and WMV3 (I know what should be done to support beta P-frames there but the libavcodec decoder is so unwieldy that my brain switches off trying to analyse how to do that change there; before you say anything it’s a part of the code written before I took over it and failed to make it into full-featured decoder during GSoC 2006).

About one FOSDEM talk…

Tuesday, March 1st, 2016

So, at this FOSDEM a certain Vittorio gave a talk about reverse engineering codecs, all materials are here. Here are my thoughts about it.

First of all, I gave a talk on a similar topic once at VDD 12, it was my first and last public talk. Of course it was a fail and not a single question was asked (and mind you, VDD attendees usually know multimedia much deeper than an ordinary FOSDEM visitor). So I think it takes a lot of courage to give such a talk, so Vittorio did a good job here.

Now to the remarks about the presentation itself.

He calls himself a pupil of Kostya (slide 2) — the name is not rare so I don’t know which Kostya he meant. Definitely not me as he’s yet to show any signs he has learned something from me.

Slide 6 mentions examples of rip-off codecs and gets it wrong too. Real had some licensed codecs, RealVideo 2 is a licensed Intel rip-off of H.263 (and surprisingly Real had even one original codec, lossless audio one). Oh, and the VP family starts with VP3, before that it was Duck TrueMotion 1/2.

Categories mentioned on the next slide are rather random subsets of all video codecs out there. I’ve REd several codecs that do not belong to either category (one of them was used in Hedgewars clones called Worms BTW).

The TDSC part of the talk is remarkable for two things: the 5-line tool (slide 16) is something I write by heart when I’m too lazy to find the previous version of it, and it actually took less time to RE the codec than to talk about how it was done (the main issue there was to call the JPEG decoder inside the TDSC decoder).

As for Canopus HQX, I’ve written about it years ago. Besides the profile-specific tables there’s actual decoding to be done (you know, VLCs, DCT, that kind of stuff). But large tables in a binary specification fend off reverse engineers quite often.

So, it was a good introductory talk but I haven’t missed anything by not attending it.