Archive for the ‘TrueMotion’ Category

TwilightMotion Saga 2X

Saturday, April 9th, 2016

Okay, now it should be the last post about TM2X.

It’s hard to believe but looks like there were at least five versions of this codec that can be distinguished by the chunk ID where frame information is stored (I have decoder for versions 1-5 and all known samples are version 4). So in version 5 they’ve added coding of motion vectors for 8×8 blocks in various forms including quadtree (and that’s what confused me). Looks like there are tile dimensions stored in configuration chunk (0xA0000109) and codec operates on those.

Again, looks like decoder first calls a function to determine what to do with a row of blocks and then corresponding functions decoding (sub)block data. And I was confused by those too—some of the functions read luma and chroma, some functions read only chroma and some read luma, chroma and two other unidentified values of different types (so it’s not a motion vector). They always have 2 luma samples (if present) and 1/2/4/8 chroma samples. Or is it the other way round with two chroma samples and 1-8 luma samples?

What the Duck, On2, couldn’t you opensource TR20 and TM2X/TM2A along with TM1, TM2 and TM VP3 (and they were all in the same package, mind you)?

In any case I’ll try to forget it again, there’s still VP4 (aka AOM codec -5).

TM2X Woes

Sunday, April 3rd, 2016

I don’t know what I should write about this codec.

TM2X (or TM2A, they are really identical) differs in design from TM2 Vanilla. The main principle seems to stay the same for TM2, TM2X and TM2RT — they all operate on delta coding from the previous delta and top neighbour. But while for TM2 it’s always 4×4 blocks, for TM2RT it’s the whole plane, for TM2X it seems to be variable block size (i.e. it can be 8×8 block or even larger). TM2 uses classical Huffman coded data (with tree description and such) one per each block type, TM2RT uses fixed size deltas (2-, 3- or 4-bit), TM2X uses inverse Huffman lists (i.e. each byte codes a list of values which you’re supposed to read sequentially). And for TM2 there was source code (horrible C but source code nevertheless), TM2RT had compact and rather sane binary specification, TM2X has only an insane binary specification. How insane? For starters, it uses obfuscation for some chunks that’s tedious to undo by hand (unlike TM2RT), it has internal design relying on calling on array of virtual functions and those seem to treat esp as “Eh, Structure Pointer” which will confuse any decompiler.

Thanks to that I was unable to reconstruct all the decoding logic but at least some facts seem to be more or less clear:

  • decoding seems to vary greatly depending on decoder configuration provided in corresponding chunks (since those values are used to build function pointer arrays);
  • there’s lots and lots of block decoding functions that read different amount of deltas per 8 or 16 pixels, e.g. there can be 3 or 5 deltas per 8 pixels;
  • all decoding functions use the same inverse Huffman list but there are different ways to remap its output: there are delta value mapping tables for luma and chroma, generic value decoding uses special escape value to signal that its decoding is not done yet etc;
  • motion compensation is indeed uses halfpel precision.

So I’ll probably just forget about this codec and move to VP4 and then forget about all these turkeyduck codecs. I fear that ClearVideo will be abandoned on the similar level too. Well, at least there’s a lot of speech codecs to talk about.

TrueMotion 2 RealTime

Wednesday, March 30th, 2016

I’ve been reminded that this variant of TrueMotion exists too. What do you know, it’s actually somewhat like TrueMotion 2 NoModifiers.

Essentially it’s just another fixed packing scheme like Creative YUV, Cirrus Logic CLJR or Aura. You have left prediction, deltas coded with nibbles, the usual stuff (at least blocks in TM2 were coded similarly). The only peculiar thing is that it codes data by planes with chroma planes being coded first.

I hope to add detailed description of this codec to Multimedia Wiki by the end of this week and then forget about it again.

TM2X: some more technical details

Sunday, March 20th, 2016

So, while I still have no idea how this codecs functions I can describe some technical details from it.

First, codec data consists of chunks with tag like 0xA00001xx and chunk size in the beginning. Some chunks are unique, some may repeat, some are alternative to each other (e.g. there are four different chunk IDs for Huffman tree description, two of them differ only in header before tree data).

Second, some smaller chunks (like 0x09 with 3-byte payload containing some decoding parameters) are obfuscated by XORing with the key derived from main chunk data. Annoying and not adding much protection really.

Third, unlike plain TrueMotion 2, TrueMotion 2X11R6 has 8×8 blocks (and not 4×4), only 3 block types (instead of 7) and single Huffman tree descriptor (instead of one per non-null block types plus one for block types itself). And it’s in a rather curious format too.

Typical TM2X (or TM2A) frame usually (i.e. for both known samples) consists of 0x06 chunk with compressed block data, some small chunks like 0x15, 0x09, two 0x02 chunks, about a dozen of 0x0B chunks and 0x0A chunk with Huffman code description.

Motion vector coding is represented in several variations: simple signed 8-bit values, MV vector of fixed bit size with bias (both are coded before MV data), some recursive MV coding for large frame areas and even the coding using Huffman coding.

And finally some notes about Huffman coding itself. I’ve not understood it properly yet but here are some notes:

  • Huffman code descriptor is actually a 2D table of 8×256 size (it’s stored in compact way in the corresponding chunk), i.e. every byte has a list of up to 8 elements corresponding to it;
  • decoding is performed by moving on the 8-element list unless an escape value is seen, then a byte is read from the input and new 8-element list is selected, and after decoding the current position is saved for later (e.g. first you read byte like 0x2A and it corresponds to a list 0, 1, 2, 0x83 — that means on subsequent decoding calls you should get 0, 1, 2, 3 and move to reading a new byte from input). Disclaimer: at least that’s how I understood it, it seems to be a reverse coding to me, i.e. assigning a variable amount of tokens to single byte of input instead of conventional assigning a variable amount of bits to the single token;
  • in some cases an additional value may be read using both the descriptor and some additional table (it’s added to the result in those cases).

TM2X: some details

Saturday, March 19th, 2016

Funny how I started this blog more than 10 years ago mostly to talk about TrueMotion 2 and now it’s TrueMotion 2X time.

First of all, an existing binary specification (feel free to ask Baidu for some other materials for this codec, I’m pretty sure you’ll receive a lot) is weird and half of it is not well decompilable. It looks like the compiler did something inverse to inlining and split out some parts of code into separate functions without usual prologue and simply accesses variables somewhere deep on stack:

(more…)

Some notes on VP4

Sunday, March 1st, 2015

Well, this information should’ve been posted by someone else but those people seem to be lazier than me. In return I’m not going to use XViD or FLIC for encoding my content.

So, REing VP4 is rather easy – you just download original VP3.2 decoder source (still available at Xiph SVN servers) and compare it to the structure in vp4vfw.dll. There are differences in structures and a bit in code layout but mostly it’s the same code with new additions.

So, VP4 is based on VP3 (surprise!) and introduces a new bitstream version (which is 3 for some reason). Here’s an incomplete list of differences I’ve spotted:

  • Base frame header has some additional fields (I didn’t care enough to decipher their meaning though);
  • Superblock coding uses a bit different scheme with new universal codes resembling exp-Golomb but with VP4 quirk;
  • Frame data decoding differs for frame types;
  • Motion vector component extraction uses Huffman tables and sign from the previous block.

And yet it uses the same coding principles and even token coding seems to be left untouched. It was suspected for a long time that even-numbered On2 codecs were simply an improvements over previous version while odd-numbered On2 codecs were more innovative but not much was known about VP4 to prove it:

  1. Duck TrueMotion 1 — a new codec;
  2. Duck TrueMotion 2 — mostly like TrueMotion 1 but with Huffman encoding;
  3. Duck/On2 TrueMotion VP3 — DCT + static Huffman coding;
  4. On2 TrueMotion VP4 — VP3 with some bitstream coding changes;
  5. On2 TrueCast VP5 — DCT + arithmetic coder;
  6. On2 VP6 — VP5 with some bitstream changes;
  7. On2 VP7 — H.264 ripoff with their own arithmetic coder;
  8. On2 VP8 — VP7 with some small changes;
  9. Baidu VP9 — H.265 ripoff with their own arithmetic coder;
  10. rumoured Baidu VP10 — since there’s no H.266 in the works for now…

It’s all kinda Intel CPUs but without confusing codenames (and Xiph hasn’t produced too many codecs to confuse whether Daalawell came before Theorabridge or after).

P.S. Many thanks to big G for releasing no information on that codec or any other codecs from On2. Oh, and is VP9 “specification” still under NDA?

P.P.S. I should really work on a game codec named after chemical warfare instead.

TM2 and Final Fantasy

Friday, October 14th, 2005

There are a lot of variants of first two TrueMotion codecs. Final Fantasy (some number here) for PC uses TM2.

Differences from reference decoder: delta table size must be greater (64 instead of 32) and sometimes there is no data in stream, which means you should fill it with value from Huffman table).

Well, there is also TM2X awaiting.

TM2: Almost Complete

Tuesday, October 4th, 2005

My TrueMotion 2 decoder is almost complete. It just has some problems with chroma decoding in intraframes. I suspect this is because of some rounding errors and such. Original TM2 decoder used to store chroma in pairs – one 4 double word for 16-bit Cr and Cb and all calculations were done on those pairs.

For now, I’ll leave this decoder for some time and will post pieces of information about other codecs.

TM2 Format Part 2: Blocks

Friday, September 9th, 2005

TM2 frame is divided into blocks 4×4 and YUV420 colorspace is used, so for each 4×4 blocks of luminance there is two 2×2 blocks of chroma.
There are 7 types of blocks:

  • Hi-res block (hi-res luma and chroma)
  • Med-res block (hi-res luma and low-res chroma)
  • Low res block (low-res chroma and luma)
  • Null-res block (nothing changes from previous block)
  • Still block (just copied from previous frame)
  • Update block (copied from the same place in previous frame and then modified)
  • Motion block

Every intraframe block (first four types in the list) uses such logic:

LAST 4 ELEMENTS
| | | |
V V V V
D0 -> +d00 +d10 +d20 +d30 -> D0
D1 -> +d01 +d11 +d21 +d31 -> D1
D2 -> +d02 +d12 +d22 +d32 -> D2
D3 -> +d03 +d13 +d23 +d33 -> D2
| | | |
V V V V
LAST 4 ELEMENTS

It means that every block use last 4 pixels from top block as reference and deltas D (differences between right pixel in this line and previous line) from left neighbour.
In case of low-res chroma or luma last elements and deltas are modified before processing, they are replaced with their average values.
For interframe blocks there is one additional operation – to calculate new deltas and last values from block data after opying is done.

TM2 Format Part 1: Global Structure

Thursday, September 8th, 2005

TM2 differs from many modern and not modern codecs in aspect it uses different streams for different type of data – look at the scheme below:


[Global Header]
[Stream 0 - High-res chroma values]
[Stream 1 - Low-res chroma values]
[Stream 2 - High-res luminance values]
[Stream 3 - Low-res luminance values]
[Stream 4 - Update deltas values]
[Stream 5 - Motion vectors]
[Stream 6 - Block types]

Global header can be one of two types: old header type – all values are stored in 4-byte or 2-byte integers and new header type – header fields are stored in bitstream (9 bits for width and the same for height, 31 bits for flags, etc.).
One thing worth to mention – a mixture of big- and little-endian 4-byte integers in header values and use big-endian dwords in bitstream output.

Streams are huffman-coded array of values (called ‘tokens’) and consist of header, non-compulsory delta table (in this case tokens are just indices in that table), compressed Huffman tree and compressed tokens. Stream also can be empty (in case of intra-frames – that streams 4 and 5).

Decoder must decompress all these streams into memory and read tokens from one or another stream. Decoding process will be described in Part 2.