Another “reverse engineer codec before breakfast” report

May 8th, 2013

So, as it happens to me sometimes, I’ve looked at some random codec before fixing myself a breakfast.

What do we have here?

  • 8×8 DCT
  • lots of static VLCs (not as much as Real codecs have though)
  • several coding modes — 4:2:2, 4:4:4, both with alpha or not
  • block-level interlacing

Macroblock consists of 4 luma blocks, 4 or 8 chroma blocks and optionally 4 alpha blocks. In the latter case not all blocks might be coded and CBP is present.

There are separate VLC tables for DC depending on bit depth (10, 11 or 12 bits per component) and for quantised ACs (for quantisers 0-7, 8-15, 16-31, 32-63, 64-127 and the rest).

The only interesting scheme is quantiser selection: there is a table with predefined quantiser quadruplets. Every macroblock can select any quadruplet and every block can use any quantiser from it.

Oh, and I think codec’s name was Canopus HQX.

Some notes about AVC

April 13th, 2013

Of course you’ve heard about AVC, the famous audio codec from On2 (and not the codec used as a base for VP7 and VP8). Out of curiosity I tried to look at it and found out that it was as creative as video codecs.

So here’s how it looks:

  • It is 1024-point MDCT based codec.
  • It codes samples either in one window or in eight 128-sample windows. There are four modes because of that — long window, short windows and two transitional modes.
  • Windows are organised into bands, 49 bands for the long window, 12 bands for each short window (band widths are the same for all sampling frequencies though).
  • Frame data consists of: windowing mode, grouping information for short windows, some band flags, run-coded information about used codebooks, scales, coefficients.
  • Scales are stored this way: first scale is explicitly coded 7-bit value, other scales are coded as differences and restored as prev_scale + scale_diff - 60.
  • Codebook 0 means no coefficients codec, codebooks 1-8 code quartets of coefficients, codebooks 9-15 code pairs of coefficients, additionally codebook 15 allows escape values to be coded.
  • Unscaling is performed as *dst++ = val * sqrt(val) * scale.

Do not blame me if that reminds you of some other codec. Bitstream format is different after all. And the second letter of the codec name too.

OMG

March 23rd, 2013

IMG_1466

I think this should be in every first aid kit.

On some screen codecs

March 5th, 2013

Recently my attention was drawn to two screen codecs. So I’ve looked a bit closer at them (I’m not going to implement them — at least now, simian audio codec is waiting) and here are the most interesting bits to me.

Dxtory codec. I don’t know if it uses prediction or not but the coding is extremely simple: read unary code (1-8 bits), if the code is 8 (escape) then read byte for a new value, otherwise retrieve value from LRU list and of course move/put the new value to the beginning of that list.
The organisation into slices and decoding for different colourspaces is trivial.

Mirillis FIC Video. This one has more complexity: it operates on 8×8 blocks. Block is coded in the following way: if the flag is present then the block is skipped, otherwise read 7 bits for the number of coefficients to decode and the coefficients (signed exp-Golomb coded). Zigzag scan, IDCT, output.

Boring.

A new record

March 4th, 2013

I thought that there’s no codec more horrible than Go2Despair and was proved wrong again. Here’s the even worse pair: Dxtory codec and its standalone companion PackBit.

DxtoryCodec.dll is 8.2MB which rivals Go2C++Hell because it’s standalone decoder without any additional code. PackBitCodec.dll is even more elegant. Of the total size of 672256 bytes more than 357000 bytes are occupied by the code of one function (that’s 3-6 times more than the ordinary zlib-based codec complete DLL file). Truly some people should not be allowed to program.

More about Monkey’s Audio filter changes

March 3rd, 2013

In the previous post I gave general overview of codec changes, now I’m going to look more deeply at the filter changes with time.

  • 3950 — current mode with up to three layers of IIR filters
  • 3930 — simpler filters: no third layer (there was no insane compression level back then) and the difference between predicted and actual value was not used.

For the older versions there are differences in the implementations of the filters for the different compression modes.

Fast compression:

  • 3200 — order 2 adaptive prediction (i.e. previously decoded and adjustable prediction value are used in prediction)
  • 0000 — almost the same but with different rules for adjustment factor updating

Normal compression:

  • 3800 — two layers of filters: order 4 adaptive prediction and order 2 afterwards
  • 3200 — the same structure, different rules for updating
  • 0000 — three layers with orders 3, 2 and 1 and different updating rules

High compression:

  • 3700 — first it tries first order adaptive prediction with the delay of 2-16 (i.e. the next to previous element is used for prediction) and normal mode decompression afterwards (different decoding for 3800 of course)
  • 3600 — the same but delays are 2-13
  • 3200 — the same but delays are 2-7
  • 0000 — orders 5 and 4 and different updating rules

Extra high compression:

  • 3830 — an IIR filter resembling the one used in the newer Monkey’s Audio versions
  • 3800 — some filter parameters were half as much as in 3830 and there was no delay 2-8 filtering
  • 3600 — delay filtering plus high filtering (which is delay filtering plus normal filtering, which can be expressed as a layer of filtering over fast filtering)
  • 0000 — essentially the same but with different prefiltering

Monkey’s Audio: noted differenced between versions

February 28th, 2013

While preparing for working on old APE versions support I finally got courage to try and trace all changes for different versions. So here’s the list of internal versions and the changes they introduced:

  • 0000 — the reference version for all prehistoric version. Before version 0000 it was fine, then it all got worse IMO.
  • 3320 — changes in the filters
  • 3600 — changes in the filters
  • 3700 — changes in the filters
  • 3800 — blocks per frame changed for extra high compression level; changes in the filters (yawn)
  • 3810 — frame start at byte boundaries now
  • 3820 — special codes extension (signalled by top bit of CRC set to one)
  • 3830 — filter lengths and some implementation details changed
  • 3840 — CRC calculation algorithm changed a bit
  • 3870 — significant changes in residue coding
  • 3890 — small changes in residue coding
  • 3900 — residue coding format has changed seriously.
  • 3910 — small change in the residue coding (more than 16 bit values can be coded now)
  • 3930 — significantly changed format introduced (both filtering and coding scheme were changed)
  • 3950 — filter format changed a bit (and insane compression mode is added somewhere after that), blocks per frame is changed too
  • 3960 — some small and compatible change in the bitstream (consuming two last bytes or not)
  • 3980 — file format is changed a bit; filtering process has changed a bit too.
  • 3990 — the latest (known) format. Residue coding has changed.

Do you still wonder why I strongly dislike this format?

Preserving extinct formats

February 27th, 2013

By the request of one guy (he has provided samples as well) I shall work on supporting old Monkey’s Audio versions (before 3.95).

Why? Because the latest official version of Monkey’s Audio has dropped support for those files, because I wanted to support such files since really long time (just didn’t have a good opportunity to do that) and because I definitely need a distraction from Go2Insanity codec (I shan’t blog about it anymore).

Well, let’s see what the old versions of the worst (known) designed lossless audio codec have to offer me.

Now (hopefully) the last post about Go2Disassembly details

February 18th, 2013

I’ll update codec details on our wiki and hopefully can forget about it. Let somebody French-speaking complete it.

There are many codecs out there to reverse engineer, including worthier ones. On the second thought that would define almost every codec.

Update: filled all information I had, the rest is up to somebody with the debugger and motivation to do it.

Another piece of digital archeology news

February 14th, 2013

After investigating the smallest available pile of fossilised dung called Go2Cesspit (only 2.5MB instead of 15MB for the current version) the G2M1 can be reconstructed. It had the same tiled structure as its successors but it coded all tiles with ELS only, no combining with JPEG data.

And the full history of Go2UnwantedPlaces evolution (there definitely was no intelligent design for obvious reasons):

  1. Citrix licenses image coding from Accusoft (ELS-based), uses it to release G2M1.
  2. They want to improve compression and try to code some blocks in the different way. JPEG to the rescue! That’s how G2M2 was born (compression method 2).
  3. G2M3 looks like marketing bump since image coding has not been changed.
  4. For some reasons they replace ELS part with simple deflated raw bitmap. That’s G2M4 now (and compression method 3).

Further findings may correct this system of course but so far it looks like this.

P.S. If you want to have G2M1 supported then send samples and support requests to VideoLAN. They will be grateful for sure.