A few words about G2M4 (that were not censored)

November 4th, 2012

Okay, I looked into G2M4 closer, here’s the output:

As with the previous beast, there are two types of images combined in a single tile: the so-called synthetic layer and the natural layer. What you see is the first layer decoded.

Here’s the general tile structure:

  • Compression subtype (top bits from the first byte).
  • Transparency colour (three bytes)
  • Number of palette entries minus one (one byte)
  • Palette entries (byte triplets)
  • Synthetic layer (16-bit BE chunk size plus deflated data, may be absent)
  • Natural layer (probably headerless JPEG data, too lazy to verify)

The synthetic layer image (after decompression) contains a packed bitmap that uses the palette from above; each row is coded as 8-bit flag [packed row data]. If the flag is zero then row data is present (that’s my guess, it always seems to be zero). Row data is just palette indices stored as 1, 2, 4 or 8 bits per index depending on palette size. Sample output you can see above.
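
A minimal sketch of reading the synthetic layer chunk and unpacking one row of palette indices. Several things here are assumptions not confirmed above: that the deflated data is a complete zlib stream, that the index width is the smallest of 1/2/4/8 bits that fits the palette, and that indices are packed most significant bits first. All function names are made up for illustration.

```python
import struct
import zlib


def read_synthetic_chunk(buf, pos):
    """Read one chunk: 16-bit big-endian size, then that many deflated bytes."""
    (size,) = struct.unpack_from(">H", buf, pos)
    start = pos + 2
    packed = buf[start:start + size]
    if len(packed) < size:
        raise ValueError("truncated synthetic layer chunk")
    # Assumption: the payload is a complete zlib stream (header included).
    return zlib.decompress(packed), start + size


def bits_per_index(palette_size):
    """Smallest of 1/2/4/8 bits that can address every entry (assumed rule)."""
    for bits in (1, 2, 4, 8):
        if palette_size <= (1 << bits):
            return bits
    raise ValueError("palette larger than 256 entries")


def unpack_row(row, width, bits):
    """Unpack `width` palette indices, assuming MSB-first packing."""
    mask = (1 << bits) - 1
    per_byte = 8 // bits
    out = []
    for x in range(width):
        byte = row[x // per_byte]
        shift = 8 - bits * (x % per_byte + 1)
        out.append((byte >> shift) & mask)
    return out
```

If the real bit order turns out to be LSB-first, only the `shift` computation needs to change.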

Feel free to complete RE.

FnAQ about G2M2/G2M3

November 3rd, 2012

Just to clarify status: I’m not working on this anymore so anyone can pick it up and finish.

And here are some possible questions that might be asked but more probably won’t.

Q: who cares about this codec anyway?

A: Not me. VLC does.

Q: so why don’t you do it?

A: there are several reasons. First, now I have an idea of how it works and it’s not that interesting anymore. Second, it would require some debugging and I cannot run that decoder under MPlayer2 (and I don’t use Windows at all).

Q: but wait, there’s G2M4!

A: right, and it uses completely different coding. I might look at it but no promises either.

Q: so, how does it work?

A: the idea is simple. Every frame is divided into tiles and some tiles can be updated from the previous frame or not; some additional information (i.e. mouse cursor shape and position) is also stored in the frame.
G2M2/G2M3 use the technology licensed from Accusoft that combines JPEG and ELS-coded image.

Q: how do they do it?

A: the approach (I call it JPEG-Binary Koder or J-BK for short) is quite simple. Every tile has ELS-coded picture with possible transparency. Transparent areas should be replaced with headerless JPEG data (i.e. only scan data without any markers but with escapes).
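
The combining step can be sketched as below, assuming both layers have already been decoded to per-pixel values of the same kind and that transparency is marked by a single reserved value. The function name and the flat-list pixel representation are hypothetical.

```python
def composite_tile(els_pixels, jpeg_pixels, transparent):
    """Replace transparent ELS-layer pixels with the JPEG-layer pixels."""
    if len(els_pixels) != len(jpeg_pixels):
        raise ValueError("layers must cover the same number of pixels")
    return [
        jpeg if els == transparent else els
        for els, jpeg in zip(els_pixels, jpeg_pixels)
    ]
```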

Q: sounds easy, where’s the catch?

A: I’m too lazy to catch bugs in my quick JPEG decoder reimplementation and ELS-coded image requires some debugging which I can’t do.

Q: okay, I want to do it, where shall I start?

A: is it the first of April? No? Hmm… Okay, here’s what I would do: grab a copy of g2m.dll (there are enough of them around, in various sizes too), disassemble it.
Find the ELS thresholds table (referenced values are 0x10000, 0x12A00, 0x15C00, 0x19600, 0x1DA00, 0x22800, 0x28500, ...) — the function referencing it is the one used to update ELS coder state, go up from there. Feel free to look at the wiki entry about G2M. Bonne chance!
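
One way to locate that table without eyeballing the disassembly is to search the binary for the quoted values. This is only a sketch: it assumes the entries are stored as consecutive little-endian 32-bit integers, and it reports file offsets, which still have to be mapped to virtual addresses by hand. The function name is made up.

```python
import struct

# The first table entries quoted above; the real table continues past these.
KNOWN_THRESHOLDS = (0x10000, 0x12A00, 0x15C00, 0x19600, 0x1DA00, 0x22800, 0x28500)


def find_threshold_table(blob):
    """Return every file offset where the known table prefix occurs."""
    # Assumption: entries are consecutive little-endian 32-bit integers.
    needle = struct.pack("<%dI" % len(KNOWN_THRESHOLDS), *KNOWN_THRESHOLDS)
    offsets = []
    pos = blob.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(needle, pos + 1)
    return offsets
```

Reading the DLL with `open("g2m.dll", "rb").read()` and passing the result in should give the candidate offsets, if the storage assumption holds.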

Teasing

October 6th, 2012

In the recent month I was not very productive, so I’d like to talk about codecs that I’m not likely to finish soon (not that I’m going to finish any codec soon anyway).

GoToMeeting 2/3

G2M2 decoder output (the best I could get)

Here’s the best output I could get from G2M2 or G2M3 data by decoding JPEG part of the tiles. ELS part still needs some work since it’s boring — 10-neighbour prediction, differential pixel decoding and other wonders of binary coder.

Certain Intermediate Codec

I managed to reverse engineer some parts of it. First you have so-called fixed header, then you have strip sizes and then strip data with some header as well. The way it’s coded is also more or less clear. But some connecting details are still unclear, like how those strips are divided (now it looks like 96×1 macroblocks or equally ridiculous).

Since it’s QuickTime it’s hard to say where the entry points to the codec are and what functions are invoked.
Also the only usable binary (with debug symbols) is PowerPC only. It’s nice platform but I still need to learn some of its peculiarities.

VoxWare MetaSound

It turns out that it is slightly simplified variant of TwinVQ. It does not have variable-length codes, all values are read as fixed bits (depending on sampling rate and bitrate of course). The only catch is that it’s hard to find where such description is retrieved or generated. And existing codebooks are somewhat different.

On buying tickets

August 12th, 2012

I like to travel around, usually by railway — it’s the most comfortable means of transportation (unless you’re talking about Ukrainian trains or French TGV). And the fastest one for short to long distances (planes are technically faster but consider time getting to the airport, from the airport, security checks…).

So to travel around you need to buy tickets (I’m not made of money to afford some magic “travel anywhere, pay monthly through nose” card) and that’s what I’m complaining about. My requirements are rather easy: you should not need to interact with people and you should be able to see what the possibilities for travel are.

France. Le facepaume. Ticket vending machines there reflect the national spirit frighteningly well.

Ticket vending machines.

That picture was taken in one French town near the border. There are two vending machines. To the right is French one. Here’s a short comparison with its neighbour:

  • Languages — half a dozen for one, French for another.
  • Controls — touch screen for one, weird knob with a button in centre (and you cannot change audio tracks with it).
  • Destinations — countrywide and beyond in one case, one region in the other case.

To be fair there are SNCF ticket vending machines that should offer countrywide destinations and I heard you can even get an e-ticket from them and they even support languages other than French (which is the hardest to believe). The only problem is that you should handle them politely (i.e. point and stick your finger as hard as possible) and I rather value my fingers.

Germany. Three years ago it was a bit quirky but they’ve upgraded the vending machine software and now it’s almost perfect. Half a dozen possible languages, a rather intuitive interface, some additional features. And you can buy a ticket to destinations in neighboring civilised countries (Switzerland and Netherlands) not served by Deutsche Bahn directly. The main WTF is that sometimes you see the trains but you cannot buy a ticket for them there (probably some special trains?).

Netherlands. I’ve used it only once but I remember it being pretty decent.

Sweden. Pretty decent ticket vending machines, I like the additional features like printing bought e-ticket (and you can get it in many other places too). Also I like the fact they have both touch screen and real keyboard and trackball. The only downsides are that it’s a bit slow and that it does not deal with cash and accepts only cards. Sweden is an advanced country after all, you can pay with card almost everywhere but it sucks to be foreigner from a mostly cash-based country (i.e. me few years ago).

Switzerland. Simply WTFiest ticket vending machines. They might be the reason most people buy rail pass instead. You can buy a ticket but it will ask you which route you prefer and it’s not optional. But it’s compensated by the fact it does not offer you any information on trains. You bought your ticket from A to B via C, now go and find what train goes that way in some other place.

Ukraine. In my opinion it should just give up that automated system for ticket sales (made in Soviet times) and cashier should write tickets by hand. Then it will be perfect stone age. I heard there are some advancements in that area: you can now buy e-ticket (but you still need to go to the ticket office where they print it out) and there are talks about removing the requirement to show your ID during ticket purchase. And if you don’t speak Russian you’d better not try buying tickets at all.

Some details on ClearVideo

August 11th, 2012

ClearVideo was the most widespread codec in the old days. One cannot name any other codec that was present in AVI, MOV and RealMedia simultaneously. Oh, and it presumably uses fractals.

Recently I’ve discovered one rather funny thing: ClearVideo intraframes are very simply coded. You have standard 16×16 macroblocks, some DCT, one static codeset for DC, one static codeset for AC. It’s simpler than baseline JPEG (well, maybe except for the fact there’s a set of flags signalling if ACs are present in the block at all).

My main problem with it was that I could not find out where those codes are stored or generated. Well, it turns out that they are stored in binary form in wrapper DLLs in the resources section (so if you use some resource explorer on it you can find the codes in resource TABLE/DCT or modify RAW/*BRAND to remove that annoying watermark but who cares?).

Maybe one day I’ll deal with interframes and RealMedia demuxer support…

An uninteresting decoder patch (contains G2M)

August 5th, 2012

I’ve stumbled upon a decoder for Go2Meeting, I don’t remember the link but I’ve made a copy here.

Since it’s obviously for FFmpeg I wonder if it’s made it to their repository yet.

A followup on Go2Webcrap codec details

July 20th, 2012

I didn’t have much interest in this codec to start with. So I’ll just leave it here.

I suspect that both newer and older versions of the codec divide it into sharp details overlay over smooth image a bit like DjVu format does.

For compression method 2 (G2M2/G2M3) sharp details and filled regions are coded with ELS (weird binary coder). There’s also a special transparent pixel value of course. Smooth image seems to be JPEG scan data without any other headers. They just use standard codes and default quantisation matrix from libjpeg6b with default quality setting 75.
I’d name this scheme JPEG-Binary Coder or J-B_K for short.
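
For anyone rebuilding the quantisation matrix by hand, here is a sketch of the quality-to-table mapping as libjpeg is generally understood to do it: a linear scale factor derived from the quality setting, applied to the standard luminance table from ITU T.81 Annex K (Table K.1), with entries clamped to the baseline range. The function names are hypothetical and the mapping is written from memory of libjpeg's behaviour, so it is worth checking against the library source. The table is in natural row order, not zigzag.

```python
# Standard luminance quantisation table (ITU T.81, Table K.1), row by row.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def quality_scaling(quality):
    """Map a 1..100 quality setting to a percentage scale factor."""
    quality = max(1, min(100, quality))
    if quality < 50:
        return 5000 // quality
    return 200 - quality * 2


def scaled_table(base, quality):
    """Scale a base table and clamp each entry to the baseline range 1..255."""
    scale = quality_scaling(quality)
    return [max(1, min(255, (v * scale + 50) // 100)) for v in base]
```

With quality 75 the scale factor is 50, so the first row of the luminance table becomes 8, 6, 5, 8, 12, 20, 26, 31.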

Here’s an example of something I hacked this morning to somehow decode baseline JPEG:
Would you be able to recognize any details here? I guess not.

And just think how wonderful to decode RGB mask and JPEG data in YUV format and combine them in the same image.

Compression scheme 3 talks about natural and synthetic image layers. It uses zlib to code some chunks inside tile data. And something else in addition to it.

Here’s an example of data there:

I remind that everybody is welcome to RE this codec (and I have more interesting codecs to deal with; I’m not VLC after all).

On Monster Codec

July 15th, 2012

There is one codec unrivaled in its monstrosity. I’m talking about GoToMeetingAndNeverReturn.

First of all, look at its size. The oldest version I could find (it decodes only G2M2 and G2M3) occupies 6196600 bytes. A newer one with G2M4 support (and that’s an additional compressor too) is 8247160 bytes. The current version (with seemingly the same functionality) is 15665528 bytes.

What does it contain? Everything. I kid you not. All (maybe except one) GoToUnpleasantPlace utilities refer to this small file and actually export their real main() function out from it. And their size is about forty kilobytes each.

What does it contain from the code point of view? Some libraries for networking, cryptography, speech codecs library, libjpeg, zlib, some internal code and tons of C++-induced crap. When you see wrapper calling wrapper calling wrapper calling something trivial (and the wrappers just pass arguments as is and do nothing else), or when you see that 90% of any function is used for exception handling, then we’re talking about enterprise-grade C++ indeed. And of course absolute bloatedness (including making indirect calls where unnecessary) and inability to run on Linux (not with Wine or MPlayer2 loader) greatly help debugging.

Now to the codec details: every frame consists of chunks which hold different kinds of data: frame information, mouse cursor shape, mouse cursor position, image data, etc. Frame is divided into tiles (usually 8×8 tiles in frame) and each tile is coded separately.
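
To make the tile layout concrete, here is a sketch that computes tile rectangles for a frame. It assumes the tile size is the frame size divided by the grid size and rounded up, so the last column and row may be smaller; how the real codec rounds is not established here, and the function name is made up.

```python
def tile_rects(width, height, tiles_x=8, tiles_y=8):
    """Return (x, y, w, h) per tile, scanning left to right, top to bottom."""
    if width <= 0 or height <= 0 or tiles_x <= 0 or tiles_y <= 0:
        raise ValueError("frame and grid dimensions must be positive")
    # Assumption: tile size is the ceiling of frame size / grid size.
    tile_w = -(-width // tiles_x)
    tile_h = -(-height // tiles_y)
    rects = []
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            rects.append((x, y, min(tile_w, width - x), min(tile_h, height - y)))
    return rects
```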

There are two compression methods, known by their numbers stored in frame header. For G2M2 and G2M3 it’s compression method 2, for G2M4 it’s compression method 3. Both are horrible.

Method 2 seems to have some completely unrelated submethods.

Here’s the call graph from method 2 decompression function.

Looks like there are several possibilities when coding with this method.

There is a possibility to use JPEG compression but I don’t know under what circumstances it’s used.
Also I’ve discovered that MSA1(MSS3) and MTS2(MSS4) actually use standard recommended quantisation tables from ITU T.81 Appendix K.1 (they are not used by libavcodec JPEG decoder though) and quality to quantisation mapping from libjpeg6.

Another coding method employs ELS codes I’ve described in the previous post and uses it to code some bit plane and RGB triplet in dependent form (i.e. with prediction from two previous pixels and using R-G, G, B-G components).
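
The R-G, G, B-G representation mentioned above can be sketched as a simple reversible colour transform. The prediction from previous pixels is left out because its exact form is not described here, and the modulo-256 wraparound is an assumption; the function names are hypothetical.

```python
def to_dependent(r, g, b):
    """Forward transform: keep G, store R and B as differences from it."""
    return (r - g) & 0xFF, g, (b - g) & 0xFF


def from_dependent(rg, g, bg):
    """Inverse transform: add G back to the two stored differences."""
    return (rg + g) & 0xFF, g, (bg + g) & 0xFF
```

The point of such a transform is that for greyish pixels both differences stay close to zero, which is easier to code.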

For method 3 so far I know only a few facts. First byte contains image type and depending on it decoding may vary a bit. For example, for image types 0 and 3 the three following bytes contain some RGB value, other image types don’t have it. And internally it’s called MPCCoder or SMPCCoder.

In conclusion I want to say that this codec is too large and horrible to use only static reverse-engineering (because it’s very hard even to determine what is the next function called indirectly) and debugging it requires Windows, so don’t expect me to RE it. But if you do you’ll get many thanks and some money from these guys. Good luck!

About some weird coding methods in various codecs

July 13th, 2012

Today I (not really) want to talk about some weird coding methods employed by two codecs.

Monstrous Go2Meeting. To my very deep surprise its compression method 2 uses ELS coder, a curious binary arithmetic coder replacement. In essence it operates on fractions of bits (called jots by its author) and uses something like a state machine for the model (i.e. depending on the state and the decoded jot value, 0 or 1, it moves to one of two possible other states and subtracts some state-defined value from the input value). This implementation uses 36 jots per byte, has a ladder with 174 rungs and operates on 24-bit state instead of 16-bit in the paper.

From a cursory glance at TAK it seems to be a more or less ordinary lossless audio codec, i.e. LPC plus residue coding. The only peculiar thing is that residue coding. While other codecs use mostly adaptive coding, this one seems to employ fixed coding parameters for segments of residues, and the bitstream also contains parameter set indices for all these segments.

Scheme is rather simple: read predefined number of bits, if it’s not the escape value then reinterpret code as signed. For escape value get unary code, if it’s not equal to the secondary escape value then scale that value and reinterpret as signed. Else just read some additional number of bits, scale them and reinterpret as signed. Number of bits to read and escape values make those coding parameters (about 52 total).
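
The control flow of that scheme can be sketched as follows. This is not the actual TAK decoder: the bit order (MSB first), the unary convention (count of 1 bits before a 0), the signed mapping (a zigzag-style one) and the meaning of "scale" (plain multiplication) are all assumptions picked for illustration, as are all the names and parameters.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def read_unary(self):
        # Assumption: count 1 bits until a terminating 0 bit.
        count = 0
        while self.read(1):
            count += 1
        return count


def to_signed(code):
    """Assumed mapping: 0, 1, 2, 3, 4, ... -> 0, -1, 1, -2, 2, ..."""
    return -((code + 1) >> 1) if code & 1 else code >> 1


def read_residue(br, bits, escape, escape2, scale, extra_bits):
    """Decode one residue with a fixed parameter set (all values hypothetical)."""
    code = br.read(bits)
    if code != escape:
        return to_signed(code)  # plain fixed-length code
    unary = br.read_unary()
    if unary != escape2:
        return to_signed(unary * scale)  # first escape: scaled unary code
    return to_signed(br.read(extra_bits) * scale)  # second escape: raw bits
```

In a real decoder the five parameters would come from one of the roughly 52 parameter sets selected per segment.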

Also even if it might provoke small flame war, I publicly say that I’d rather not see TAK supported in opensource. We have enough lossless codecs already, especially with their own containers. And they cover all possible uses already (don’t tell me about insignificantly higher compression ratio).

On Some Screen Codecs, including TSCC2

July 5th, 2012

Two days ago I heard about the release of a new screen codec from TechSmith. Since TSCC was the first codec I’ve REd (even without looking at the decoder) I could not resist and looked at it.

This codec is completely different from its predecessor. While TSCC was simple deflate+RLE, this one operates on blocks. It splits frame into slices 16×8, which are split into 4×4 blocks in YUV 4:4:4 format and 16-240 range. And those blocks are coded with DCT-like transform and quantised with one of two possible quantisers. Pretty easy, isn’t it? Oh, and internally codec is called “Dora”.

And looks like DCT is what some more advanced screen codecs use (and they are usually not lossless anymore).

For example, Micro$oft has about five screen codecs (maybe more):

  1. M$ Camcorder Video (CGDI) — somebody had rather stupid idea “ooh, let’s simply record GDI events and store them into frame”. Including commands for drawing text (and damned be you if your system fonts differ) and such. That’s the reason why we never get a decoder for it.
  2. M$ Screen 1 (WMV 7 Screen, MSS1) — palettised codec that employed classical arithmetic coding and modelling and coded frame by recursive sub-partitioning and pixel prediction for image areas.
  3. M$ Screen 2 (WMV 9 Screen, MSS2) — hybrid codec with enhanced coding method from Screen Codec 1, 16-bit RLE and coding some image areas with WMV 9. (And it was not REd by me BTW).
  4. M$ ATC Screen (MSA1) — internally it’s still known as MSS3. This codec operates in YUV 4:2:0 and uses range coder and modelling for coding macroblocks as solid fill, vector-quantised image, Haar wavelet or DCT.
  5. M$ Expression.Encoder Screen (aka Titanium Screen, MTS2) — this codec uses variable-length codes and only two coding methods left: DCT (exactly as in MSA1) and vector quantising which looks like a mix of MSA1 and MSS1.

I really should end it and move to something else…