Kostya's Boring Codec World

Some Notes on Un-RE’d Codecs

June 23rd, 2012

The fact that I haven’t REd a codec doesn’t mean I haven’t looked at it at all.
So today I want to talk a bit about some un-REd codecs and what peculiarities they have.

It looks like all interesting codecs can be divided into three groups: screen codecs, intermediate codecs and speech codecs.
Since I don’t understand the last group I shan’t give details on it.

Screen codecs

We have lots of them and they can be divided into two categories: simple and monsters.
Simple codecs usually employ some standard data compression library (zlib, FastLZ, LZO or LZF) or Huffman coding with standard median prediction and interframe difference.
I.e. boring, let’s talk about monsters.

Windows Media Video 9 Screen (aka MSS2) — combines palettised regions coded like in its predecessor with WMV9-coded regions.
M$ Expression Encoder Screen (aka Titanium Screen codec) — it uses variable-length codes and codes frames with one of two methods. One of them is DCT, exactly the same as in the M$ ATC Screen codec.
MSU Screen Lossless Codec — this one seems to simply code R, G, B values with some arithmetic coder and lots of context modeling and prediction.
Go2Meeting codecs — a good demonstration of the fact that the best strategy against REing is employing a shitty coding monkey.
An earlier version of the decoder was a monolithic 8 MB .dll file, version 4 is 15 MB already, all in “fine” C++.
There are two compression methods known.
Version 2 employs some weird arithmetic coder substitution (suspiciously like ELS-coder by Wm.D. Withers).
Version 3 employs libjpeg and zlib for coding image blocks somehow, frame data doesn’t look like it at all.

Intermediate codecs

Cineform — looks like they use Huffman coding and wavelets and it codes 10-bit video.

Fruit Intermediate Codec — looks a lot like its successor (ProRes) but with a different bitstream format and fixed coding schemes instead of adaptive ones.

BitJazz SheerVideo — the main problem with it is that most of the codec code performs conversion between any of a couple dozen formats (8- and 10-bit YUV and RGB packed in any possible way). The actual decompression code gets lost somewhere.

Posted in Audio, Screen Codecs, Useless Rants | 2 Comments »

On reverse-engineering codecs

June 9th, 2012

Sometimes I’m asked what codec I’ll do next. So here’s the answer: I don’t really know.

Since I’m no longer a student I don’t have much time to dedicate to reverse-engineering.
Thus I work on different codecs from time to time and select such codecs more or less at random.
When I see that one is nearly completed I spend more time on it until I have written a decoder.
That’s how I got the RALF decoder, for example.
Or that’s how I have not got the Discworld III BMV decoder: a long time ago I figured out the container format and audio compression, and a month ago I made some progress on the video decoder; but it’s far from complete and I don’t know when I’ll work on it again.

Of course some factors affect my selection of codecs: if I have some interest in it, if it serves some theoretical purpose (e.g. I did Indeo Audio and RALF not because I needed them but to have respective families of codecs fully supported) or if somebody convinces me to do it (two words – GSoC 2007).

So if you ask for some decoder you might get it after a while (but no obligations unless you send me a box of Trocadero).
Don’t forget that samples should be present too (and a decoder for the more complex formats).

Posted in Useless Rants | 3 Comments »

Some Notes on Indeo Audio (samples needed BTW)

June 1st, 2012

I’ve been working on this codec for a while and somewhat got it working.

Good news — it employs the same algorithms as its predecessor, except that it has stereo mode.

Bad news — it feeds slightly different values to those algorithms. So some tables used in calculations and the number of free bits in the block (for allocation) differ. I’ve almost got it, and a hacked version of our IMC decoder outputs almost perfect sound. My suspicion is that it modifies the original IMC tables for the stereo case (it codes audio in mid-side stereo mode, so that would make sense).
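For those who have not met mid-side stereo before, here is a minimal sketch of the general idea. It is not taken from the Indeo Audio decoder: the function names and the halving in the forward transform are assumptions made for illustration, and the real codec may scale its channels differently.

```python
def ms_encode(left, right):
    """Turn left/right samples into mid/side (illustrative scaling only)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right from mid/side produced by ms_encode()."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When the two channels are similar the side channel is close to zero, which is why a codec may want different tables or bit allocation for it.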

The problem is that there’s only one sample with this codec and it’s extremely short. So if someone has more files with Indeo Audio please provide them to us.

Posted in Audio | 3 Comments »

A Dream Come True

May 12th, 2012

For one of my friends – Lost Eden finally running on Amiga.
(The screen grabber outputs only HAM8 CDXL, thus the screenshot quality is not the best.)

Though someone should write HNM4 decoder one day…

Posted in Useless Rants | Comments Closed

High Priority Libav Projects

April 22nd, 2012

I once stumbled upon High Priority Free Software Projects at FSF. The idea appealed to me so here I present a similar thing for Libav. It also has one or two sane proposals (hopefully) and offers the same level of support (i.e. none). But maybe in some cases we or I can help with it.

Working avserver;
Proper filter system. When I say “proper” I mean the one that allows dynamic reconfiguring, handles errors and works for arbitrary inputs and outputs;
libswscale replacement. The one that doesn’t sap sanity when you look at its code. Maybe with a nicer API too. And better pixel format support.

RealMedia support

Improve RM demuxer or maybe rewrite it from scratch;
Add proper support for multirate RM streams;
Add IVR format demuxer;
Add ClearVideo decoder (that’s the last codec in RM that we don’t support, hopefully not for long).

Other Intel codecs support

Improve Indeo4 decoder (it still has some features lacking);
~~Add~~ Improve Intel Audio Coder decoder.

On2 codecs support

On2 VP7 decoder (we still can implement it faster than certain Baidu rival releases its source code);
On2 VP4 decoder;
On2 AVC decoder (that stands for “Audio for Video Codec”).

Too bad I cannot even find a decoder for On2 AVC nowadays. We have some samples though.

Micro$oft (screen) codecs support

This company has at least four screen codecs that we don’t support (~~MSA1,~~ MSS2, ~~MTS2~~ and CGDI).

~~Add M$ Screen Codec 1 decoder;~~
~~Add M$ Screen Codec 2 decoder;~~
~~Add M$ Expression Encoder Screen Codec decoder;~~
Add beta Windows Media Video 9 interlaced decoding.
Fix beta Windows Media Video 9 P-frames decoding.

QuickTime codecs support

~~Add Rottenfruit Intermediate Codec decoder;~~
Add any other codec decoder.

Other codecs

Add GoToMeeting 2-4 decoder;
Add more screen codec decoders;
Add more game format decoders (especially Discworld Noir BMV);
Add more audio (especially speech) codec decoders.

Posted in Libav, Useless Rants | 7 Comments »

Codebook Hell

March 27th, 2012

There’s one codec I’d like to have reverse-engineered and implemented as an opensource decoder (well, lots of other codecs as well but this one particularly). Its name is VoxWare MetaSound, that’s an old codec which was used as an alternative to MP3 in old days of DiVX 3 😉 and its clones.

It’s definitely based on TwinVQ and is probably closer to the variant that got into the MPEG-4 Audio standard (I suspect it got there mostly to make that standard even more bloated than before). That follows from it having modes such as 8kHz/6kbps, which is not present in VQF but is present in the ISO 14496-3 draft.

This codec probably has more data tables than TwinVQ (in binary decoder the section with codebooks is more than 256kB large, in TwinVQ it’s about 200kB) and should set a new record if we ever get a decoder for it.

Decoding looks very simple in theory: decoder initialises codebooks for given samplerate and bitrate (it’s actually signaled in extradata: VOXq for 44.1kHz/32kbps, VOXk for 16kHz/16kbps, VOXz for 44.1kHz/48kbps), for every frame it reads window type and an array of some values and performs reconstruction.
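The mode signalling above can be written down as a small lookup. This is only a sketch: it covers just the three tags named above, `VOX_MODES` and `metasound_mode` are made-up names, and where exactly the tag sits inside extradata is not handled here, so the caller passes the four-byte tag itself.

```python
# Mode tags mentioned in the text, mapped to (sample rate in Hz, bitrate in bps).
VOX_MODES = {
    b"VOXq": (44100, 32000),
    b"VOXk": (16000, 16000),
    b"VOXz": (44100, 48000),
}


def metasound_mode(tag):
    """Return (sample_rate, bitrate) for a known mode tag, else raise ValueError."""
    key = bytes(tag)
    if key not in VOX_MODES:
        raise ValueError("unknown MetaSound mode tag: %r" % (key,))
    return VOX_MODES[key]
```

A real decoder would then pick its codebook set from that pair; the codec certainly has more modes than these three.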

So far I was able to identify only some codebook information. Bark tables seem to be identical, but shape and whatever codebooks seem to be different.

I’ve spent a couple of evenings finding out that information and I dare someone (especially you, Vitor!) to write a decoder for it. I don’t know a thing about TwinVQ except one fact and it’s stated in the title.

Posted in Audio, Useless Rants | 1 Comment »

Call for Intel Codecs

March 19th, 2012

I’ve spent two weekends and finally REd and wrote a decoder for Re* Audio Lossless Format. With news like these I can deliberately call it Intel Audio Lossless Format.

So, what codecs are we lacking so far?

Intel Audio Coder — it’s quite similar to IMC (Music Coder) but not identical.
Intel Layered Video Codec — probably it’s just h.263 variant, the only thing I know is that RealVideo 2 decoder was based on it (it’s mentioned in doxygen for Helix SDK I saw once in Internet somewhere and this supports that theory indirectly).
ClearVideo — a licensed fractal-based codec. It’d be rather simple DCT-based codec if not for one catch: it uses domain search to generate codes that then are used for block unpacking (and in decoder too, it seems). Maybe these patents will help?
Intel NGV — we’ll deal with it when it’s ready 🙂

Feel free to send any useful information about them, preferably working decoders of course.

After that we can claim full support of Real and Intel codec family.

Posted in Audio, Libav, Lossless Audio, Useless Rants | 2 Comments »

A Few Words about my ProRes Encoder

March 19th, 2012

Some people wanted to have ProRes encoder in Libav so I wrote one. And from what I gather it even has one user (not me).

In case someone is interested here is the list of possible options:

profile — selects ProRes profile to encode (proxy, lt, standard or hq)
quant_mat — selects quantisation matrix from one of profiles (proxy, lt, standard or hq). If you don’t specify it, the matrix will be picked from default profile (or use auto to be really sure). There’s also default matrix which should give the highest quality (it’s default in the sense that when quantisation matrix is not provided in frame decoder defaults to this one).
bits_per_mb — how many bits to give for coding one macroblock, different profiles use from 200 bits per macroblock to 2400, one can set it up to 8000.
mbs_per_slice — how many macroblocks there are in a slice, 1-8. The default value of eight should be good for almost all situations though.
vendor — one can put custom vendor ID into frame like apl0 to claim it was produced by Apple encoder.
qscale — set fixed quantiser
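To make the value ranges above concrete, here is a small helper that checks them and builds option arguments. It is a sketch under assumptions: `prores_options` is a made-up name, the `-name value` form assumes a command-line tool that takes private codec options that way (as avconv does), and the four-character vendor check and the lower bounds are guesses rather than something the encoder documents.

```python
PROFILES = ("proxy", "lt", "standard", "hq")
QUANT_MATS = PROFILES + ("auto", "default")


def prores_options(profile=None, quant_mat=None, bits_per_mb=None,
                   mbs_per_slice=None, vendor=None, qscale=None):
    """Validate the options described above and return them as an argument list."""
    args = []
    if profile is not None:
        if profile not in PROFILES:
            raise ValueError("unknown profile: %s" % profile)
        args += ["-profile", profile]
    if quant_mat is not None:
        if quant_mat not in QUANT_MATS:
            raise ValueError("unknown quantisation matrix: %s" % quant_mat)
        args += ["-quant_mat", quant_mat]
    if bits_per_mb is not None:
        # The text gives 8000 as the upper limit; the lower bound is a guess.
        if not 0 < bits_per_mb <= 8000:
            raise ValueError("bits_per_mb must be in 1..8000")
        args += ["-bits_per_mb", str(bits_per_mb)]
    if mbs_per_slice is not None:
        if not 1 <= mbs_per_slice <= 8:
            raise ValueError("mbs_per_slice must be in 1..8")
        args += ["-mbs_per_slice", str(mbs_per_slice)]
    if vendor is not None:
        # Assumption: vendor IDs are four characters long, like "apl0".
        if len(vendor) != 4:
            raise ValueError("vendor ID should be four characters")
        args += ["-vendor", vendor]
    if qscale is not None:
        if qscale < 1:
            raise ValueError("qscale must be positive")
        args += ["-qscale", str(qscale)]
    return args
```

For example, `prores_options(profile="hq", bits_per_mb=2400, vendor="apl0")` gives a list that can be spliced into a command line after the codec selection.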

How to make it encode faster?

In default mode of operation encoder has to honour frame constraints (i.e. not producing frames with size bigger than defined) while still making output picture as good as possible.
If the frame contains lots of small details it’s harder to compress it and encoder spends more time in search for appropriate quantisers for each slice. Thus setting higher bits_per_mb limit will improve the speed.

Or if you don’t care about frame size constraints just set qscale parameter to something (I’d recommend 4) and see it encode MUCH faster.
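Why a bigger bit budget means less work can be shown with a toy model. This is not the search the encoder actually performs: the cost model (bits shrink in proportion to the quantiser), the upper quantiser limit and the name `pick_quantiser` are all invented for illustration.

```python
def pick_quantiser(slice_complexity, mbs_in_slice, bits_per_mb, max_q=64):
    """Try quantisers from 1 upwards until the slice fits into its budget.

    Returns (quantiser, number of quantisers tried).
    """
    budget = mbs_in_slice * bits_per_mb
    tried = 0
    for q in range(1, max_q + 1):
        tried += 1
        estimated_bits = slice_complexity // q  # toy cost model, not the real one
        if estimated_bits <= budget:
            return q, tried
    return max_q, tried
```

With a slice that needs 64000 bits at quantiser 1 and eight macroblocks per slice, a budget of 1000 bits per macroblock takes eight tries while 2000 bits per macroblock takes four. Fixing qscale removes the loop altogether, and that is where the big speed-up comes from.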

Feel free to leave wishes for features in comments, hopefully I can implement them when I have time.

P.S. For proper 4444 profile support we need 10-bit YUV with alpha. When it’s in I can add that profile too.

Posted in Uncategorized | 23 Comments »

A bit more about cooking

November 12th, 2011

Sometimes when I have acute nostalgia I try to cook something from my homeland.

The first time I made köttbullar (with potatoes and lingonberry jam, of course). Too bad I could not do it SWEDISH STYLE! :-(. This time I tried to make Janssons frestelse. I had no tin of ansjovis, but bohusmatjessill instead turned out just fine.

Posted in #chemicalexperiments | 1 Comment »

A Codec Cookbook

November 12th, 2011

With the addition of VBLE decoder I thought once again about codecs and how they are written.

Lossless Video Codecs

There are two approaches:

Take a frame, apply one or two general compression schemes to it. Can be zlib, RLE+zlib or motion compensation from previous frame + zlib.
Discover spatial prediction (usually from left neighbour or median) and add some coding for residues. HuffYUV, Lagarith, UtVideo, VBLE, LOCO, FFV1, whatever.

Lots of people try it, find that their codec is faster or compresses better than HuffYUV, and release the results. Usually those codecs don’t live long and the only bad thing about it is their being released to the public in the first place.
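To show how little there is to the second approach, here is a minimal sketch of median prediction over one 8-bit plane. It is not the code of any of the codecs listed above: zlib stands in for the residue coder (real codecs use Huffman, Golomb or range coding), out-of-frame neighbours are simply taken as zero, and all function names are made up.

```python
import zlib


def median_predict(left, top, topleft):
    """Median of left, top and the gradient left + top - topleft."""
    return sorted((left, top, left + top - topleft))[1]


def _neighbours(plane, width, x, y):
    """Left, top and top-left pixels of (x, y); missing ones count as zero."""
    i = y * width + x
    left = plane[i - 1] if x > 0 else 0
    top = plane[i - width] if y > 0 else 0
    topleft = plane[i - width - 1] if x > 0 and y > 0 else 0
    return left, top, topleft


def encode_plane(plane, width, height):
    """Predict every pixel, then compress the residues (modulo 256) with zlib."""
    residues = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            pred = median_predict(*_neighbours(plane, width, x, y))
            residues[y * width + x] = (plane[y * width + x] - pred) & 0xFF
    return zlib.compress(bytes(residues))


def decode_plane(data, width, height):
    """Undo encode_plane(): rebuild pixels in raster order from the residues."""
    residues = zlib.decompress(data)
    plane = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            pred = median_predict(*_neighbours(plane, width, x, y))
            plane[y * width + x] = (residues[y * width + x] + pred) & 0xFF
    return bytes(plane)
```

The decoder works because it only ever looks at pixels it has already reconstructed, so it can repeat exactly the prediction the encoder made.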

Lossy Video Codecs

These codecs are usually more complex, so there are fewer of them. But there are more ways to create one.

lossily quantise raw data or DCT output. Every self-respecting company producing frame grabbing cards has written such a codec.
take a draft of some standard codec and base your work on it. That’s how we got Window$ Media, R3al and Off2 video codecs.
WAVELETS!!!!111oneone
another approach to compression like vector quantisation, binary or quad tree decomposition, object-oriented representation (though this one is mostly used in screen capturing codecs), etc.

The main problem with these codecs is achieving good compression parameters without much hassle. For example, libavcodec MPEG-4 encoder may be the best around here but (like Soviet machinery) one has to work real hard to find out which parameters he/she needs to set to which values to get good compression. That’s the reason why people often choose Xvid instead.

Lossless Audio Codecs

There is one approach to those: add lots of crazy filtering (usually several chained filters) and equally crazy coding of residues. There you got it. Simple filters = faster compression, complex filters = slightly better compression with significantly longer compression times.
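A sketch of the simple end of that spectrum: one fixed second-order predictor followed by Rice coding of the residues. It does not reproduce any particular codec; real ones chain several adaptive filters and pick coding parameters per block. The function names are made up, the Rice parameter `k` is left to the caller, and bits are kept as a text string purely for readability.

```python
def residues_order2(samples):
    """Residue of predicting each sample as 2*x[n-1] - x[n-2] (missing history is 0)."""
    out = []
    for n, s in enumerate(samples):
        p1 = samples[n - 1] if n >= 1 else 0
        p2 = samples[n - 2] if n >= 2 else 0
        out.append(s - (2 * p1 - p2))
    return out


def restore_order2(residues):
    """Inverse of residues_order2()."""
    samples = []
    for n, r in enumerate(residues):
        p1 = samples[n - 1] if n >= 1 else 0
        p2 = samples[n - 2] if n >= 2 else 0
        samples.append(r + 2 * p1 - p2)
    return samples


def rice_encode(values, k):
    """Rice-code signed integers: unary quotient, then k remainder bits."""
    bits = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1  # fold the sign into the low bit
        bits.append("1" * (u >> k) + "0")
        if k:
            bits.append(format(u & ((1 << k) - 1), "0%db" % k))
    return "".join(bits)


def rice_decode(bits, k, count):
    """Read back `count` values written by rice_encode() with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating zero
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (q << k) | r
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values
```

A smoother signal gives smaller residues and therefore shorter codes; fancier filters only squeeze those residues a bit further at the price of more computation.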

The last paragraph of the lossless video codecs section applies to audio as well.

Lossy Audio Codecs

Those appear not too often because it’s very hard to satisfy everybody’s ears. Thus (IMO) it’s mostly limited to speech codec development. And there’s Xiph of course.

Posted in Useless Rants | 1 Comment »