Kostya's Boring Codec World

FFmpeg: providing better alternative since 2000

September 4th, 2009

A few days ago FFmpeg finally got a WMA3 decoder. This event gives me an opportunity to look at our achievements.

Popular and/or standard codecs — supported except for the newest stuff (AAC-HE[2], H.264 interlaced modes, VC-1 interlaced modes).
Windows Media — WMV1-WMV3 are supported (except for the beta version of WMV3 and other WMV3 spinoffs). WMA1-WMA3 are supported too. We still have WMA Lossless and WMA Voice to RE and our top men are working on it (do you remember the ending of “Raiders of the Lost Ark”? Neither do I).
Real Media — RV1-RV4 are supported, from the variety of audio codecs only Sipro and Real Lossless support are missing. Sipro is in the works and nobody (including RealNetworks itself) cares about RALF.
Intel codecs — Indeo 1-3 is supported, patch for Indeo 4-5 is available, IMC is supported, IAC is not REd (and not in queue).
RAD codecs — REd, there are still some issues with Bink to sort out before inclusion.
AVI codecs — that’s a mess. There are simply too many codecs and new ones still continue to appear. Some are supported, most are not.
Lossless audio codecs — some are supported, some are not. Again, looks like everybody writes own lossless audio or video codec. I’d like to get support for TAK though.
Game video codecs — we still have a lot of them to RE. Personally I want Discworld III video (BMV, but it differs from the format used in Discworld II) support. *sigh*

If you think there’s some codec we definitely should support, please tell us (preferably with a specification or decoder sources 😉). If you just want to have some codec supported in FFmpeg, make us interested in it; support for some codecs appeared in FFmpeg after somebody had asked “can play that file?”.

Posted in FFmpeg | 17 Comments »

Bink: pattern-run blocks

September 4th, 2009

And now for something completely the same.

Let’s talk about the most interesting block type in Bink. I don’t know the official name for it but I call it a pattern-run block because of the way it’s coded. The idea is simple: there are runs of a single colour and blocks of different colours like in your ordinary RLE; what can be interesting in that? But there is one thing: the block is filled with runs/copies not in the usual scan order but following one of 16 predefined patterns – columns, spirals, Hilbert curve (Zelda pattern for some of us), whatever.

Here’s an example:
Scan pattern #13
(and SVG version)

I think it’s obvious how this helps block compression. The only bad thing about it is the fact it did not appear in Smacker (mostly because Smacker uses 4×4 blocks).
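
To make the benefit concrete, here is a minimal sketch of filling an 8×8 block with runs along a scan table. The `Run` structure, the function names and the column-order table are assumptions made for illustration; the real Bink pattern tables and the way runs are read from the bitstream are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE   8
#define BLOCK_PIXELS (BLOCK_SIZE * BLOCK_SIZE)

/* One run: `length` pixels of the same `colour` (hypothetical layout). */
typedef struct Run {
    int     length;
    uint8_t colour;
} Run;

/* Build an illustrative scan order that walks the block column by column.
 * scan[i] is the raster position (y * 8 + x) of the i-th visited pixel. */
void build_column_scan(uint8_t scan[BLOCK_PIXELS])
{
    int i = 0;
    for (int x = 0; x < BLOCK_SIZE; x++)
        for (int y = 0; y < BLOCK_SIZE; y++)
            scan[i++] = (uint8_t)(y * BLOCK_SIZE + x);
}

/* Fill `block` (stored in raster order) with runs laid out along `scan`.
 * Returns the number of pixels written. */
int fill_pattern_runs(uint8_t block[BLOCK_PIXELS],
                      const uint8_t scan[BLOCK_PIXELS],
                      const Run *runs, int nb_runs)
{
    int pos = 0;
    for (int r = 0; r < nb_runs; r++) {
        assert(runs[r].length > 0 && pos + runs[r].length <= BLOCK_PIXELS);
        for (int i = 0; i < runs[r].length; i++)
            block[scan[pos++]] = runs[r].colour;
    }
    return pos;
}
```

With the column scan, a block whose left column has one colour and the rest another takes just two runs (8 and 56 pixels). In plain raster order the same picture needs 16 runs, because every row switches colour after its first pixel.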

This concludes my series of posts about Bink.
“Works for me” patch against FFmpeg r19754 is located here.

Posted in Bink, Game Video | 3 Comments »

Bink: a bunch of peculiarities

September 3rd, 2009

I’ve mentioned before that Bink differs greatly from other codecs. Now I want to walk over general structure of it and mark all peculiarities I’ve seen so far.

Huffman coding. I think I’ve mentioned it enough times.
Data coding. Different values (block types, colours, run values) are coded in so-called bundles (i.e. groups) for at least one row of blocks at once. So when decoding of a new row starts, each bundle is checked for whether it holds enough data, and more is decoded if needed.
16×16 and 8×8 block mix. Sometimes the encoder inserts a 16×16 block into the usual array of 8×8 blocks. Looks like those blocks can happen only on even positions, which eases skipping the already decoded part of it. 16×16 block contents are actually 8×8 block contents scaled up by a factor of two.
Coding modes. There are 10 block types; three of them belong to vector quantisation techniques (I’ll write another post about the special run-length pattern block), two block types use DCT (more below) and another block type uses special coding for residue without any additional transform.
DCT coefficients coding. I’ve written a bit about it already. Have I mentioned they also use non-standard scan order (designed for pairs of coefficients)?
Coefficients quantising. There are 16 possible quantisers – 1, 1 1/3, 1 2/3, 2, 2 2/3, 3 1/2, 4, 5, 6, 8, 12, 17, 22, 28, 34 and 44.
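
Since several of these quantisers are fractional, one way to keep them exact is to store them as fractions. The sketch below does only that; the names `Rational`, `bink_quant_sketch` and `dequant_sketch` are made up for illustration, and both the rounding (truncation towards zero) and the way the quantiser is actually applied to DCT coefficients are assumptions.

```c
#include <assert.h>

/* A quantiser stored as an exact fraction num/den. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* The 16 quantisers from the list above:
 * 1, 1 1/3, 1 2/3, 2, 2 2/3, 3 1/2, 4, 5, 6, 8, 12, 17, 22, 28, 34, 44 */
const Rational bink_quant_sketch[16] = {
    {  1, 1 }, {  4, 3 }, {  5, 3 }, {  2, 1 },
    {  8, 3 }, {  7, 2 }, {  4, 1 }, {  5, 1 },
    {  6, 1 }, {  8, 1 }, { 12, 1 }, { 17, 1 },
    { 22, 1 }, { 28, 1 }, { 34, 1 }, { 44, 1 },
};

/* Scale a decoded level by quantiser number q (0..15) using integer
 * arithmetic; the result is truncated towards zero (an assumption). */
int dequant_sketch(int level, int q)
{
    assert(q >= 0 && q < 16);
    return level * bink_quant_sketch[q].num / bink_quant_sketch[q].den;
}
```

For example, `dequant_sketch(3, 1)` scales 3 by 1 1/3 and gives 4, while `dequant_sketch(2, 5)` scales 2 by 3 1/2 and gives 7.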

I suspect that some of the things are legacy of Smacker and really clean design would go in slightly other direction – it’s not pure vector quantisation as it was but it’s not pure DCT-based codec either.

As for the progress: I have more or less working decoder in my own build of FFmpeg. When somebody kicks certain devs to push Bink demuxer and audio decoder into SVN codebase, I’ll give my decoder with that. Until then just wait.

Posted in Bink, Game Video | Comments Closed

Bink: ‘lossless’ block coding

September 2nd, 2009

First of all, I’d like to note that those names are taken from Bink code. In reality ‘lossy’ block is used as is and ‘lossless’ block is DCT coefficients.

And now, the differences:

in ‘lossless’ mode coefficients are decoded until mask becomes zero, there’s no explicit number of coefficients
coefficient bits are stored explicitly, not as several masks: coef[x] = mask | get_bits(log2(mask));
starting list somewhat differs
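
The second point can be sketched as follows, assuming the mask is a power of two. The tiny bit reader is a stand-in with an assumed LSB-first bit order, `log2_pow2` and `read_coef_magnitude` are hypothetical names, and sign handling is left out because it is not described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny bit reader; LSB-first bit order is an assumption for this sketch. */
typedef struct BitReader {
    const uint8_t *buf;
    size_t         size;  /* buffer size in bytes */
    size_t         pos;   /* current position in bits */
} BitReader;

/* Read n bits (0..16); the first bit read becomes the lowest bit. */
unsigned get_bits(BitReader *br, int n)
{
    unsigned val = 0;
    assert(n >= 0 && n <= 16 && br->pos + (size_t)n <= br->size * 8);
    for (int i = 0; i < n; i++, br->pos++)
        val |= (unsigned)((br->buf[br->pos >> 3] >> (br->pos & 7)) & 1) << i;
    return val;
}

/* Integer log2 of a power of two: the number of bits below the top one. */
int log2_pow2(unsigned mask)
{
    int bits = 0;
    assert(mask != 0 && (mask & (mask - 1)) == 0);
    while ((mask >>= 1) != 0)
        bits++;
    return bits;
}

/* coef = mask | get_bits(log2(mask)): the top bit comes from the mask,
 * all lower bits are read from the stream in one go. */
unsigned read_coef_magnitude(BitReader *br, unsigned mask)
{
    return mask | get_bits(br, log2_pow2(mask));
}
```

With a mask of 8 and the next three stream bits forming the value 5, the magnitude comes out as 13; with a mask of 1 nothing is read and the magnitude is 1.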

For those who for some unknown reason are interested in RE progress, I can say that my implementation is still far away from perfect. It crashes on 640×480 BIKi files and for those two files it plays (BIKf and BIKi) it gives barely recognisable image — I blame DCT and dequantisation (I haven’t looked at them yet).

Posted in Bink, Game Video | 21 Comments »

Bink: ‘lossy’ coefficients reading.

August 29th, 2009

RTMP client seems to work fine, RTMP support in FFserver is not that close, so I work on REing some codec which seems to be rather widespread in games.

OK, now to technical details. ‘Lossless’ coefficient coding is similar but a bit more complicated.

For each 8×8 block there is 7-bit number specifying number of masks to read (mask = part of the coefficient), slightly resembling progressive JPEG coding; coefficient value may be composed from several masks, high bits are decoded first. Decoding continues until all masks are read.

Coding method is not that comprehensible though: there is list of start coefficient and modes, so decoding iterates over this list and performs some action depending on mode. Have I mentioned that aforementioned list may change during operation?

And here’s decoding algorithm (if I got it right):

mask = 1 << get_bits(3)
iterate over already decoded coefficients, if read bit = 1 then add mask to the coefficient
iterate over list of modes until end is reached, if (coef,mode)==(0,0) or read bit = 0 then skip current entry:
    mode = 0:
        set current entry to (cur_coef+4; mode = 1)
        for(i=0;i<4;i++, cur_coef++){
            if(get_bit()) prepend list with (cur_coef, mode = 3)
            else coeffs[cur_coef] = get_bit() ? -mask : mask;
        }
    mode = 1:
        set current entry to (cur_coef; mode = 2)
        append (cur_coef+4; mode = 2), (cur_coef+8; mode = 2), (cur_coef+12; mode = 2) to the list
    mode = 2:
        set current entry to (0; mode = 0)
        for(i=0;i<4;i++, cur_coef++){
            if(get_bit()) prepend list with (cur_coef, mode = 3)
            else coeffs[cur_coef] = get_bit() ? -mask : mask;
        }
    mode = 3:
        coeffs[cur_coef] = get_bit() ? -mask : mask;
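
The algorithm needs a list of (coefficient, mode) entries that can grow at both ends while it is being walked. One possible layout, shown here only as a sketch, keeps the entries in the middle of a fixed array. The type and function names and the capacity are assumptions, and only the mode 1 action is shown.

```c
#include <assert.h>

#define MAX_ENTRIES 128  /* capacity per side; an assumption for this sketch */

typedef struct Entry {
    int coef;  /* start coefficient */
    int mode;  /* 0..3 as in the pseudocode */
} Entry;

/* Entries occupy items[start..end-1]; both indices begin in the middle
 * of the array so that prepending and appending are equally cheap. */
typedef struct ModeList {
    Entry items[2 * MAX_ENTRIES];
    int   start;  /* index of the first entry */
    int   end;    /* index just past the last entry */
} ModeList;

void list_init(ModeList *l)
{
    l->start = l->end = MAX_ENTRIES;
}

void list_prepend(ModeList *l, int coef, int mode)
{
    assert(l->start > 0);
    l->start--;
    l->items[l->start].coef = coef;
    l->items[l->start].mode = mode;
}

void list_append(ModeList *l, int coef, int mode)
{
    assert(l->end < 2 * MAX_ENTRIES);
    l->items[l->end].coef = coef;
    l->items[l->end].mode = mode;
    l->end++;
}

/* Mode 1 action: entry i becomes (cur_coef; mode 2) and three more
 * groups of four coefficients are appended to the list. */
void apply_mode1(ModeList *l, int i)
{
    int cur_coef;
    assert(i >= l->start && i < l->end);
    cur_coef = l->items[i].coef;
    l->items[i].mode = 2;
    list_append(l, cur_coef + 4,  2);
    list_append(l, cur_coef + 8,  2);
    list_append(l, cur_coef + 12, 2);
}
```

Starting from a single entry (4; mode 1), `apply_mode1` leaves four mode 2 entries for coefficients 4, 8, 12 and 16. The initial contents of the list are not given above, so they are not filled in here.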

Posted in Bink, Game Video | 3 Comments »

Brief notes about Bink

July 13th, 2009

If you play a lot of games (or maybe not that many) and are interested in watching their FMVs, you will hear about Bink sooner or later.

This is rather widespread codec in games and it’s sad we still don’t have an opensource decoder for it.

Here are some facts about it:

Container format seems to inherit a lot from Smacker.
There are two different audio codecs differing by transform.
Bink video is mostly static Huffman coding + vector quantisation or DCT

So, why not reimplement it?
Here are some more details about video:

It employs static Huffman coding – there are 16 predefined trees which are used to decode a lot of data — the only exceptions are block coefficients. Tree definitions include only tree number and how to reorder table of symbols for current data.
Almost all values are coded in bundles for several blocks at once (usually for half of a frame)
8-bit values may be encoded as independent nibbles or high nibble may have context-dependent encoding when it’s encoded with a tree number equal to the last high nibble (so you need 16+1 trees for that but who cares).
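
The context-dependent variant from the last point can be sketched like this. Everything concrete here is an assumption: the callback type, the tree numbering (0..15 for high nibbles, 16 for the low nibble), the order of the nibbles in the stream and the initial context. The mock reader only stands in for a real Huffman decoder.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_NIBBLE_TREE 16  /* the "+1" tree; numbering is an assumption */

/* Hypothetical callback: decode one 4-bit symbol with tree number `tree`. */
typedef int (*NibbleReader)(void *opaque, int tree);

/* Decode one byte; the high nibble uses the tree selected by the
 * previously decoded high nibble, the low nibble uses its own tree. */
uint8_t decode_byte_ctx(NibbleReader read_nibble, void *opaque, int *last_high)
{
    int high, low;
    assert(*last_high >= 0 && *last_high < 16);
    high = read_nibble(opaque, *last_high) & 0xF;
    low  = read_nibble(opaque, LOW_NIBBLE_TREE) & 0xF;
    *last_high = high;
    return (uint8_t)((high << 4) | low);
}

/* Stand-in for a real Huffman decoder: replays prepared symbols and
 * remembers which tree was requested for each of them. */
typedef struct MockReader {
    const int *symbols;
    int        pos;
    int        trees_used[8];
} MockReader;

int mock_read_nibble(void *opaque, int tree)
{
    MockReader *m = opaque;
    assert(m->pos < 8);
    m->trees_used[m->pos] = tree;
    return m->symbols[m->pos++];
}
```

Feeding the symbols A, 1, B, 2 through the mock gives the bytes 0xA1 and 0xB2, and the second high nibble is requested from tree 10, i.e. the tree selected by the previous high nibble.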

The rest will be available as it goes.

Posted in Bink, Game Video | Comments Closed

A bit on Interplay MVE 16-bit

April 29th, 2009

For those, who are interested in playing 16-bit MVE files (yes, Mike, I am talking about you) here are some bits of information I’ve gathered at my leisure:

you have to skip 16 bytes from block map at the beginning instead of 14 for 8-bit MVE
colours are now stored as 15-bit (obvious, isn’t it), and high bit may be set for pattern fill order (8-bit MVE just compared colour values, which still works)
for some opcodes pattern fill order was changed a bit (i.e. subblocks scan order)
the meaning of some opcodes was changed completely. Opcode 3 does not require additional bytes to be read anymore.
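
As an illustration of the second point, a 16-bit value can be split into a 15-bit colour and the high flag bit. The RGB555 component order (red in the top bits) and all names below are assumptions, not something taken from the MVE format itself.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Colour15 {
    uint8_t r, g, b;  /* 5-bit components, 0..31 */
    int     flag;     /* high bit, usable as a pattern fill order marker */
} Colour15;

/* Split a 16-bit value into the flag bit and three 5-bit components.
 * The RGB555 layout is an assumption made for this sketch. */
Colour15 unpack_colour15(uint16_t v)
{
    Colour15 c;
    c.flag = v >> 15;
    c.r    = (v >> 10) & 0x1F;
    c.g    = (v >> 5)  & 0x1F;
    c.b    =  v        & 0x1F;
    return c;
}

/* Expand a 5-bit component to 8 bits by replicating its top bits. */
uint8_t expand5(uint8_t v)
{
    assert(v < 32);
    return (uint8_t)((v << 3) | (v >> 2));
}
```

For instance, the value 0xFC22 unpacks to flag 1 with components 31, 1 and 2, and `expand5(31)` gives 255.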

I didn’t have a desire to complete it, especially because it’s no fun to debug how motion is stored, so I just hacked existing decoder a bit to decode 16-bit files. Here’s a picture produced by maimed libavcodec/ipvideo.c:

interplay16

Posted in Game Video | 1 Comment »

In the memory of my ThinkPad

April 29th, 2009

I bought my brand ~~new~~refurbished IBM ThinkPad 390 six years ago. While its hardware may be laughable by the current standards – PII-266, 192MB RAM, 4GB HDD – it was my computer where I started developing for FFmpeg. GCC compiling libavcodec/motion_est.c was the reason for adding 128MB to original 64MB of RAM. IIRC, all of codecs development till 2006 GSoC (VC-1 decoder) was done on it.

When I moved to MacMini, it still served me – as a router (it’s hard to see COM port on modern hardware, so modem was connected to TP390, later it was ADSL modem and second PCMCIA network card), as an x86 platform (mostly for running IDA and binary codecs) and for Internet-related stuff (cvs and git server, mail fetching, small web server, downloader and such).

Here’s how it looked for the last years:

i390

Rest in peace.

Now I have Asus EEE 701 working instead of it. Since it’s more compact, I can also fit BeagleBoard on the table next to it.

Posted in Uncategorized | 3 Comments »

A bit of new hardware

March 21st, 2009

I’ve wanted to write another useless rant about idiocy in our lives as a governing policy (for example, 1st class railroad cars being worse than 2nd class but more expensive or how “express” is translated into Ukrainian as “????????????” or “???????????”, both meaning “accelerated” or “sped-up”) but I have a bit of more pleasant news.

I’ve spent the rest of GSoC money on BeagleBoard and it took about 15 days to deliver it (which is rather impressive by local standards). So I hope to start hacking on it too (I’m pretty sure it would be good for both FFmpeg and me if I learn ARM assembly and about NEON unit). In my opinion they would really benefit from having built-in network adapter (there’s a place for it on PCB too) though; since this is not Mac, saying that USB should be enough for everything is rather lame.

Posted in Useless Rants | Comments Closed

Notes on AAC quantisation

March 19th, 2009

I should have written this earlier if not for the non-FFmpeg work I have to do here. BTW, are there any linguists around who can explain the relation between bureaucracy and textiles? (“Bureaucracy” comes from a sort of cloth used to cover tables, “red tape” is rather obvious, Russian “????????”, “????????” and “??????????” are also related to a process of obtaining thin threads.) Ahem.

AAC coding has two computationally costly operations — MDCT and coefficient quantisation. While the former takes more cycles per call, the latter is usually called several times for each frame, so those times tend to add up and outweigh MDCT in bad encoders (like mine). From rate distortion theory we know how to determine proper quantisers for AAC: the distortion caused by a quantiser multiplied by lambda, plus the number of bits needed to code the band with that quantiser, should be minimal for the given value of lambda.
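
The criterion reads directly as code. This is only a sketch: the `Candidate` structure and `pick_quantiser` are hypothetical names, and the distortion and bit counts are assumed to have been measured elsewhere for each candidate quantiser of a band.

```c
#include <assert.h>
#include <float.h>

/* What trying one quantiser on a band gives us (hypothetical structure). */
typedef struct Candidate {
    float distortion;  /* error introduced by this quantiser */
    int   bits;        /* bits needed to code the band with it */
} Candidate;

/* Return the index of the candidate with minimal
 * cost = distortion * lambda + bits. */
int pick_quantiser(const Candidate *cand, int nb_cand, float lambda)
{
    int   best      = 0;
    float best_cost = FLT_MAX;
    assert(nb_cand > 0);
    for (int i = 0; i < nb_cand; i++) {
        float cost = cand[i].distortion * lambda + (float)cand[i].bits;
        if (cost < best_cost) {
            best_cost = cost;
            best      = i;
        }
    }
    return best;
}
```

A small lambda favours the cheapest candidate in bits, a large lambda favours the one with the least distortion, which is why the same band can end up with different quantisers at different values of lambda.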

How could we achieve this? Well, use one of three approaches:

Assign some fixed quantizers
Use some ad hoc rule to determine quantiser and then refine its value a bit (aka heuristic, since it gives good speed, it is widely used)
Try all possible quantisers by brute force or Viterbi method (optimal but very slow)

With heuristic you have one catch: if your primary guess on quantiser is not good then refining either takes a lot of time or gives you far from optimal result. Trellis-based search is implemented in my encoder and results in encoding around 20x slower than realtime (i.e. encoding one second of audio takes 20 seconds of CPU time) on modern CPUs. I’m playing with something heuristic and fast.

Now to quantising itself.

Each coefficient is quantised as out = (int)pow(in / quantiser, 0.75);. Division of floating-point numbers is slow, taking a power of a number is even slower. You can convert MDCT coefficients to the power of three fourths in advance (quantisers are also converted, in a precomputed table), thus getting rid of the power. FAAC also multiplies coefficients so they are already quantised except for taking the integer part. My encoder just multiplies possible codebook vectors by that quantiser and compares them with input coefficients, leaving the latter intact. I also had an idea to represent MDCT coefficients in base pow(2, 0.25), making them easy to manipulate, but someone still has to test whether base conversions won’t eat all of the gain. I have also tried several optimisations like matching coefficients only against close enough codebook vectors instead of all of them. More approaches to try.

(I hope these notes will form “How I Wrote the Best Opensource AAC Encoder Around (to Accompany x264)” memoirs :-S )

Posted in AAC | 2 Comments »
