Vector Quantisation Codecs are Still not (semi-kinda) Dead!

April 16th, 2015

While the golden days of vector quantisation codecs seem to be over (Cinepak, Smacker and such), there's still one quite widespread use of vector quantisation in video — texture compression. And, surprisingly, there are a couple of codecs that employ texture compression methods (good for GPU acceleration, less stuff to invent etc.) like Vidvox Hap or Resolume DXV (which looks suspiciously similar in many aspects but with some features like LZ4, LZF or YCoCg10 compression added). I have not looked that closely at either of them but it looks like they still operate on small blocks — e.g. compressing each plane's 8×8 block with BC4 and combining them later.
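For reference, BC4 itself packs a 4×4 block of single-component samples into 8 bytes: two endpoint values plus sixteen 3-bit indices into a small interpolated palette. A minimal decoding sketch (mine, not code from either codec):

    #include <stdint.h>

    /* Decode one BC4 block: 2 endpoint bytes + 48 bits of 3-bit indices. */
    static void bc4_decode_block(const uint8_t block[8], uint8_t out[16])
    {
        uint8_t pal[8];
        uint8_t r0 = block[0], r1 = block[1];

        pal[0] = r0;
        pal[1] = r1;
        if (r0 > r1) {          /* 6 interpolated values */
            for (int i = 1; i < 7; i++)
                pal[i + 1] = ((7 - i) * r0 + i * r1) / 7;
        } else {                /* 4 interpolated values plus the extremes */
            for (int i = 1; i < 5; i++)
                pal[i + 1] = ((5 - i) * r0 + i * r1) / 5;
            pal[6] = 0;
            pal[7] = 255;
        }

        /* 16 three-bit indices packed little-endian into the last 6 bytes */
        uint64_t bits = 0;
        for (int i = 0; i < 6; i++)
            bits |= (uint64_t)block[2 + i] << (8 * i);
        for (int i = 0; i < 16; i++)
            out[i] = pal[(bits >> (3 * i)) & 7];
    }

An 8×8 plane block would then simply be four such 4×4 blocks stitched together.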

This does not seem that interesting to me but I'm sure Vittorio will dig deeper. Good luck to him!

P.S. I forgot — which version of Firefox comes with ORBX.js support?

A Bit about Actimagine Nerve Agent

April 12th, 2015

Codecs are sometimes named after really ridiculous things. Actimagine named its codec VX, after the nerve agent. Or maybe after a Panasonic VCR tape format that only Wickedpedia has heard about. But I bet on the nerve agent (if you didn't have to study chemical warfare agents at school then you weren't born in the USSR, and be thankful for that).

First of all, I don't know much about VX except that it was used on game consoles. Also, judging by the code, it was intended for really low resolutions because the stride is hardcoded to 256 from what I've seen.

It reminds me of Mobiclip HD somewhat. I’m too lazy to investigate all details (because I only have an ARM disassembly of some binary with some helpful comments) but here’s what I could find after spending an hour or two on it.

The video codec employs exp-Golomb codes for most things — both signed and unsigned ones; the bitreader limits them to 16 bits at most. Again, there's not much I can say about the overall structure except that it looks like simplified H.264/H.265 (though it obviously predates H.265) — there is spatial prediction (four modes only: horizontal, vertical, DC and plane) and there's macroblock subdivision too (into square blocks only, possible sizes being 4×4, 8×8 or 16×16). It still looks like there's one motion vector per macroblock with motion vector deltas for subblocks.
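Here's roughly what such a reader looks like, as a sketch with a toy bitreader; the interface in the actual binary and the exact sign mapping are my assumptions:

    #include <stdint.h>
    #include <stddef.h>

    /* Toy MSB-first bitreader, standing in for whatever the binary uses. */
    typedef struct {
        const uint8_t *buf;
        size_t pos; /* position in bits */
    } BitReader;

    static uint32_t get_bit(BitReader *br)
    {
        uint32_t bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
        br->pos++;
        return bit;
    }

    static uint32_t get_bits(BitReader *br, int n)
    {
        uint32_t v = 0;
        while (n-- > 0)
            v = (v << 1) | get_bit(br);
        return v;
    }

    /* Unsigned exp-Golomb code, capped like the 16-bit limit mentioned above. */
    static uint32_t get_ue_golomb(BitReader *br)
    {
        int zeros = 0;

        while (zeros < 16 && !get_bit(br))
            zeros++;
        return ((1u << zeros) | get_bits(br, zeros)) - 1;
    }

    /* Signed variant with the usual 0, 1, -1, 2, -2, ... mapping. */
    static int32_t get_se_golomb(BitReader *br)
    {
        uint32_t v = get_ue_golomb(br);

        return (v & 1) ? (int32_t)((v + 1) >> 1) : -(int32_t)(v >> 1);
    }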

Again, no one cares.

A Short Guide to Julmust/Påskmust

April 11th, 2015

Unfortunately I was not able to visit Sweden properly this Easter season — it was merely 6 days in Stockholm. Yet I've managed to try one of the reasons I come to Sweden — påskmust. For those who don't know what it is — shame on you! For the rest, here's my incomplete and biased guide.

Some old julmust photo. Left to right: Nygårda, Eldorado (Hemköp), ICA, Coop, Wasa, Apotekarnes. Lying is the Lidl julmust.

Some old påskmust photo (probably from 2011). Left to right: Mora, Nygårda, Apotekarnes, ICA, probably Lidl, Eldorado (Hemköp), Coop. Lying are ordinary and special Wasa påskmust. Front bottle is from Guttsta Källa.

This year's catch. Back row: Wasa special, ICA, Apotekarnes. Front row: Nyckelbryggeri, Zeunerts, Grebbestads bryggeri, Mora, Nygårda, Danish abomination.

So one can divide julmust/påskmust into four categories:

  1. Widespread must from large producers or supermarket chains. That includes Apotekarnes, Nygårda and must made for Coop, Hemköp, ICA and Lidl. But not for Netto, see category four for that.
  2. Must from Norrland breweries. Nyckelbryggeri, Wasa and Zeunerts are the best known. And maybe Mora.
  3. Must from non-Norrland breweries. Guttsta Källa, Grebbestads, Hammars (I have yet to try that one).
  4. Abominations from people who don't know how to make proper must. That includes Bjäre must from C*ca-cola, Harboe must from Netto (made in Denmark) and whatever Danish stuff I tried this year. Concentrate for making must at home probably belongs here too.

The taste is hard to describe but it's really nice and makes me think of liquid bread for some reason. The main difference is between Norrland and non-Norrland must. Julmust and påskmust in the Norrland style are less sweet and usually have a hint of coffee. Must from large producers is usually sweeter than the rest. Wasa bryggeri produces two kinds of must — special, available only in Norrland and made after Norrland traditions, and ordinary, available in Svealand and with a taste closer to the more widespread varieties.

Danish must is either bland or plainly wrong. The one I tried this year is not actually bad, it's just completely wrong — it contains e.g. raspberry juice and cola extract. If I drink påskmust I want it to be påskmust, not a weird mix of Pommac and Trocadero that probably has only water and sugar in common with the other påskmust recipes.

And now for the actual guide. If you want to try it then start with the widespread påskmust you can find in any Swedish supermarket; it should be fine. If you like it that way then be happy; if you want something less sweet then try smaller breweries, especially Norrland ones (their must is hard to find outside Norrland though). And if you are not in season then you can still try something similar — bordsdricka from Wasa or sommarmust from Nyckelbryggeri should be available (in Norrland).

P.S. You can extrapolate it to Trocadero as well except there’s less variation in taste and there’s no supermarket or Danish version.

Some Notes on Lossless Video Codecs

March 21st, 2015

While reading a nice PhD thesis from 2014 (in Russian) about a new lossless video compression method I laughed hard at its lossless video codec descriptions (here's the link for unbelievers – http://gorkoff.ru/wp-content/uploads/articles/dissertation.pdf; the translation below is mine):

To date various lossless videostream compression methods and algorithms have been developed. They are used in the following widespread codecs:

* CorePNG employs deflate algorithm for independent compression of every frame. Theoretically the codec supports delta frames but this option is not used.

* FFV1 employs prediction coding with following entropy coding of the prediction error.

* Huffyuv, like FFV1 algorithm, employs predictive coding but prediction error is effectively coded with Huffman algorithm.

* MSU Lossless Video Codec has been developed for many years at Moscow State University labs.

And yet some real world tasks demand more effective compression and thus a challenge of developing new more effective lossless video compression methods still remains actual.

Readers are welcome to find inaccurate, erroneous and outright bullshit statements in this quote themselves. I’d rather talk about lossless video codecs as I know them.
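For instance, the predictive coding both Huffyuv and FFV1 actually use is the classic median predictor: predict each pixel from its left, top and top-left neighbours and entropy-code only the difference. A sketch (mine, not code from either project):

    #include <stdint.h>

    static int median3(int a, int b, int c)
    {
        if (a > b) { int t = a; a = b; b = t; }
        return c < a ? a : (c > b ? b : c);
    }

    /* MED predictor: median of left, top and the gradient left+top-topleft.
     * Huffyuv codes the residual with Huffman, FFV1 with a range coder. */
    static uint8_t predict_med(uint8_t left, uint8_t top, uint8_t topleft)
    {
        return (uint8_t)median3(left, top, left + top - topleft);
    }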

Some notes on VP4

March 1st, 2015

Well, this information should’ve been posted by someone else but those people seem to be lazier than me. In return I’m not going to use XViD or FLIC for encoding my content.

So, REing VP4 is rather easy – you just download the original VP3.2 decoder source (still available on Xiph SVN servers) and compare it to the structure of vp4vfw.dll. There are differences in structures and a bit in code layout but mostly it's the same code with new additions.

So, VP4 is based on VP3 (surprise!) and introduces a new bitstream version (which is 3 for some reason). Here’s an incomplete list of differences I’ve spotted:

  • Base frame header has some additional fields (I didn’t care enough to decipher their meaning though);
  • Superblock coding uses a slightly different scheme with new universal codes resembling exp-Golomb but with a VP4 quirk;
  • Frame data decoding differs depending on frame type;
  • Motion vector component extraction uses Huffman tables and sign from the previous block.

And yet it uses the same coding principles and even token coding seems to be left untouched. It was suspected for a long time that even-numbered On2 codecs were simply improvements over the previous version while odd-numbered ones were more innovative, but not much was known about VP4 to prove it:

  1. Duck TrueMotion 1 — a new codec;
  2. Duck TrueMotion 2 — mostly like TrueMotion 1 but with Huffman encoding;
  3. Duck/On2 TrueMotion VP3 — DCT + static Huffman coding;
  4. On2 TrueMotion VP4 — VP3 with some bitstream coding changes;
  5. On2 TrueCast VP5 — DCT + arithmetic coder;
  6. On2 VP6 — VP5 with some bitstream changes;
  7. On2 VP7 — H.264 ripoff with their own arithmetic coder;
  8. On2 VP8 — VP7 with some small changes;
  9. Baidu VP9 — H.265 ripoff with their own arithmetic coder;
  10. rumoured Baidu VP10 — since there’s no H.266 in the works for now…

It’s all kinda Intel CPUs but without confusing codenames (and Xiph hasn’t produced too many codecs to confuse whether Daalawell came before Theorabridge or after).

P.S. Many thanks to big G for releasing no information on that codec or any other codecs from On2. Oh, and is VP9 “specification” still under NDA?

P.P.S. I should really work on a game codec named after chemical warfare instead.

A Call for a Modern Audio Codec

February 11th, 2015

We need a proper audio codec to accompany state of the art video codecs, so here’s an outline of codec features that should be present:

  • the audio codec should make more use of its context: it should have a system of forward and backward reference frames like the B-pyramid in H.264 or H.265;
  • with that it should employ tonal compensation — track the frequency changes from the references (e.g. it may be the same note continued or changing pitch);
  • time domain prediction via FIR or IIR filters (a sketch of the FIR case follows this list);
  • flexible subdivision into subframes, e.g. as a binary tree;
  • raw (or at least non-transformed) coding mode for transients or noise;
  • integer-only bit-exact transform that passes for an MDCT in bad light;
  • high-bitdepth sound support (up to 64 bits per sample).
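Here's what the FIR flavour of that prediction could look like, as a minimal sketch; the filter order and coefficients are made up for illustration:

    #include <stddef.h>

    #define ORDER 4

    /* Made-up predictor coefficients; a real codec would transmit them. */
    static const float coeffs[ORDER] = { 1.8f, -1.2f, 0.5f, -0.1f };

    /* Predict each sample from the previous ones, keep only the residual
     * (which is what the entropy coder would then compress). */
    static void fir_predict_residual(const float *x, float *res, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            float pred = 0.0f;
            for (int j = 0; j < ORDER; j++)
                if (i > (size_t)j)
                    pred += coeffs[j] * x[i - 1 - j];
            res[i] = x[i] - pred;
        }
    }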

The project name is transGhost (hopefully no Monty will be hurt by this).

And if you point out this is stupid — well, audio codecs should have the same rights as video codecs, including PTS/DTS differences and employing similar coding methods.

Why one should not be overexcited about new formats

January 10th, 2015

Today I’ll talk about Opus and BPG and argue why they are not the silver bullets everyone was expecting.

Opus

I cannot say this is a bad codec — it has a modern design (hybrid speech+music coder) and impressive performance. What's wrong with it? Usage.

The codec is ideal for streaming, broadcasting and such. It does not have a special multichannel mode: you can combine mono and stereo Opus streams in whatever way you like, and you don't have to care about passing a special configuration for it in a special way.

What’s bad about that? When you try to apply it to stored media all those advantages turn into drawbacks. There was no standard way to store it (IIRC Opus-in-TS and Opus-in-MP4 specifications were developed by people that had little in common with Opus developers although some of the latter were present too). There is still one big problem with an ugly hack as “solution” — the lack of keyframes in Opus and the “solution” in form of preroll (i.e. “decode certain number of audio frames before the needed one and discard them”). And not all containers support that feature.

That reminds me of MoosePack SV1–SV7. That was a project intended to improve on MPEG Audio Layer II compression and make it into a new codec (yes, there's Layer III, but that was one of the reasons MoosePack, Vorbis and other audio codecs were born). It had enjoyed some limited popularity (I've implemented MPC decoding support for a reason) but it had two major drawbacks:

  • very bare file format — IIRC it's just a header and audio blocks prefixed by a 20-bit size, with no padding to byte boundaries either (if you've ever worked with raw FLAC streams you should have no problem imagining how good the MPC format was); see the sketch after this list;
  • no intra frames — again, IIRC the solution was to simply decode and discard 12 frames before the given one in the hope that the sound would converge.
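To see why the first point hurts, here's a sketch of walking such a stream (whether the 20-bit size counts bits or bytes I don't remember, so treat this purely as an illustration):

    #include <stdint.h>

    /* Bare-bones MSB-first bitreader over a raw MPC-like stream. */
    typedef struct {
        const uint8_t *buf;
        uint64_t bitpos;
    } MPCReader;

    static uint32_t read_bits(MPCReader *r, int n)
    {
        uint32_t v = 0;
        while (n-- > 0) {
            v = (v << 1) | ((r->buf[r->bitpos >> 3] >> (7 - (r->bitpos & 7))) & 1);
            r->bitpos++;
        }
        return v;
    }

    /* With no byte alignment and no index, reaching block N means reading
     * the size prefix of every single block before it. */
    static void seek_to_block(MPCReader *r, int n)
    {
        for (int i = 0; i < n; i++)
            r->bitpos += read_bits(r, 20);
    }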

MusePack SV8 tried to address all those issues with a new chunked format that could be easily embedded into other containers and whose audio blocks could be decoded independently because the first frame in each block was a keyframe. But it was too late and I don't know who uses this format at all.

Opus is more advanced and performs better by offloading those problems to the container, but I still don't think Opus is an ideal codec for all cases. If you play it continuously it's fine; when you try to seek, problems start to occur.

BPG

This is a quite recent example of the idea “let's stick intra-frame coding from some video codec into an image format”.

Of course such an approach saves time, especially if you piggyback on a state of the art codec, but it's not the optimal solution. Why? Because still image coding and video sequence coding have different goals and working conditions.

In video coding you have a large amount of data that you have to (de)compress efficiently, mostly under specific constraints like framerate. While coding an individual frame well is important, it's much more convenient to spend effort on evening out the decoding load across all frames. After all, hardly anyone would like the first frame to be decoded in 0.8s and the other 24 frames in 0.1s each. That reminds me of ClearVideo, which had the inverse problem – intra frames were coded very simply (just IDCT plus static Huffman) while inter frames employed something fractal and took much more time.

Another difference is content. For video you usually have common frame sizes (like 1920×1080 or 1280×768) and modern video codecs are actually targeted at handling bigger and bigger resolutions. Images, on the other hand, come in various sizes, even ridiculous ones like 173×69, and they contain stuff you usually don't expect in video form — pixel art, synthetic images, line art etc. (Yes, some people care about monochrome FMV but it's a very rare case.)

Another problem is efficient coding of palettised and monochrome images, lossily or losslessly. For lossless compression it's much better to operate on whole lines, while video coding standards nowadays are block-based, and specialised compression schemes beat generic ones. For instance, the same test page compresses to an 80kB PNG, a 56kB Group 4 TIFF or a 35kB JBIG image. JPEG-LS beats PNG too, and both are very simple compression standards compared to even H.261.

There’s also alpha plane coding, not so many video codecs support it because of its limited use in video. You have it mostly in intermediate codecs or game ones (hello Indeo 4!). So if selected video codec doesn’t support alpha natively you have to glue it somehow (that’s what BPG does).

Thus, we come to the following points:

  • images are individually coded while a video codec has to care about the whole sequence;
  • images come in different sizes while video sizes are usually a few standard ones;
  • images have different content that's not always compressed well by a video coder, and a specialised compression scheme is always better and maybe faster;
  • images might need some additional features not required by video.

This should also explain why I have some respect for WebPLL but none for WebP.

I’ve omitted obvious problems with adoption, small-power hardware and such because hardly anything beats (M)JPEG there. So next time you choose format for images choose wisely.

H.265: An Alternative History

December 6th, 2014

As you might remember, the alternative history genre models events on real history but with something going differently. Here's what could've happened with H.265 but didn't.

So, finally there’s a new standard released — ITU H.265. It promises twice as low bitrate for the same picture quality in H.264. Yet people do not care much about it since industry leaders offer their solutions:

China introduces its new standard for video coding — Hybrid Enhanced Video Standard, or HEVS for short. It features a quadtree representation of coding blocks, more than thirty spatial prediction modes, block transforms from 4×4 to 32×32, and one unique feature — motion vectors that implicitly take mirrored references from reference picture lists. This standard is nominated as the main video coding standard for CUVRD (China ultraviolet ray disc) but gains little popularity outside China.

On2 makes a new codec named VP9 that has no open specification. After tedious reverse engineering it turns out to employ the coding scheme from VP5 times, spatial and motion prediction from H.265 with slightly altered coefficients, and the overall coding scheme from H.265 drafts.

Re… buffering… alNetworks releases NGV at last (fourcc RV50). Again, after long studies of the binary specification, it turns out to be based on H.265 drafts with some in-house improvements: context-specific codebooks for element coding and ⅓-pel motion compensation (implemented as motion vectors that pretend to be ⅓-pel while in reality several different positions are handled by the same function). The codec becomes very popular in China for some reason.

Sorenson releases SVQ7. It is based on an old H.265 draft and employs ½- and ⅓-pel motion compensation. It has some additional features like watermarking and quickly becomes the codec of choice for QuickTime.

P.S. Good thing nothing like this has really happened.

Blåtand-Passande-X

November 23rd, 2014

So, finally there’s a post about some codec.

It is a specialised codec from Oxford Germanium Television (all names are changed just in case) that has a 4:1 compression ratio and a very niche use. It's hard to find even a decoder for it, so this analysis was done on the ARM version of the encoder (maybe I'll be able to RE something more useful next time, like VX).

The codec itself is rather simple: you take 4 samples from one channel, compress them, output the 16-bit result and repeat the same for the second channel. Encoding is straightforward too:

  1. feed the input to a 4-band QMF (with a filter that looks a lot like the D4 wavelet to me);
  2. perform ADPCM on each band (this varies a bit per band but it's the same approach);
  3. generate the output word — 7 bits for band 0, 4 bits for band 1, 2 bits each for bands 2 and 3, plus a parity bit for them all (see the packing sketch after this list).
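A sketch of what step 3 amounts to; the field order and parity definition are my guesses, only the 7+4+2+2+1 bit budget comes from the disassembly:

    #include <stdint.h>

    static uint16_t pack_word(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
    {
        uint16_t w = ((b0 & 0x7F) << 9) |  /* band 0: 7 bits */
                     ((b1 & 0x0F) << 5) |  /* band 1: 4 bits */
                     ((b2 & 0x03) << 3) |  /* band 2: 2 bits */
                     ((b3 & 0x03) << 1);   /* band 3: 2 bits */
        uint16_t p = w;

        /* fold the payload down to a single even-parity bit */
        p ^= p >> 8; p ^= p >> 4; p ^= p >> 2; p ^= p >> 1;
        return w | (p & 1);
    }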

Since I have no samples of it, don't expect a decoder from me any time soon (and I don't have enough motivation to hook the Android encoder directly to make it produce data). Not that anyone cares about it either.

A Bit on Germany

November 21st, 2014


An excerpt from a book that I have to refer to sometimes (here's the source); it really tells a lot about proper relationships. Ask a nearby German for a translation if you need it.

P.S. Next post will be about a codec technical description, I promise.