Archive for the ‘Intermediate Video Codecs’ Category

A look at Winnov WINX

Friday, November 3rd, 2023

It is really a coincidence that about a week after I looked at their Pyramid codec I was reminded that another codec of theirs exists, probably related to the WNV1 codec I REd back in 2005.

So apparently the codec codes YUY2 in 8×8 blocks. Each block is prefixed with a bit telling whether it’s a coded or a skipped block. Coded blocks have an additional 4-bit mode that seems to determine which quantisation they’ll use. The data is packed as deltas to the previously decoded value (per component) using a static codebook with values in the -7..7 range (plus scaling by shifting left). There’s also an escape value for the case when a raw value should be read instead. Overall it feels like Winnov Video 1 coding.
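The delta scheme above can be sketched roughly like this. This is a minimal illustration, not the actual bitstream layout: the escape signalling, the `ESCAPE` symbol and the byte wrap-around are all assumptions.

```python
ESCAPE = 8  # hypothetical escape marker outside the -7..7 delta range

def decode_component(symbols, prev, scale=0):
    """Decode a stream of delta symbols into per-component sample values.

    symbols: list of small deltas in -7..7, or (ESCAPE, raw) tuples.
    prev:    previously decoded value used as the predictor.
    scale:   left shift applied to codebook deltas (the scaling mentioned above).
    """
    out = []
    for sym in symbols:
        if isinstance(sym, tuple) and sym[0] == ESCAPE:
            prev = sym[1]                            # raw value replaces the prediction
        else:
            prev = (prev + (sym << scale)) & 0xFF    # delta from the previous value, 8-bit wrap assumed
        out.append(prev)
    return out
```

For example, starting from 128 the symbols `[1, -2, (ESCAPE, 200), 3]` decode to `[129, 127, 200, 203]`.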

In other words, nothing remarkable, but still a bit more advanced than the usual DPCM-based intermediate codecs.

A cursory glance at Daniel codecs

Sunday, July 10th, 2022

Recently Paul B. Mahol drew my attention to the fact that there are codecs with human names, so why not take a look at them?

The first disappointing thing is that you need to register in order to receive a download link for the software. Oh well, throwaway mail services exist, and it’s not the first codec I’ve seen doing that (though I’ve forgotten its name already, just like I’ll forget this codec). Then there’s the binary size. I remember thinking that a single 15MB Go2Somewhere .dll was too much, but this one is even larger (because they’ve decided to bundle a bunch of other decoders and demuxers instead of just one codec).

In any case, what can I say about the codec? Nothing much, really. Both seem to be DCT-based intermediate codecs that group blocks into slices. Daniel2 uses larger tiles in slices, probably to accommodate the wider variety of supported chroma formats (unlike its predecessor it supports different colourspaces, chroma subsamplings and bit depths). The claimed high (de)coding speed comes from the same approach as in VBLE (does anybody remember it? I still remember how Derek introduced its author to me at one of the VDDs). Yes, they simply store coefficients in a fixed number of bits and transmit the bit length to use for each tile.

The only really curious thing I’ve found is a combinatorial coding approach I’ve never seen anywhere else. Essentially it keeps a table of running values, and for each coded value only the number of table entries to use is transmitted. The actual value is decoded as (max(table[0], ..., table[len - 1]) + min(table[0], ..., table[len - 1]) + 1) / 2 and then the decoded value is subtracted from all the table entries used in the calculation. I have no idea why it’s there or what it’s good for, but it exists.
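The decoding step as described translates into a few lines. This is only a sketch of the formula above; how the table gets populated in the first place is not covered by what I saw.

```python
def decode_value(table, length):
    """Decode one value from the first `length` table entries and update them.

    The value is the rounded midpoint of the min and max of the used entries;
    it is then subtracted from every entry that took part in the calculation.
    """
    used = table[:length]
    value = (max(used) + min(used) + 1) // 2
    for i in range(length):
        table[i] -= value
    return value
```

For instance, with `table = [10, 6, 8]` and `length = 2` the decoded value is `(10 + 6 + 1) // 2 = 8` and the table becomes `[2, -2, 8]`.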

Overall, it was not a complete waste of time.

Fun with LGT 5/3 wavelet transform

Saturday, November 20th, 2021

The LGT 5/3 wavelet transform is the second simplest lossless wavelet transform (the first one being the Haar transform, of course), so it’s used in a variety of image formats (most famously in JPEG 2000) and intermediate video codecs. Recently I helped one guy with implementing it, and while explaining it I came to understand it properly myself, so here I’m going to write it down for posterity.

A Quick Look on IMM4

Sunday, April 10th, 2016

So I’ve spent an hour or so looking at IMM4.

What do you know, it’s a very simple DCT codec with interframes. Intraframes have only DCT with the usual run-level VLC coding; in interframes every macroblock has a flag telling whether it should be skipped, coded as a difference to the previous frame or coded as an intra block. See, no motion vectors, quantisation is a single value per block (except for the DC in intra blocks), and there seems to be no zigzag scan either. You cannot get much simpler than that.
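The per-macroblock decision logic boils down to something like the sketch below. The helper callbacks and flag names are hypothetical, standing in for the actual bit reading and block decoding.

```python
def decode_inter_macroblock(read_flag, decode_delta, decode_intra, prev_mb):
    """Return one reconstructed interframe macroblock (as a flat sample list)."""
    if read_flag("skip"):
        return prev_mb                        # copy from the previous frame
    if read_flag("intra"):
        return decode_intra()                 # standalone intra block
    # otherwise: coded difference added to the previous frame's macroblock
    return [p + d for p, d in zip(prev_mb, decode_delta())]
```

No motion compensation is involved anywhere, which matches how simple the codec is.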

A Codec Family Proposal

Monday, September 29th, 2014

There are enough general-use standardised codecs; there’s even the VPx family for those who want more. But there are not enough niche codecs with free/open specifications.

One such niche codec would be an intermediate codec, suitable for capturing and quick editing of video material. The main requirements are a modest compression rate and fast processing (scalability is a plus too). Maybe SMPTE VC-5 will be the answer, maybe Ogg Chloe, maybe something completely different. Let’s discuss it some other time.

Another niche that desperately needs an open standard is the screen video codec. Such a codec may also be used for recording webcasts, presentations and the like. And here I’d like to discuss a whole family of such codecs based on the same coding principles.

It makes sense to make the codec fast by employing multithreading where possible. That’s why the frame should be divided into tiles that are neither too large nor too small, maybe 192×128 pixels or so.

Each tile should be coded independently, preferably with its distinct features coded separately too. It makes sense to separate tile data into smooth features (like gradients and real-life pictures) and sharp transitions (like text and UI elements). Let’s call the former the natural layer and the latter the synthetic layer. We’ll also need a mask telling which layer to use for each pixel. Using these main blocks and employing different coding methods we can build a whole family of codecs.
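The final compositing step is trivial. A minimal sketch, assuming both layers and the mask are decoded to full tile resolution and that a set mask bit means "take the synthetic layer":

```python
def composite_tile(natural, synthetic, mask):
    """Pick each output pixel from the synthetic layer where the mask is set,
    from the natural layer otherwise. All inputs are row-major 2-D lists."""
    return [[syn if m else nat
             for nat, syn, m in zip(nrow, srow, mrow)]
            for nrow, srow, mrow in zip(natural, synthetic, mask)]
```

The interesting part, of course, is on the encoder side: deciding which pixels go into which layer.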

Here’s the list of example codecs (with a random FOURCC assigned):

  • J-B0 — employ JPEG for the natural layer and GIF/PNG for mask and synthetic layer coding;
  • J-B1 — employ Snow for natural layer coding and FFV1 for synthetic layer coding;
  • J-B2 — employ JPEG-2000 for natural layer coding, JBIG for mask coding and something like PPM modeller for synthetic layer;
  • J-BG — employ WebP for natural layer and WebP LL for synthetic layer.

As one can see, it’s rather easy to build such a codec since all the coding blocks are already there; only the natural/synthetic layer separation might need a bit of research. I see no reason why, say, VLC couldn’t use it for recording and streaming a desktop for e.g. virtual meetings.

Some Notes on Some Intermediate Codec Family

Monday, January 27th, 2014

A friend of mine, Mario, asked me to look at DNxHD 444. It turned out to be quite easy to support in the libavcodec decoder (at least for CID 1256, for which I have a sample) after I looked at the binary decoder. And I was curious what other formats were there.

Here is the list of internal IDs supported by Avid decoder with a family they belong to, image parameters (width x height @ bitdepth) and other properties.

  • 1233: Avid_HD (1920×1080@10) interlaced (marked as debug format)
  • 1234: Avid_HD (1920×1080@10) interlaced (marked as debug format)
  • 1235: Avid_HD (1920×1080@10) progressive
  • 1236: Avid_HD (1920×1080@10) progressive (marked as debug format)
  • 1237: Avid_HD (1920×1080@8) progressive
  • 1238: Avid_HD (1920×1080@8) progressive
  • 1239: Avid_HD (1920×1080@8) interlaced (marked as debug format)
  • 1240: Avid_HD (1920×1080@8) interlaced (marked as debug format)
  • 1241: Avid_HD (1920×1080@10) interlaced
  • 1242: Avid_HD (1920×1080@8) interlaced
  • 1243: Avid_HD (1920×1080@8) interlaced
  • 1244: Avid_HD (1440×1080@8) interlaced
  • 1250: Avid_HD (1280×720@10) progressive
  • 1251: Avid_HD (1280×720@8) progressive
  • 1252: Avid_HD (1280×720@8) progressive
  • 1253: Avid_HD (1920×1080@8) progressive
  • 1254: Avid_HD (1920×1080@8) interlaced
  • 1256: DNx444 (1920×1080@10) progressive
  • 1257: DNx444 (1920×1080@10) interlaced
  • 1258: DNx100 (960×720@8) progressive
  • 1259: DNx100 (1440×1080@8) progressive
  • 1260: DNx100 (1440×1080@8) interlaced
  • 32768: AHD-DBG-1 Avid_HD (64×32@8) interlaced
  • 32769: AHD-DBG-2 Avid_HD (128×128@8) interlaced
  • 32770: AHD-DBG-3 Avid_HD (480×320@8) interlaced
  • 32771: AHD-DBG-4 Avid_HD (64×32@10) interlaced
  • 32772: AHD-DBG-5 Avid_HD (128×128@10) interlaced
  • 32773: AHD-DBG-6 Avid_HD (480×320@10) interlaced
  • 36864: AHD-DBG-3 Avid_HD (720×512@8) interlaced

If you look at this table you can see more formats than libavcodec currently supports. The unsupported formats are the debug ones, the interlaced ones and those not belonging to the Avid_HD family.

While I fully approve of not supporting interlaced formats, the rest can be supported (especially if samples are provided).

Sigh, I’ve looked at too many intermediate codecs.

Sheer Madness

Wednesday, May 22nd, 2013

(luckily there’s not much left of this month of intermediate codecs)

So I’ve looked at another intermediate codec; the post title hints at both its name and design. The coding scheme is rather simple: lines are coded either in raw form or with prediction (from the left neighbour for the top line, or (3 * L + 3 * T - 2 * TL) >> 2 for other lines), and the prediction error is coded with fixed Huffman codes.
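For clarity, the predictor spelled out as code (the Huffman decoding of the residuals is omitted; `L`, `T` and `TL` are the left, top and top-left neighbours as in the formula above):

```python
def predict(left, top, topleft, first_line):
    """Sheer-style pixel prediction: plain left neighbour on the top line,
    a weighted blend of left/top/top-left neighbours elsewhere."""
    if first_line:
        return left
    return (3 * left + 3 * top - 2 * topleft) >> 2
```

E.g. with neighbours left=10, top=12, top-left=11 the prediction is (30 + 36 - 22) >> 2 = 11.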

Simple, right?

Here’s the catch: it supports an insane number of formats, both for storage and for output, and there’s an insane number of decoding functions for converting format X into format Y.

So quite probably no decoder from me: not interesting enough and too tedious.

ProRes alpha support is almost there

Friday, May 17th, 2013

I’ve finally brought myself to look at alpha plane decoding support for ProRes. It was a bit peculiar but rather easy to reverse engineer. Now I only need to update my ConsumerRes decoder to support it.

And that’s probably enough for the month of intermediate codecs.

A Well-designed Intermediate Codec

Sunday, May 12th, 2013

The adjective refers to the hype that the company that made this codec is run by designers (unlike some other companies where even the design is done by developers or, even worse, marketers). Let’s call it AWIC, or iNtermediate codec for short, and not mention its name at all.

It is a rather old codec; it codes 8-bit YUV420 in 16×16 macroblocks with DCT, quantisation and static codes. The frame is divided into slices in such a way that there are no more than 32 slices on one line (and slice height is one macroblock). The main peculiarity is the scalable mode: every macroblock starts with an 8×8 sub-macroblock (i.e. an 8×8 luma block and two 4×4 chroma blocks) followed by the data for the rest of the block, and this is exploited for decoding frames in half-width, half-height or half-width half-height modes.

Maybe I’ll write a decoder for it after all.

Final Words on Canopus HQ, HQA and HQX

Thursday, May 9th, 2013

Astrologers proclaim the month of intermediate codecs. The number of blog posts about intermediate codecs doubles! (from zero)

Let’s look at Canopus codecs and their development.

Canopus Lossless

This is a very simple lossless video codec: you have a code table description and, for each component, a coded difference from the left neighbour (or the top one for the first pixel). For RGBA and YUV there are slight variations but the overall coding remains the same.
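A sketch of reconstructing one component plane from decoded residuals. The code-table handling is omitted, and taking zero as the predictor for the very first pixel is an assumption.

```python
def reconstruct_plane(residuals, width, height):
    """Rebuild a component plane: left-neighbour prediction, falling back to
    the top neighbour at the start of a row (and zero for the first pixel)."""
    plane = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if x == 0:
                pred = plane[y - 1][0] if y > 0 else 0
            else:
                pred = plane[y][x - 1]
            plane[y][x] = pred + residuals[y][x]
    return plane
```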

Canopus HQ

This is an ordinary intermediate codec employing DCT with 16×16 macroblocks in 4:2:2 format and interlacing support.
It has predefined profiles with frame sizes (from 160×120 to 1920×1080), numbers of slices and macroblock shuffling orders (yes, like DV it decodes macroblocks in shuffled order).

Block coding is nothing special but quantisation is. For every macroblock one of sixteen quantiser sets can be selected, and for each block one of four quantising matrices can be selected from that set. Of course there are different quantiser matrices for luma and chroma (i.e. 128 quantising matrices in total, about 80 of them unique).

Interlacing is signalled per block (in case it’s enabled for the frame).

Canopus HQA

This is Canopus HQ with alpha support. The main differences are a flexible frame size (no hardcoded profiles), an alpha component in macroblocks and a coded block pattern. Coding and tables seem to be the same as in HQ.

The coded block pattern specifies which of the 4 luma blocks are coded (along with the corresponding alpha and chroma blocks). Uncoded blocks are filled with zeroes (i.e. totally transparent).

Canopus HQX

This codec combines both previous codecs and extends them to support more formats. While HQ was 4:2:2 8-bit, this one can be 4:2:2 or 4:4:4 with 9-, 10- or 11-bit depth (with or without alpha).

There are changes in overall and block coding.

The frame is now partitioned into slices of 480 macroblocks, and every 16 macroblocks are shuffled.

Blocks now have more adaptive coding. DCs are coded as differences to the previous one (inside a macroblock component), and instead of being coded as 9-bit numbers they are now Huffman-coded, with the table selected depending on the component bit depth. Quantisation is split: now there’s a selectable quantiser and two quantiser matrices (for luma/alpha and for chroma). AC codes are selected depending on the quantiser chosen for the block. So there are fewer quantiser matrices (two instead of seventy-eight) but more VLC tables (CBP + 3 DC + 6 AC tables instead of CBP and a single AC table).


Reverse engineering all those formats was easy because they are not complex, obfuscated or C++ (which is usually both).

Shall I write decoders for them? Unlikely. The codecs are not too interesting (I’ve seen only one Canopus HQ sample and no HQA or HQX samples at all) and rather tedious to implement because of all those tables. And we already have a Canopus Lossless decoder.