About upcoming AV2…

So today I’ve seen an article titled AV2 Video Codec — Early Performance Evaluation of the Research which of course has drawn my attention.

Funnily enough, it is a sponsored article, and it was written by three engineers from ViCueSoft. This is strange, but so far it still looks more promising than the original AV1 feature review article with over 20 authors and too much marketing in it (my review of it is here; and to be fair, it was followed by a more serious paper with fewer authors, but this one exists as well). Anyway, let’s see what is presented here.

I don’t care much about the performance, so I’ll just quote a phrase from the conclusion: “…rough approximation shows only 1.2x times encoding complexity increase and 1.4x time decoding”. I find it a bit strange that decoding complexity grows more than encoding complexity: normally you’d expect encoding difficulty to rise faster because of how modern codecs work (an encoder has to search for the best combination of coding tools and their parameters, and then apply the same steps as the decoder does so that its coded frame ends up in the same state the decoder would have). Let’s look at the features then; they are the most interesting part to me anyway.

  • distant weighted compound mode and dual interpolation filter are removed;
  • semi-decoupled partitioning is introduced: this feature allows splitting luma and chroma blocks and coding their contents independently below a certain level. The paper also says there’s a Dual Tree feature in VVC that does the same;
  • quantiser step overhaul: instead of the six tables in AV1 you now have just one simple formula for all quantiser steps;
  • extending motion sample selection to work with compound blocks as well;
  • more partitioning modes to be more like HEVC;
  • multiple reference line selection for intra prediction: allows you to select not just the neighbouring row/column for directional intra prediction. The same tool exists in VVC. It also reminds me of X8 frames in WMV2/WMV9, which are the first case of intra prediction using more than one line that I know of;
  • offset-based intra prediction refinement—adding some offset to the top/left intra predicted edge of the block to make it even smoother (the offset is calculated from the neighbouring blocks as well);
  • intra secondary transform: this tool tries to improve compression by applying a special secondary transform to the low-frequency coefficients. VVC has the low-frequency non-separable transform doing the same;
  • simplifications in intra mode signalling;
  • some improvements in motion prediction coding;
  • cross-component sample offset—another chroma-from-luma tool: for the whole CTU between deblocking and CDEF stages a DC offset is calculated from the luma values and applied to chroma values.
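
To make the multiple reference line idea from the list more concrete, here is a minimal sketch for the simplest directional mode (pure vertical). It is not taken from the article or from any AV2 or VVC code: the function names, the limit of three candidate lines and the SAD-based choice are all my assumptions for illustration. The point is only that the predictor may copy a row further away from the block when the adjacent one is a poor match.

```python
def vertical_intra_predict(recon, x0, y0, size, ref_line=0):
    """Predict a size x size block at (x0, y0) by copying one already
    reconstructed row above it straight down.
    ref_line=0 is the adjacent row, 1 is the row above that, and so on."""
    ref_y = y0 - 1 - ref_line
    if ref_y < 0:
        raise ValueError("reference line lies outside the frame")
    ref_row = recon[ref_y][x0:x0 + size]
    return [list(ref_row) for _ in range(size)]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def pick_reference_line(recon, source, x0, y0, size, max_lines=3):
    """Hypothetical encoder-side choice: try each available reference line
    and keep the one with the lowest SAD against the source block.
    Returns (ref_line, cost), or None if no row above the block exists."""
    original = [row[x0:x0 + size] for row in source[y0:y0 + size]]
    best = None
    for ref_line in range(max_lines):
        if y0 - 1 - ref_line < 0:
            break  # ran out of rows above the block
        pred = vertical_intra_predict(recon, x0, y0, size, ref_line)
        cost = sad(pred, original)
        if best is None or cost < best[1]:
            best = (ref_line, cost)
    return best
```

A real encoder would weigh the cost of signalling the chosen line against the gain and would apply this to all directional modes, not just the vertical one; the decoder only needs the `vertical_intra_predict` half with the line index read from the bitstream.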

Essentially there are three kinds of improvements: simplification or generalisation of an existing feature (including its complete removal; I approve of either), picking a tool used by VVC/H.266 (that approach works but lacks originality) and an occasional improvement of an existing tool (too few and not too original). Of course nobody knows when AV2 will be declared finished, and some things will surely have changed by then, but I don’t expect radical changes.

I once said that I’d review H.266 when AV2 is released, but these guys have essentially done my work for me. Thanks!
