The war has shifted to a terrorist operation against Ukrainian civilians (with no change for my home city, it gets several strikes from russian territory every day regardless) and instead of threatening the world with nuclear war russia threatens the world with nuclear terrorism using the captured Ukrainian nuclear power plant. So here’s yet another attempt to distract myself from thoughts about it.
It seems that AVS3 has been standardised already (and nobody cares). So out of idle curiosity I’ve downloaded the spec from avs.org.cn (in Chinese of course, and it requires you to fill in some information but accepted any garbage). So let’s look at this completely original format that has borrowed nothing from either H.266 or AV1.
Overall structure seems to be the same as in H.265/H.266 codecs: frames of different types (I/P/B) partitioned into patches (or slices for you non-Chinese) split into coding units that can be recursively split further down, those CU partitions can be predicted either from neighbouring blocks or from blocks in referenced frames and the differences can be transformed somehow and coded with a method inherited from the older codec. Nothing surprising so far.
H.EVC could split partitions into squares or 2NxN/Nx2N rectangles, H.266 added ternary splits in the form of “strip of 1/4 width, strip of 1/2 width, strip of 1/4 width”, AV1 added ternary splits like “split the block in half and divide one of the halves into blocks again”, AVS3 went further and made a “strip of 1/4 width, strip of 1/2 width split in half, strip of 1/4 width” partitioning scheme.
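For those who prefer rectangles to words, here is a rough sketch of the two strip-based splits described above (vertical variants only). The function names, the `(x, y, width, height)` tuple layout and the assumption that the middle strip is halved across its height are all mine, not something taken from either spec.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout

def ternary_split(x: int, y: int, w: int, h: int) -> List[Rect]:
    """H.266-style vertical split: 1/4, 1/2 and 1/4 width strips."""
    q = w // 4
    return [
        (x, y, q, h),
        (x + q, y, 2 * q, h),
        (x + 3 * q, y, w - 3 * q, h),
    ]

def extended_split(x: int, y: int, w: int, h: int) -> List[Rect]:
    """AVS3-style split as described in the text: same strips as above,
    but the middle strip is cut in half (assumed here: across its height)."""
    q = w // 4
    half = h // 2
    return [
        (x, y, q, h),
        (x + q, y, 2 * q, half),
        (x + q, y + half, 2 * q, h - half),
        (x + 3 * q, y, w - 3 * q, h),
    ]

if __name__ == "__main__":
    for name, parts in (("ternary", ternary_split(0, 0, 64, 64)),
                        ("extended", extended_split(0, 0, 64, 64))):
        print(name, parts)
```

Either way the sub-blocks tile the parent exactly, so the difference is only in how many pieces you get (three versus four) and what shapes they have.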
Intra prediction in AVS3 seems to be rather standard: DC, plane, various angles, bilinear and PCM (and the code for intra prediction features MIPSomething variables which makes me think about H.266 again). Oh, and of course there’s Two-Step Cross-component Prediction Mode which is definitely not CCLM in H.266 or Chroma-from-Luma in AV1.
Inter prediction allows warped motion compensation (with two- or three-vector affine transform, like H.266 or AV1). The overview page also boasts history-based MV prediction (where a table of previously seen motion vectors is maintained and used for prediction) but that requires a much more thorough look than I’m willing to take.
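The general idea of a history table is simple enough to sketch without reading the spec properly. The following is only an illustration of “keep a small table of recently seen motion vectors”: the class name, the table size of 8, the move-to-front handling of duplicates and the newest-first candidate order are my assumptions, not what AVS3 actually mandates.

```python
from collections import deque
from typing import Deque, List, Tuple

MV = Tuple[int, int]  # (dx, dy), hypothetical representation

class MvHistory:
    """Bounded table of recently used motion vectors (illustrative only)."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity
        self.table: Deque[MV] = deque()

    def push(self, mv: MV) -> None:
        """Record a motion vector after a block has been coded."""
        if mv in self.table:
            # Already known: move it to the "most recent" end.
            self.table.remove(mv)
        elif len(self.table) == self.capacity:
            # Table is full: forget the oldest entry.
            self.table.popleft()
        self.table.append(mv)

    def candidates(self) -> List[MV]:
        """Prediction candidates, most recently used first."""
        return list(reversed(self.table))

if __name__ == "__main__":
    hist = MvHistory(capacity=3)
    for mv in [(1, 0), (2, 0), (1, 0), (3, 0), (4, 0)]:
        hist.push(mv)
    print(hist.candidates())
```

A real codec would store more than the vector itself (reference indices at least) and would have rules for when the table is reset, which is exactly the part that needs the thorough look.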
Coefficient transforms are DCT and DST which makes me immediately think about AV1 (even if the transform matrices are different; H.266 uses a different set of transforms). Coefficient coding itself seems to be the same as in AVS2 but I’m too lazy to check that.
Post-filtering consists of SAO and ALF, just like H.266. But there’s a thing that makes AVS3 superior to both H.266 and AV1: it has integrated support for NN filters! It allows hooking up to four filters that can be applied to whole frames, planes or individual coding units in them. And it’s like with film grain in AV1, they’re completely optional and do not affect decoding but the decoder may decide to apply something afterwards. That’s what I call future-proof thinking (and I’m glad it’s not up to me to implement such a decoder anyway).
Overall, AVS3 is the expected successor of the AVS codec line: an H.26x rip-off (though like AVS2 it borrowed a bit from AV1 as well) with some original ideas but not too many of them. I’ve looked at it and see no reason to look at it again.
I’m coding a new reverse engineering tool in Rust. Feel free to contact me if you are interested in the code.
Sadly now is not the time but I’m looking forward to trying it.
Interesting. The https://github.com/uavs3/uavs3d decoder has been around for a while.
Indeed, but nobody cares. Still, for me nothing beats wavelet-based lossless AAC (a Chinese codec standardised by IEEE).