As you might’ve heard, MPEG is essentially no more. The last noticeable thing it did related to video coding was MPEG-5 (plus the unholy unity of MPEG-G and MPEG-4 standards for synthesising actors and issuing commands to them). As a result we have an abuse of the letter ‘e’: in HEVC, EVC and LCEVC it means three different things. I’ll probably talk about VVC when the AV2 specification is available, EVC is slightly enhanced AVC, and LCEVC is interesting. And since I was able to locate the DIS for it, why not give it a review?
LCEVC is based on Perseus and as such it’s still an interesting concept. For starters, it is not an independent codec but an enhancement layer to add scalability to other video codecs, somewhat like video SBR but hopefully it will remain more independent.
A good deal of the specification is copied from H.264, probably because nobody in the industry can take a codec without NALs, SEIs and HRD seriously (I know their importance but here it still feels excessive). Regardless, here is what I understood from the description while suffering from thermal throttling.
The underlying idea is quite simple and hasn’t changed since Perseus: you take a base frame, upscale it, add the high-frequency differences and display the result. The differences are first grouped into 4×4 or 8×8 blocks, transformed with a Walsh-Hadamard matrix or a modified Walsh-Hadamard matrix (with some coefficients zeroed out), quantised and coded. Coding is done in two phases: first there is a compaction stage where coefficients are turned into a byte stream with flags for zero runs and large values (or RLE just for zeroes and ones), and then that stream can be packed further with Huffman codes. I guess there are essentially two modes: a faster one where coefficient data is stored as bytes (with or without RLE) and a slightly better compressed mode where those values are further packed with Huffman codes generated per tile.
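To make the transform part concrete, here is a minimal sketch of an unnormalised 4×4 Walsh-Hadamard transform of the kind applied to the residual blocks. The function names and the lack of scaling are my own choices, and the modified variant with zeroed coefficients is not shown; consult the spec for the real matrices.

```c
/* 1-D 4-point Hadamard butterfly (unnormalised). */
static void hadamard4(const int in[4], int out[4])
{
    int a = in[0] + in[1];
    int b = in[0] - in[1];
    int c = in[2] + in[3];
    int d = in[2] - in[3];
    out[0] = a + c; /* sum of all samples ends up in the DC slot */
    out[1] = b + d;
    out[2] = a - c;
    out[3] = b - d;
}

/* 2-D transform of a residual block: rows first, then columns. */
static void hadamard4x4(int blk[4][4], int coeff[4][4])
{
    int tmp[4][4];
    for (int i = 0; i < 4; i++)
        hadamard4(blk[i], tmp[i]);
    for (int j = 0; j < 4; j++) {
        int col[4], res[4];
        for (int i = 0; i < 4; i++)
            col[i] = tmp[i][j];
        hadamard4(col, res);
        for (int i = 0; i < 4; i++)
            coeff[i][j] = res[i];
    }
}
```

For a flat block all the energy lands in the top-left coefficient, which is exactly what makes the later zero-run coding effective.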
Overall this looks like a neat scheme and I hope it will have at least some success. No, not to prove Chiariglione’s new approach of introducing codecs an industry can use without complex patent licensing, but rather because it might be the only recent major video codec built on principles different from the H.26x line, and its success may encourage more radically different codecs, making my codec world less boring.
First I’ve heard of LCEVC. I’m so behind on multimedia tech…
I fear that it will be ignored even if it’s MPEG-5 video instead of all those MPEG-H HEVC or MPEG-I VVC. But I guess somebody has to look at the exotic codecs and that’s me.
> I’ll talk about VVC probably when AV2 specification is available,
Isn’t that quite far off? I don’t think we will even see the final AV2 specification in 2021, judging from the mess they made with AV1.
First of all, I’m in no hurry 😉
Since it’s more interesting to compare and contrast several specifications I’m willing to wait until AV2 is ready. Or AVS4, I’m not that picky.
It’s quite cool, and performs well too. One nice benefit is that since you typically use a subsampled H.264 layer as the base, the actual H.264 encoding is very fast (or you can throw --preset placebo at it) and decoding is easier too, which is good for batteries.
The quality gain at typical OTT operating points (2-5 Mbps or so) is really good. Close to HEVC performance in most cases, though it can be a bit tuning-dependent.
From what I heard, V-Nova has an encoder with tuning presets for different scenarios, and they want to release it too, so hopefully people can soon try it for themselves.
Is there a trick during quantisation? I just do a Walsh-Hadamard transform, quantise with a simple LUT, then run-length code, but the result is still large.
Quantisation is the trick: if you apply it then what you should get is mostly zeroes, which can be easily compressed with RLE. And mind you, here it tries to compress the differences between the original frame and one that was down-scaled and then up-scaled again, and those differences should be small to begin with.
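To illustrate, here is a minimal dead-zone quantiser sketch (the names and thresholds are mine, not the spec’s): everything with magnitude below the dead zone maps to zero, so the RLE stage gets long zero runs to chew on.

```c
#include <stdlib.h>

/* Dead-zone quantiser: residuals below the dead zone become zero,
 * the rest become small quantised indices. Illustrative only. */
static int quantise(int value, int step, int deadzone)
{
    if (abs(value) < deadzone)
        return 0;
    return value / step; /* C division truncates toward zero */
}

/* Count the zeroes after quantisation to show why RLE pays off. */
static int count_zeroes(const int *res, int n, int step, int deadzone)
{
    int zeroes = 0;
    for (int i = 0; i < n; i++)
        if (quantise(res[i], step, deadzone) == 0)
            zeroes++;
    return zeroes;
}
```

Since upscaling residuals are mostly small, nearly every coefficient falls into the dead zone and the stream is dominated by zero runs.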
I’ve read through the LCEVC documents, but I’m still very confused about the dequant process. There are too many variables like DDbuffer, qm, step_width_modifier, applied_offset etc. Can you give me some hints about the quantisation and dequantisation? I’ve tried to contact the V-Nova team, but got no feedback.
I cannot claim that I studied the LCEVC specification thoroughly and understood it in full but here’s my view on how it should work.
Quantisation can be done to ignore very small near-zero values, so you have something like

value = qval > 0 ? qval * quant + offset : (qval < 0 ? qval * quant - offset : 0);

and the area -offset..offset is called the “dead zone”. step_width_modifier seems to serve a similar role. Overall it seems to go this way: you decode qm_coefficient_{0,1}, use them to generate or update QuantScaleDDbuffer for all three possible modes, which you then use to generate the actual quantisation matrix applied to a single tile during reconstruction, along with StepWidthModifier and applied_offset. Why is it so complicated? I guess it’s done to save bits on transmitting the full quantisation matrix for each frame. It is a lot like many codecs where you have an 8×8 quantisation matrix (or two, for luma and chroma) transmitted once, which is then multiplied by a quantiser that may change per macroblock and used to restore coefficients.
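That familiar pattern can be sketched like this (shown on 4×4 blocks for brevity; the function name and the plain multiply are illustrative, not the actual LCEVC derivation):

```c
/* Common codec pattern (not the exact LCEVC maths): a base quantisation
 * matrix is transmitted once, then scaled by a per-block quantiser
 * before being used to restore coefficients. */
static void dequant_block(int coeff[4][4], int base_qm[4][4],
                          int quantiser, int out[4][4])
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            out[i][j] = coeff[i][j] * base_qm[i][j] * quantiser;
}
```

Only the single quantiser value has to be signalled when it changes, which is exactly the bit saving mentioned above.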
Maybe if you implement a decoder for LCEVC by following the specification you’ll understand how it works. And then encoding is simply a matter of writing stuff that the decoder understands 😉
Hi James,
I came across your question regarding the de-quantisation in LCEVC. I’m with V-Nova and happy to help with this. The de-quantisation is quite straightforward, and many of the parameters are only used to slightly change the behaviour for different configurations of the decoder.
Let me walk you through the main steps of the de-quantisation:
1. As Kostya stated correctly, qm_coefficient_{0,1} are used to update QuantScaleDDBuffer. This avoids transmitting the full quantisation matrix but allows changing the behaviour of the quantisation if needed.
2. As a next step, the actual quantisation matrix qm is calculated. It depends on the values stored in QuantScaleDDBuffer and the transmitted stepWidth. The stepWidth can be compared to the quantisation parameter (QP) in other codecs.
3. The stepWidthModifier is calculated from the quantisation matrix qm and pre-defined constants.
4. The value appliedOffset is derived from stepWidthModifier, qm and pre-defined constants. The sign of this value depends on the value of the quantised coefficient. It is used as a deadzone in order to ignore values near zero.
5. Finally, the de-quantised coefficient is calculated as out = coeff * (qm + stepWidthModifier) + appliedOffset.
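The final step can be sketched as follows (a toy version: the derivation of qm, stepWidthModifier and appliedOffset from steps 1-4 is omitted, and the names merely mirror the walkthrough rather than the spec’s exact syntax):

```c
/* Sketch of step 5 only; qm, step_width_modifier and applied_offset
 * would come out of steps 1-4. The offset takes the sign of the
 * coefficient, acting as the dead zone described above. */
static int dequantise(int coeff, int qm, int step_width_modifier,
                      int applied_offset)
{
    if (coeff == 0)
        return 0;
    int offset = (coeff > 0) ? applied_offset : -applied_offset;
    return coeff * (qm + step_width_modifier) + offset;
}
```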
I hope this helps, but happy to explain further!
Hi Florian,
Kostya suggested that I’ll understand how it works if I implement a decoder for LCEVC step by step. Is there any standard LCEVC bitstream and documentation that I can use to test my decoder?
Hi James,
Sure! Happy to discuss this further. Just drop me an email at florian.maurer [at] v-nova.com
Thanks!
[…] of how different organisations build essentially the same codec using slightly different methods. MPEG-5 LCEVC was a nice change […]