Archive for the ‘Useless Rants’ Category

About upcoming AV2…

Friday, August 6th, 2021

So today I’ve seen an article titled AV2 Video Codec — Early Performance Evaluation of the Research which of course has drawn my attention.

Fun things are that it is a sponsored article and that it’s written by three engineers from ViCueSoft. This is strange, but so far it still looks more promising than the original AV1 feature review article with over 20 authors and too much marketing in it (my review of it is here; and to be fair it was followed by a more serious paper with fewer authors but this one exists as well). Anyway, let’s see what is presented here.

I don’t care about the performance much so I just quote the phrase from the conclusion: “…rough approximation shows only 1.2x times encoding complexity increase and 1.4x time decoding”. I find the increase in decoding complexity being larger than the increase of encoding complexity a bit strange, normally you’d expect encoding difficulty rising faster because of the nature of the coding approach in modern codecs (normally an encoder needs to search for the best combination of encoding tools and their parameters and then apply the same steps as decoder does in order to have a coded frame in the same state as decoder would have it). Let’s look at the features then, it’s the most interesting part to me anyway.

  • distant weighted compound mode and dual interpolation filter are removed;
  • semi-decoupled partitioning is introduced: this feature allows splitting luma and chroma blocks and coding their contents independently under a certain level. The paper also says there’s a Dual Tree feature in VVC that does the same;
  • quantiser step overhaul: instead of six tables in AV1 now you have just one simple formula for all quantiser steps;
  • extending motion sample selection to work with compound blocks as well;
  • more partitioning modes to be more like HEVC;
  • multiple reference line selection for intra prediction—allows you to select not just neighbouring row/column for directional intra prediction. The same tool exists in VVC. And it also reminds me of X8 frames in WMV2/WMV9, that is the first case of intra prediction using more than one line known to me;
  • offset-based intra prediction refinement—adding some offset to the top/left intra predicted edge of the block to make it even smoother (the offset is calculated from the neighbouring blocks as well);
  • intra secondary transform—this tool tries to improve compression by applying a special secondary transform to the low-frequency coefficients. VVC has low-frequency non separable transform doing the same;
  • simplifications in intra mode signalling;
  • some improvements in motion prediction coding;
  • cross-component sample offset—another chroma-from-luma tool: for the whole CTU between deblocking and CDEF stages a DC offset is calculated from the luma values and applied to chroma values.

Essentially there are three kinds of improvements: simplification or generalisation of the existing feature (including complete removal of it—I approve either), picking the tool used by VVC/H.266 (that approach works but lacks originality) and an occasional improvement of an existing tool (too few and not too original). Of course nobody knows when AV2 will be declared finished and some things will surely have changed by then, but I don’t expect radical changes.

Once I said that I’d review H.266 when AV2 is released but these guys have essentially done my work for me. Thanks!

Why codecs are designed like this and why they are not very interchangeable

Monday, August 2nd, 2021

Sometimes I have to explain the role of various codecs and why it’s pointless in most cases to adapt compression tricks from image codecs to audio codecs (and vice versa) and even from lossy to lossless codecs in the same content. If you understand that already then you’ll find no new information here.

Yours truly
Captain Obvious
(more…)

Rust needs proper stand-alone assembler support

Tuesday, July 27th, 2021

Back when I gave my arguments why I don’t consider Rust a mature language, one of those arguments was that it lacks proper assembler support, and a systems programming language requires it since some of the tasks you need to perform (including optimisation) require as low-level access as you can get. Here I would like to argue that while asm!{} may be enough for most cases, it’s definitely not enough for mine.
(more…)

Looking for formats to look at

Wednesday, July 21st, 2021

As I mentioned in one of the previous posts, I’ve achieved all the goals for NihAV that I initially set except for trying to write a proper encoder (and no, world domination has never been on my list). Unfortunately this will not be such an easy thing to do and I’d like to have a distraction from time to time.

Usually I distracted myself with reverse-engineering some format and maybe implementing decoding support for it in NihAV but recently I realized that I ran out of low-hanging fruit. There should be interesting codecs and game containers out there still waiting for their chance but I could not remember anything. I even went through ScummVM code and documented video formats from there in The Wiki, that’s how bored I was.

So I’d be grateful if somebody could point me to a thing to RE. Last time, when Peter drew my attention to VGM/XVD, it turned out to be a very fulfilling experience.

Here’s a ship being loaded

Thursday, July 8th, 2021

Since I admitted to myself that I feel old, I guess it’s my solemn duty to yell at clouds from time to time.

I consider satire to be the most realistic depiction of the world (unless it’s a pasquinade) and as Shakespeare put a phrase in a mouth of one of the King Lear characters, jesters do oft prove prophets.

For example, I still like to re-read various satirical pieces by Ilf and Petrov (probably the best Soviet satirists in the first half of XXth century) and one of those pieces gave a title to this post because of its similarity to what I want to talk about.

That short story mentions a game “loading a ship” (here’s the title) where a group of bored people tries to “load a ship” by calling things starting with the same pre-defined letter (lamps, Lilliputians, locomotives, liquors etc etc) until people can’t recall any more things starting with that letter or somebody suggests a name and people start to argue if that should be allowed. The story itself is about one man who metaphorically loaded a ship by inventing new and new social activities until at some point it was discovered that nobody remembers what his direct duties should be and he was fired.

And here we transition to the Linux Foundation. Previously I thought it should promote Linux adoption, make some Linux-related standards and pay money to the main Linux developers and maintainers so they can work on it full-time. But news in the last couple of months showed me I was wrong.

First there was a piece of news about AgStack (Linux for Agriculture). Then there was a piece of news about Linux Foundation organising some unrelated initiative for Microsoft (while not participating in it itself). And finally there’s a piece of news about Open Voice Network, an initiative for ethical standards of voice assistants.

This made me wonder not just why I encounter such news but also what the foundation really does. The answer did not make me more optimistic.

So it seems that Linux Foundation is transitioning from a foundation that does what I expected it to do to an organisation that does IaaS (initiative-as-a-service): you pay them and they prepare all required infrastructure to have it all running, just invite some organisations to participate. Besides that, what are those member companies paying for? I don’t know the official explanation but to me it looks like three major categories, to put it very bluntly and impolitely: bribes, extortion money and indulgences. Bribes are what you pay to affect the politics of the foundation in a way favourable to you (maybe it’s called lobbying, maybe such things do not happen at all; give me facts and I’ll amend this post). Extortion money is what you have to pay to participate in various standardising activities (aka membership fees, no different from any other standardising activity). Indulgences are money you pay to avert attention from your GPL violations regarding the kernel sources (just search for GPL violators and Linux Foundation, you’ll find not just Chinese but American companies mentioned there; at least in one case the known GPL violator hasn’t published its modified kernel sources after becoming a member).

I could not find any reports about the foundation beside their 2020 annual report so I can only speculate how it has changed in the past regarding income, spending and stuff. Yet I can foresee three scenarios on how it may develop in the future:

  1. Shifting to a different activity. Even now actual Linux-related things seem to be less than a half of what the foundation does, so maybe in the future it will realise that Linux is a legacy it doesn’t care about any longer and maybe even rename itself to reflect its new main occupation;
  2. Fossilising. Linux Foundation may exist in the future but both its activity and being a member will become a tradition that nobody understands but they’ll keep doing it because it’s a tradition;
  3. Withering. While Linux is a relatively popular kernel, nobody can guarantee it will remain popular in the future. Linus himself is not immortal and his successor may be not as skilled, we have IBM trying to replace Linux kernel with systemd while keeping the name, or maybe some other reason will hurt Linux popularity. Or some other kernel will replace it in its major niches (I can easily imagine Android being rebased on Fuchsia; the same may happen to servers as well). As a result Linux and Linux-related things will become uninteresting to most companies and they’ll drop their financial support.

You may say that there might be a scenario where Linux Foundation will concentrate on Linux-only stuff but I’m sceptical. AgStack is a clear signal they have run out of ideas where to promote Linux but in accordance with Parkinson’s law they’ll still keep growing by inventing new stuff as long as there’s enough income to employ more people, organise more conferences and such.

As usual, I’ll be happy to be proved wrong.

On the Origin of Bloatware

Friday, June 11th, 2021

This is inspired by both a private discussion on why modern computing is so complex and my migration from Ubuntu 12.04LTS to systemd 20.04LTS.

Since I’ve finally changed from my less than ten years old operating system to something more modern I’ve noticed that it became noticeably slower (not irritatingly slower but slower nevertheless) except for Firefox (which is probably not because of JS engine improvements but rather because of native execution of now supported APIs instead of polyfills). And trying various desktop environments before settling on Cinnamon I was horrified by how bloated and unusable (to me) they are. My friends complain about modern technology demanding more effort to maintain because of complexity and weird interdependencies, while it’s supposed to make your life easier. So why is it like that?

For a keen reader the title of this post contains the answer. For the rest I’ll elaborate it below.
(more…)

Why I still like C and strongly dislike C++

Wednesday, May 26th, 2021

This comes up in my conversations surprisingly often so I thought it’s worth writing my thoughts down instead of repeating them again and again.

As is common with C programmers, C was neither my first nor my last language, but I still like it and when I have to write programs I do it in C. Meanwhile I try to be aware of modern (and not so modern) programming languages and their trends and write my own multimedia-related hobby project in Rust. So why have I not moved to anything else yet and how does C++ come into all this?
(more…)

ZMBV support in NihAV and deflate format fun

Saturday, May 22nd, 2021

As I said in the previous post, I wanted to add ZMBV support to NihAV, mostly because it is a rather simple codec (which means I can write a decoder and an encoder for it without spending too much time), it’s lossless and supports various bit-depths too (which means I can encode various content into it preserving the original format).

I still had to improve my deflate support (both decompressing and compressing) a bit to support the way the data is stored there. At least now I mostly understand what various flags are for.

First of all, by itself deflate format specifies just a bitstream split into blocks of data that may contain any amount of coded data. And these blocks start at the next bit after the previous block has ended, with no byte aligning except by chance or after a copy block (which aligns bitstream before storing length and block contents).

Then, there is the raw format used inside various formats (like Zip or gzip) and there’s the zlib format used in most cases where data is stored as part of some other format (that means you have two initial bytes like 0x78 0x5E and 2×2 bytes of checksum in the end).

So, ZMBV uses an unterminated stream format: the first frame contains the zlib header plus one or several blocks of data padded with an empty copy block to the byte boundary, the next frame contains the continuation of that stream (also one or more blocks padded to the byte boundary) and so on. This is obviously done so you can decode frames one after another and still exploit the redundancy from the previously coded frame data if you’re lucky.
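To make the zlib wrapper a bit more tangible, here is a minimal sketch of what those bytes mean according to RFC 1950. The function names are made up for illustration and have nothing to do with the actual NihAV code. The header check explains why 0x78 0x5E is a valid pair, and the checksum is Adler-32, which is really two 16-bit running sums glued together (hence 2×2 bytes).

```rust
/// Returns true if the two bytes look like a zlib (RFC 1950) header
/// announcing deflate-compressed data.
fn looks_like_zlib_header(cmf: u8, flg: u8) -> bool {
    // Low nibble of the first byte is the compression method, 8 means deflate.
    let method_is_deflate = (cmf & 0x0F) == 8;
    // High nibble is log2(window size) - 8, values above 7 are not allowed.
    let window_ok = (cmf >> 4) <= 7;
    // Both bytes read as a big-endian 16-bit number must be a multiple of 31.
    let check_ok = (u16::from(cmf) * 256 + u16::from(flg)) % 31 == 0;
    method_is_deflate && window_ok && check_ok
}

/// Adler-32 of the uncompressed data, stored big-endian at the stream end.
fn adler32(data: &[u8]) -> u32 {
    const MOD_ADLER: u32 = 65521;
    let mut a: u32 = 1; // running sum of bytes
    let mut b: u32 = 0; // running sum of the running sums
    for &byte in data {
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    (b << 16) | a
}
```

Taking the modulo after every byte is slow but keeps the sketch obviously overflow-free; real implementations defer it.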

Normally you would start decoding data and keep decoding it until the final block (there’s a flag in block header for that) has been decoded—or error out earlier for insufficient data. In this case though we need to decode data block, check if we are at the end of input data and then return the decoded data. Similarly during data compression we need to encode all current data and pad output stream to the byte boundary if needed.

This is not hard or particularly tricky but it demonstrates that deflated data can be stored in different ways. At least now I really understand what that Z_SYNC_FLUSH flag is for.
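As a small illustration of that padding: an empty copy (stored) block is three header bits, padding to the byte boundary, a 16-bit length of zero and its complement, so the frame data ends with the bytes 00 00 FF FF. The sketch below assumes the encoder pads exactly the way zlib does with Z_SYNC_FLUSH; the function name is hypothetical and this is only a sanity check, not a replacement for actually decoding the blocks (compressed data may end with the same bytes by coincidence).

```rust
/// Tail of an empty stored block after byte alignment:
/// LEN = 0x0000 followed by its complement NLEN = 0xFFFF.
const EMPTY_STORED_BLOCK_TAIL: [u8; 4] = [0x00, 0x00, 0xFF, 0xFF];

/// Checks whether a chunk of deflate data ends with the empty stored block
/// that a sync flush appends (an assumption about how the data was written).
fn ends_on_flush_boundary(frame: &[u8]) -> bool {
    frame.ends_with(&EMPTY_STORED_BLOCK_TAIL)
}
```

A decoder could use such a check to complain early about truncated frames, while an encoder has to produce that tail itself after the last real block of each frame.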

Missing optimisation opportunity in Rust

Wednesday, May 12th, 2021

While I’m struggling to write a video player that would satisfy my demands I decided to see if it’s possible to make my H.264 decoder a bit faster. It turned out it can be done with ease and that also raises the question concerning the title of this post.

What I did cannot be truly called optimisations but rather “optimisations” yet they gave a noticeable speed-up. The main optimisation candidates were motion compensation functions. First I shaved a tiny fraction of second by not zeroing temporary arrays as their contents will be overwritten before the first read.

And then I replaced the idiomatic Rust code for working with a block like

    for (dline, (sline0, sline1)) in dst.chunks_mut(dstride).zip(tmp.chunks(TMP_BUF_STRIDE).zip(tmp2.chunks(TMP_BUF_STRIDE))).take(h) {
        for (pix, (&a, &b)) in dline.iter_mut().zip(sline0.iter().zip(sline1.iter())).take(w) {
            *pix = ((u16::from(a) + u16::from(b) + 1) >> 1) as u8;
        }
    }

with raw pointers:

    unsafe {
        let mut src1 = tmp.as_ptr();
        let mut src2 = tmp2.as_ptr();
        let mut dst = dst.as_mut_ptr();
        for _ in 0..h {
            for x in 0..w {
                let a = *src1.add(x);
                let b = *src2.add(x);
                *dst.add(x) = ((u16::from(a) + u16::from(b) + 1) >> 1) as u8;
            }
            dst = dst.add(dstride);
            src1 = src1.add(TMP_BUF_STRIDE);
            src2 = src2.add(TMP_BUF_STRIDE);
        }
    }

What do you know, the total decoding time for the test clip I used shrank from 6.6 seconds to 4.9 seconds. That’s just three quarters of the original time!

And here is the problem. In theory if the Rust compiler knew that the input satisfies certain parameters, i.e. that there’s always enough data to perform the full block operation in this case, it would be able to optimise the code as well as the one I wrote using pointers or even better. But unfortunately there is no way to tell the compiler that input slices are large enough to perform the operation the required number of times. Even if I added a mathematically correct check in the beginning it would not eliminate most of the checks.

Let’s see what happens with the iterator loop step by step:

  1. first all sources are checked to be non-empty;
  2. then in outer loop remaining length of each source is checked to see if the loop should end;
  3. then it is checked if the outer loop has run not more than requested number of times (i.e. just for the block height);
  4. then it checks line lengths (in theory those may be shorter than block width) and requested width to find out the actual length of the inner loop;
  5. and finally inside the loop it performs the averaging.

And here’s what happens with the pointer loop:

  1. outer loop is run the requested amount of times;
  2. inner loop is run the requested amount of times;
  3. operation inside the inner loop is performed.

Of course those checks are required to make sure you work only with the accessible data but it would be nice if I could either mark loops as “I promise it will run exactly this number of times” (maybe via .take_exact() as Luca suggested but I still don’t think it will work perfectly for 2D case) or at least put code using slices instead of iterators into unsafe {} block and tell compiler that I do not want boundary checks performed inside.
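For anybody who wants to play with the two loops described above, here is a self-contained sketch that wraps both variants in functions and checks that they produce the same output. The function names, the stride value and the test pattern are all made up for this illustration; the real decoder looks different. Note that the pointer variant gets explicit asserts in front of the unsafe block and full stride * height buffers, so the pointers never step outside the allocations.

```rust
const TMP_BUF_STRIDE: usize = 32; // assumed value, just for the sketch

/// Iterator-based averaging, every step is length-checked.
fn avg_iter(dst: &mut [u8], dstride: usize, tmp: &[u8], tmp2: &[u8], w: usize, h: usize) {
    for (dline, (sline0, sline1)) in dst.chunks_mut(dstride).zip(tmp.chunks(TMP_BUF_STRIDE).zip(tmp2.chunks(TMP_BUF_STRIDE))).take(h) {
        for (pix, (&a, &b)) in dline.iter_mut().zip(sline0.iter().zip(sline1.iter())).take(w) {
            *pix = ((u16::from(a) + u16::from(b) + 1) >> 1) as u8;
        }
    }
}

/// Raw pointer averaging, sizes are checked once before the loops.
fn avg_ptr(dst: &mut [u8], dstride: usize, tmp: &[u8], tmp2: &[u8], w: usize, h: usize) {
    // These asserts are what makes the unsafe block below sound.
    assert!(w <= dstride && w <= TMP_BUF_STRIDE);
    assert!(dst.len() >= dstride * h);
    assert!(tmp.len() >= TMP_BUF_STRIDE * h && tmp2.len() >= TMP_BUF_STRIDE * h);
    unsafe {
        let mut src1 = tmp.as_ptr();
        let mut src2 = tmp2.as_ptr();
        let mut dst = dst.as_mut_ptr();
        for _ in 0..h {
            for x in 0..w {
                let a = *src1.add(x);
                let b = *src2.add(x);
                *dst.add(x) = ((u16::from(a) + u16::from(b) + 1) >> 1) as u8;
            }
            dst = dst.add(dstride);
            src1 = src1.add(TMP_BUF_STRIDE);
            src2 = src2.add(TMP_BUF_STRIDE);
        }
    }
}

/// Runs both variants on the same generated input and compares the results.
fn variants_agree(w: usize, h: usize) -> bool {
    let dstride = 24;
    let tmp: Vec<u8> = (0..TMP_BUF_STRIDE * h).map(|i| (i * 7 + 3) as u8).collect();
    let tmp2: Vec<u8> = (0..TMP_BUF_STRIDE * h).map(|i| (i * 13 + 250) as u8).collect();
    let mut out_iter = vec![0u8; dstride * h];
    let mut out_ptr = vec![0u8; dstride * h];
    avg_iter(&mut out_iter, dstride, &tmp, &tmp2, w, h);
    avg_ptr(&mut out_ptr, dstride, &tmp, &tmp2, w, h);
    out_iter == out_ptr
}
```

Calling variants_agree(16, 16) or similar should return true; timing the two functions on your own machine is left as an exercise, since the numbers depend heavily on the compiler version and the surrounding code.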

Update: in this particular case the input buffer size should be stride * (height - 1) + width, i.e. it is always enough to perform the operation in the way described above, but if you use .chunks_exact() the last line might not be handled, which is wrong.

The former is rather hard to implement for the common case so I don’t think it will happen anywhere outside Fortran compilers, the latter would cause conflicts with different Deref trait implementation for slices so it’s not likely to happen either. So doing it with pointers may be clunky but it’s the only way.
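The point from the update about .chunks_exact() is easy to demonstrate with numbers. In this small sketch (helper names are invented for the illustration, and the height is assumed to be at least one) a buffer of the minimal size stride * (height - 1) + width is split both ways and the number of lines each iterator yields is counted.

```rust
/// Smallest buffer that still holds a width x height block with the given
/// stride. Assumes height >= 1 and width <= stride.
fn min_block_buffer_len(stride: usize, width: usize, height: usize) -> usize {
    stride * (height - 1) + width
}

/// Number of lines produced by .chunks() and by .chunks_exact() respectively.
fn lines_seen(buf: &[u8], stride: usize) -> (usize, usize) {
    (buf.chunks(stride).count(), buf.chunks_exact(stride).count())
}
```

For a stride of 8 and a 4x4 block the buffer is 28 bytes long: .chunks(8) yields four lines (the last one being just four bytes, which is still enough for the block width) while .chunks_exact(8) yields only three and silently leaves the last line in its remainder.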

Le spam

Saturday, May 1st, 2021

Sometimes I look inside the Baidu Mail spam folder to see if anything useful got there by mistake (notifications from various shops with purchase confirmations end up there quite often, to give one example). And there’s a weird tendency I’ve spotted recently.

In the last five days I’ve received 47 spam mails. 37 of them were in French. I’m used to receiving spam in various European languages (including but not limited to Bulgarian, German, Italian, Russian and Spanish) but before last year it was mostly in English. Additionally a good deal of them are now about some promotional actions from supermarket chains like Aldi, Carrefour or Lidl (and I’ve never considered any of them to be some luxury store).

What’s wrong with this world?