Archive for the ‘Useless Rants’ Category

A look at another game codec

Monday, October 4th, 2021

This morning The Multimedia Mike told me he’s found yet another undiscovered game video codec used in some games by Imagination Pilots (probably it’s in no way related to the fact the codec has FOURCCs IPMA and IP20). Surprisingly there’s a fandom that has REd most of the game formats and made a reasonable assumption that the codec would use LZW in the same way as image resources do.

And it turns out they were right. Frames in both versions of the codec are raw image buffers compressed with LZW. Pixel value 0 is used there for transparency and considering that those animations may use a transparent background as well there’s no difference between frame types at all.

So people who REd the rest of the formats just missed the final step and if they had frame extractor they could simply dump frame data, try to decompress it and see the result. You don’t need to look inside the binary specification for that (I did and it was not that useful even if I could recognize LZW decompression functions there).

The rest of the post I want to dedicate to ranting about Ghidra failing to decompile it properly.

Is VP8 a Duck codec?

Friday, October 1st, 2021

There’s a blog out there with posts dedicated to the history of On2 (née Duck). And one particular post (archived version) brought an unsettling thought that refuses to leave me. Does VP8 belong to Duck or Baidu (yes, I’ll keep calling this company by value) codecs?

Arguments for Duck theory:

  1. it was released in 2008, before acquisition (which happened in 2010);
  2. it can be seen as an improvement of VP7, which is definitely a Duck codec;
  3. its documentation is as lacking as for the previous codecs.

Arguments for Baidu theory:

  1. it became famous after the company was bought and the codec was open-sourced;
  2. as a follow-up from the previous item, there is an open-source library for decoding and encoding it (I think the previous source dump had an encoder just for TMRT and maybe it was an oversight);
  3. it has its own ecosystem (all previous codecs were stored in AVI, this one uses WebMKV);
  4. I don’t have to implement it in NihAV (because I wanted nihav_duck crate to contain decoders for all Duck formats and if VP8 is not really a Duck codec I don’t have to do anything).

So, what do you think?

About upcoming AV2…

Friday, August 6th, 2021

So today I’ve seen an article titled AV2 Video Codec — Early Performance Evaluation of the Research which of course has drawn my attention.

Fun things are that it is a sponsored article and that it’s written by three engineers from ViCueSoft. This is strange, but so far it still looks more promising than the original AV1 feature review article with over 20 authors and too much marketing in it (my review of it is here; and to be fair it was followed by more serious paper with less authors but this one exists as well). Anyway, let’s see what is presented here.

I don’t care about the performance much so I just quote the phrase from the conclusion: “…rough approximation shows only 1.2x times encoding complexity increase and 1.4x time decoding”. I find the increase in decoding complexity being larger than the increase of encoding complexity a bit strange, normally you’d expect encoding difficulty rising faster because of the nature of the coding approach in modern codecs (normally an encoder needs to search for the best combination of encoding tools and their parameters and then apply the same steps as decoder does in order to have a coded frame in the same state as decoder would have it). Let’s look at the features then, it’s the most interesting part to me anyway.

  • distant weighted compound mode and dual interpolation filter are removed;
  • semi-decoupled partitioning is introduced—this feature allows splitting luma and chroma blocks and code their contents independently under certain level. The paper also says there’s Dual Tree feature in VVC that does the same;
  • quantiser step overhaul—instead of six tables in AV1 now you have just one simple formula for all quantiser step;
  • extending motion sample selection to work with compound blocks as well;
  • more partitioning modes to be more like HEVC;
  • multiple reference line selection for intra prediction—allows you to select not just neighbouring row/column for directional intra prediction. The same tool exists in VVC. And it also reminds me of X8 frames in WMV2/WMV9, that is the first case of intra prediction using more than one line known to me;
  • offset-based intra prediction refinement—adding some offset to the top/left intra predicted edge of the block to make it even smoother (the offset is calculated from the neighbouring blocks as well);
  • intra secondary transform—this tool tries to improve compression by applying a special secondary transform to the low-frequency coefficients. VVC has low-frequency non separable transform doing the same;
  • simplifications in intra mode signalling;
  • some improvements in motion prediction coding;
  • cross-component sample offset—another chroma-from-luma tool: for the whole CTU between deblocking and CDEF stages a DC offset is calculated from the luma values and applied to chroma values.

Essentially there are three kinds of improvements: simplification or generalisation of the existing feature (including complete removal of it—I approve either), picking the tool used by VVC/H.266 (that approach works but lacks originality) and an occasional improvement of an existing tool (too few and not too original). Of course nobody knows when AV2 will be declared finished and some things will surely have changed by then, but I don’t expect radical changes.

Once I said that I’ll review H.266 when AV2 is released but these guys has essentially done my work instead of me. Thanks!

Why codecs are designed like this and why they are not very interchangeable

Monday, August 2nd, 2021

Sometimes I have to explain the role of various codecs and why it’s pointless in most cases to adapt compression tricks from image codecs to audio codecs (and vice versa) and even from lossy to lossless codecs in the same content. If you understand that already then you’ll find no new information here.

Yours truly
Captain Obvious

Rust needs proper stand-alone assembler support

Tuesday, July 27th, 2021

Back when I gave my arguments I why don’t consider Rust a mature language, one of those arguments was that is lacks proper assembler support and systems programming language requires it since some of the tasks you need to perform (including optimisation) require as low level access as you can get. Here I would like to argue why asm!{} may be enough for most cases it’s definitely not for mine.

Looking for formats to look at

Wednesday, July 21st, 2021

As I mentioned in one of the previous posts, I’ve achieved all the goals for NihAV that I initially set except for trying to write a proper encoder (and no, world domination has never been on my list). Unfortunately this will be not so easy thing to do and I’d like to have a distraction time from time.

Usually I distracted myself with reverse-engineering some format and maybe implementing decoding support for it in NihAV but recently I realized that I ran out of low-hanging fruit. There should be interesting codecs and game containers out there still waiting for their chance but I could not remember anything. I even went through ScummVM code and documented video formats from there in The Wiki, that’s how bored I was.

So I’d be grateful if somebody can point me out to a thing to RE. Last time when Peter drew my attention to VGM/XVD it turned out to be a very fulfilling experience.

Here’s a ship being loaded

Thursday, July 8th, 2021

Since I admitted myself that I feel old, I guess it’s my solemn duty to yell at clouds time from time.

I consider satire to be the most realistic depiction of the world (unless it’s a pasquinade) and as Shakespeare put a phrase in a mouth of one of the King Lear characters, jesters do oft prove prophets.

For example, I still like to re-read various satirical pieces by Ilf and Petrov (probably the best Soviet satirists in the first half of XXth century) and one of those pieces gave a title to this post because of its similarity to what I want to talk about.

That short story mentions a game “loading a ship” (here’s the title) where a group of bored people tries to “load a ship” by calling things starting with the same pre-defined letter—lamps, Lilliputians, locomotives, liquors etc etc—until people can’t recall any more things starting with that letter or somebody suggest a name and people start to argue if that should be allowed. The story itself is about one man who metaphorically loaded a ship by inventing new and new social activities until at some point it was discovered that nobody remembers what are his direct duties should be and he was fired.

And here we transition to the Linux Foundation. Previously I thought it should promote Linux adoption, make some Linux-related standards and pay money to main Linux developers and maintainers so they can work on it full-time. But news in last couple of months showed me I was wrong.

First there was a piece of news about AgStack (Linux for Agriculture). Then there was a piece of news about Linux Foundation organising some unrelated initiative for Microsoft (while not participating in it itself). And finally there’s a piece of news about Open Voice Network, an initiative for ethical standards of voice assistants.

This made me wonder not just why I encounter such news but also what does the foundation really do. The answer did not make me more optimistic.

So it seems that Linux Foundation is transitioning from a foundation that does what I expected it to do to an organisation that does IaaS (initiative-as-a-service): you pay them and they prepare all required infrastructure to have it all running, just invite some organisations to participate. Beside that what are those member companies are paying for? I don’t know the official explanation but to me it looks like three major categories to put it very bluntly and impolitely: bribes, extortion money and indulgences. Bribes is what you pay to affect the politics of the foundation in a way favourable to you (maybe it’s called lobbying, maybe such things do not happen at all—give me facts and I’ll amend this post). Extortion money is what you have to pay to participate in various standardising activity (aka membership fees, no different from any other standardising activity). Indulgences are money you pay to avert attention from your GPL violations regarding the kernel sources (just search for GPL violators and Linux Foundation, you’ll find not just Chinese but American companies mentioned there; at least in one case the known GPL violator hasn’t published its modified kernel sources after becoming a member).

I could not find any reports about the foundation beside their 2020 annual report so I can only speculate how it has changed in the past regarding income, spending and stuff. Yet I can foresee three scenarios on how it may develop in the future:

  1. Shifting to a different activity. Even now actual Linux-related things seem to be less than a half of what the foundation does nowadays, so maybe in the future it will realise that Linux is a legacy they don’t care any longer and maybe even rename themselves to reflect their new main occupation;
  2. Fossilising. Linux Foundation may exist in the future but both its activity and being a member will become a tradition that nobody understands but they’ll keep doing it because it’s a tradition;
  3. Withering. While Linux is relatively popular kernel, nobody can guarantee it will remain popular in the future. Linus himself is not immortal and his successor may be not as skilled, we have IBM trying to replace Linux kernel with systemd while keeping the name—or maybe some other reason will hurt Linux popularity. Or some other kernel will replace it in its major niches (I can easily imagine Android being rebased on Fuchsia; the same may happen to servers as well). In result Linux and Linux-related things will become not interesting to most companies and they’ll drop their financial support.

You may say that there might be a scenario where Linux Foundation will concentrate on Linux-only stuff but I’m sceptical. AgStack is a clear signal they run out of ideas where to promote Linux but in accordance with Parkinson’s law they’ll still keep growing by inventing new stuff as long as there’s enough income to employ more people, organise more conferences and such.

As usual, I’ll be happy to be proved wrong.

On the Origin of Bloatware

Friday, June 11th, 2021

This is inspired by both a private discussion on why modern computing is so complex and my migration from Ubuntu 12.04LTS to systemd 20.04LTS.

Since I’ve finally changed from my less than ten years old operating system to something more modern I’ve noticed that it became noticeably slower (not irritatingly slower though but slower nevertheless) except for Firefox (which is probably not because of JS engine improvements but rather because of native execution of now supported APIs instead of polyfills). And trying various desktop environments before settling on Cinnamon I’m horrified by how bloated and unusable (to me) they are. My friends complain about modern technology demanding more effort to maintain because of complexity and weird interdependencies—while it’s supposed to make your life easier. So why it is like that?

For a keen reader the title of this post contains the answer. For the rest I’ll elaborate it below.

Why I still like C and strongly dislike C++

Wednesday, May 26th, 2021

This comes up in my conversations surprisingly often so I thought it’s worth to write my thoughts down instead of repeating them again and again.

As it is common with C programmers, C was not my first nor my last language, but I still like it and when I have to write programs I do it in C. Meanwhile I try to be aware of modern (and not so modern) programming languages and their trends and write my own multimedia-related hobby project in Rust. So why I have not moved to anything else yet and how C++ comes to all this?

ZMBV support in NihAV and deflate format fun

Saturday, May 22nd, 2021

As I said in the previous post, I wanted to add ZMBV support to NihAV, mostly because it is rather simple codec (which means I can write a decoder and an encoder for it without spending too much time), it’s lossless and supports various bit-depths too (which means I can encode various content into it preserving the original format).

I still had to improve my deflate support (both decompressing and compressing) a bit to support the way the data is stored there. At least now I mostly understand what various flags are for.

First of all, by itself deflate format specifies just a bitstream split into blocks of data that may contain any amount of coded data. And these blocks start at the next bit after the previous block has ended, no byte aligning except by chance or after a copy block (which aligns bitstream before storing length and block contents).

Then, there is raw format used in various formats (like Zip or gzip) and there’s zlib format used for most cases data is stored as part of some other format (that means you have two initial bytes like 0x78 0x5E and 2×2 bytes of checksum in the end).

So, ZMBV uses unterminated stream format: first frame contains zlib header plus one or several blocks of data padded with an empty copy block to the byte limit, next frame contains continuation of that stream (also one or more blocks padded to the byte boundary) and so on. This is obviously done so you can decode frames one after another and still exploit the redundancy from the previously coded frame data if you’re lucky.

Normally you would start decoding data and keep decoding it until the final block (there’s a flag in block header for that) has been decoded—or error out earlier for insufficient data. In this case though we need to decode data block, check if we are at the end of input data and then return the decoded data. Similarly during data compression we need to encode all current data and pad output stream to the byte boundary if needed.

This is not hard or particularly tricky but it demonstrates that deflated data can be stored in different ways. At least now I really understand what that Z_SYNC_FLUSH flag is for.