BTC is bullshit

September 4th, 2023

Don’t get me wrong, the idea behind it is sound but it got overhyped and misapplied. Of course I’m talking about one of the early attempts at lossy image compression called block truncation coding, what did you think I meant?

For those who forgot about it (and are too lazy to read its description), the method replaces the values in a block with two values, one below and one above the block mean, derived from that mean value, the standard deviation and the number of values that were above the mean. This algorithm is often quoted as being the one on which many video codecs (especially the ones used in various games but some standard ones like Micro$oft Video 1 and A**le Graphics (SMC) and Video (RPZA) as well) are built. And that part is bullshit which many used to believe, mostly because nobody who heard it took any time to evaluate that statement (I was no exception).
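To make the description above concrete, here is a minimal sketch of classic two-level BTC for one block of greyscale values. The function names are made up for illustration, and it uses the usual textbook formulas that preserve the block mean and standard deviation; note the square roots it needs.

```python
import math

def btc_block(values):
    """Two-level BTC for one block of scalar (greyscale) values."""
    m = len(values)
    mean = sum(values) / m
    # population standard deviation of the block
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / m)
    mask = [v > mean for v in values]
    q = sum(mask)  # number of values above the mean
    if q in (0, m):
        # flat block: a single level is enough
        return mask, mean, mean
    low = mean - sigma * math.sqrt(q / (m - q))
    high = mean + sigma * math.sqrt((m - q) / q)
    return mask, low, high

def btc_decode(mask, low, high):
    """Rebuild the block from the bit mask and the two levels."""
    return [high if bit else low for bit in mask]
```

Everything here works on one scalar per pixel, which is exactly why it does not map directly onto picking two colours for an RGB block.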

Actually I should’ve started doubting it much earlier, as I’d tried to apply it to colour quantisation (like in a Video 1 encoder) and failed. The method is applicable to scalar values only (and pixels are vectors of three components in our case; you can map them to greyscale but how would you calculate two distinct colours to segment the block into?) and its results are worse than using the Linde-Buzo-Gray method for vector quantisation (which was presented in a paper the following year). Wikipedia has an article describing a proper image compression algorithm proposed in 1986 called color cell compression that definitely looks like the perfect candidate for all the following codecs: it describes compressing an image by splitting it into 4×4 tiles, grouping pixels in those tiles using average luminance as the discriminator and calculating two colours to paint the tile as averages of those two groups. That’s how vector quantisation works and unlike BTC it does not require calculating square roots and can be implemented trivially using integer maths only. So it’s practical, gives better results (in terms of MSE of greyscale images when compared to BTC) and works on actual pixel data.
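For comparison, the tile step of colour cell compression as described above can be sketched with integer maths only. This is an illustration rather than the published algorithm verbatim: the luminance weights and the function names are assumptions of the sketch, and it stops at the two averaged colours without any further palette quantisation.

```python
def luma(pixel):
    r, g, b = pixel
    # integer luminance approximation; the exact weights are an assumption
    return (77 * r + 150 * g + 29 * b) >> 8

def ccc_tile(pixels):
    """Return (mask, colour0, colour1) for one tile of (r, g, b) pixels."""
    lumas = [luma(p) for p in pixels]
    avg = sum(lumas) // len(lumas)
    # split pixels into two groups using average luminance as discriminator
    mask = [y > avg for y in lumas]

    def average(group):
        n = len(group)
        return tuple(sum(p[i] for p in group) // n for i in range(3))

    dark = [p for p, bit in zip(pixels, mask) if not bit]
    bright = [p for p, bit in zip(pixels, mask) if bit]
    colour0 = average(dark)  # never empty: the darkest pixel is always <= avg
    colour1 = average(bright) if bright else colour0
    return mask, colour0, colour1
```

No square roots, no floating point, and the output is two actual colours plus one bit per pixel.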

While BTC was innovative for its time and probably an important stepping stone for further methods, its relevance to the modern compression schemes is minimal (unlike colour cell compression) and calling it the base for the codecs with two-colour vector quantisation is as stupid as calling Cyrano de Bergerac the father of space flight because he mentioned travel to the Moon using gunpowder rockets in a novel of his.

NihAV: adding SGA support

September 2nd, 2023

Since I had nothing better to do this week I decided to finally add Digital Pictures SGA decoding support to NihAV. While there are many different formats described in The Wiki, I’ve decided to play only those not described there (namely $81/$8A, $85, $86 and $89).

In my previous post on this matter I mentioned that the formats I took interest in are using 8×8 tiles that may be subdivided into 8×4 or 4×4 parts and filled with several colours using a predefined pattern (or an arbitrary one for 8×8 tile if requested) plus some bits to select one of two possible colours for each tile pixel. The main difference between $81/$8A scheme and the others is that it codes all data in the same bitstream while the later versions split colours and opcode+pattern bits into two separate partitions (maybe they had plans for compressing it?) plus they store audio data inside the frame.

And here are some notes on the games (I think most of those are PC or Macintosh ports but it’s possible the same files were used in console versions of some of those games as well):

  • Double Switch—this one uses $81 compression (in still images, cutscenes embed them along with $A2 audio in $F1 chunks);
  • Quarterback Attack—this one uses $8A compression in $F9 chunks;
  • Night Trap uses $85 compression and megafiles (i.e. almost all cutscenes are stored in a single NTMOVIE file that requires some external index to access them). Also the PC release had a short documentary about the moral panic around that game (in the same format of course; in two resolutions even);
  • Corpse Killer uses $86 compression and one megafile for all cutscenes;
  • Supreme Warrior uses $89 compression, one megafile and gives no frame dimensions. For most of the cutscenes it’s 256×160 but at the end (logo and maybe something else) it’s different. Additionally there are two audio tracks: some audio chunks contain twice as much data (and have the high bit of the size set), in that case the first half corresponds to English speech and the second half is Chinese; otherwise it’s the same for both versions (e.g. background music, fighters grunting, sound effects and so on).
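The dual-language audio trick in the last item can be sketched roughly like this. The function name is made up and the width of the size field (16 bits here) is purely an assumption for illustration; only the "high bit set means two halves" rule comes from the notes above.

```python
def split_dual_audio(size_field, payload, size_bits=16):
    """Return (english, chinese) audio data for one chunk.

    size_bits is a hypothetical width of the chunk size field.
    """
    flag = 1 << (size_bits - 1)
    if (size_field & flag) == 0:
        # shared audio (music, effects): same data for both versions
        return payload, payload
    half = len(payload) // 2
    # first half is English speech, second half is Chinese
    return payload[:half], payload[half:]
```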

Overall, it was an interesting experience even if I don’t care about the games themselves.

Tell me how you format output and I tell you what programming language you are

September 1st, 2023

Since K&R times it’s traditional to make a programming language print “Hello, world!” as the introduction to it. Recently I had a thought that formatting a number in output tells you the essentials of the programming language itself.

So if you want to print not just a constant string but, say, a number expanded to a certain length (e.g. printf("%4d\n", val);), how can you do it?

Essentially there are three major approaches: using WRITE with optional specifiers for each element, using one format string with arguments to follow and having just basic output with utilities to format elements using some pattern and string concatenation. And there’s C++.

The first approach can be found in the old venerable languages like FORTRAN or ALGOL. It is also somewhat funny that in Pascal its writeln() can’t be implemented in the language itself so the compiler has to convert it to a sequence of formatting and writes or string concatenations (exactly the third way). The same is true for Rust as well (which belongs to the second category) but at least println!() outright tells you that it’s not a normal function but rather a macro or a compiler plug-in.

The second approach, even if not invented by C, got popularised by it. You have a template string with recognizable percent signs to signal where arguments should be substituted and in what form. Of course this is not the safest approach since you may pass a template with not enough arguments and read garbage from the stack (and that’s not counting the dreaded %n). In recent years another template format got popular (probably thanks to Python, with an influence from bash or PHP): instead of a percent sign and type there are braces with optional format arguments or a variable name inside. Speaking of Python, you can use print with the old printf-style or new f-string formatting and it will put a newline at the end by default (reminding once again that Python is all about whitespace). As for the implementation, in some languages it’s a built-in function (i.e. you can’t implement it in the language itself but the compiler/interpreter takes care of it), in C it’s implemented by parsing the stack more or less directly, and in many modern languages a variable-argument function simply passes those arguments as an array of the most generic object types (so you can implement something like printf() yourself). I think I can also mention here a third pattern format, mostly used for formatting binary output: the one used in the Perl pack() function (and in whatever languages borrowed it). Its syntax reminds me of the classic languages from the first category.

The third approach is too primitive so usually languages try to use some syntactic sugar to make it easier to use (see Pascal or Rust mentioned above). I should probably mention INTERCAL where outputting a single character might pose a challenge for the uninitiated but that’s part of the charm of the language.
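Since Python happens to support both the second and the third approach, the printf("%4d\n", val) example can be put side by side in it. This is just a small illustration using standard built-ins; the variable names are arbitrary.

```python
val = 42

# second approach: one template string with arguments to follow
old_style = "%4d" % val            # printf-style percent template
new_style = "{:4d}".format(val)    # brace template
f_string = f"{val:4d}"             # brace template with the variable inside

# third approach: format the element yourself and concatenate strings
by_hand = str(val).rjust(4) + "\n"

# all four start with the same right-aligned field: "  42"
print(old_style)
print(new_style)
print(f_string)
print(by_hand, end="")
```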

And finally C++. In the original version of the language you were supposed to use cout << setw(4) << val << endl; and while it demonstrates the advantages of operator overloading, at the same time it suffers from extreme verbosity and side effects (setw itself applies only to the next item, but most other modifiers like setfill or hex stay in effect for all the following arguments in all following prints until you change them). In modern C++ you should use std::cout, std::setw() and std::endl instead. As a result people mostly use printf() or some third-party formatting library (you may even get it with one of your string class implementations for free).

As you can see, the way you can print a formatted number (both the syntax and the implementation) can tell you a lot about the programming language in a rather short time. Of course there are exceptions but the lack of such functionality tells you at least about the kind of language you're dealing with.

Looking at Tex Murphy video formats

August 30th, 2023

Since I have nothing better to do, I decided to look at the formats used in various Tex Murphy games.

So here they are (the game from this millennium does not count):

  • Mean Streets—it uses raw images in ILBM-inspired format (so that even some chunk names are still the same) grouped with e.g. palette files and packed in ZIP using ancient compression methods (shrink and reduce, I wonder if anything but ancient PKZIP.EXE can unpack them all);
  • Martian Memorandum—the only VID file present looks like mostly raw image data interleaved with some metadata to tell which regions to update (though judging from ScummVM sources it may also employ RLE for sprites);
  • Under a Killing Moon—this one has new format starting with 32-bit size followed by signature “PTF” and after 64-byte header I see the suspiciously familiar chunk structure: 32-bit size, 16-bit signature 0xF1FA that contain some sub-chunks of its own… I refer you to The Multimedia Mike, he should be able to recognize the format;
  • The Pandora Directive—this one uses format that starts with “H2O” and seems to be Huffman-compressed RLE;
  • Tex Murphy: Overseer—this one decided not to be creative and used Smacker files (along with some FLICs).

So, despite the series’ nine-year history (again, only the games from the last millennium), those games kept changing the format used for video segments yet never moved further than RLE.

H.264 decoder postmortem

August 27th, 2023

I’ve mentioned a couple of times before that NihAV has its own functioning H.264 decoder. And after my failed attempts to use hardware-accelerated decoding instead, I spent some time trying to optimise it but eventually gave up. On one hand it’s fast enough for my needs, on the other hand it’s too tedious to optimise it further (even if I can spare time on it, I’d rather not).

To put it into perspective, initially it was about three times slower than the libavcodec one with SIMD optimisations turned off, now it’s only about two times slower (against libavcodec with SIMD turned on it’s about five times as slow, feel free to laugh at me). But at the same time playing 720p content (and I have next to no files with larger resolution) in multi-threading mode takes 20-25% of a core so it’s not that bad.

So how are the cycles wasted and is there potential for serious optimisation?

A moral dilemma

August 22nd, 2023

Disclaimer: the question presented in this post does not affect me in any way but it’s still perplexing enough to make it public. Also please note I don’t mention names (neither of people nor software) as this post is not about shaming them.

Like any other area of human activity, multimedia has its own share of, ahem, eccentric people who are a bit too obsessed with their projects. For example, there’s a guy who constantly informs the world about even the slightest advancements of his image codec with his own unique image quality metric. Theoretically it should be interesting but the author keeps ignoring the useful advice he gets (like making his code work with a different image size or explaining how his metric is better than the others) in hope that somebody else will get interested enough to fix that for him, as his lack of free time prevents him from doing anything but minor improvements to the code. Maybe it’s not the wisest approach but it does not harm anybody so good luck to him, maybe one day there will be a breakthrough and it will get at least limited fame. But there’s another example that came to my attention recently, which is significantly more disturbing.

So there’s a certain codec that has niche popularity for its speed and decent compression ratio. Since it was proprietary and somewhat popular, a certain person (not me) reverse-engineered it and added decoding support for the format to FFmpeg. The reaction from the creator was rather baffling at the time; it was like he felt the control of that codec had been wrested from him. Oh well, enough time has passed with no other issues arising. But last week the same person who REd the decoder announced that he’s working on an opensource encoder for it, and that’s when the situation exploded. The format creator, in a tone that I think is called passive-aggressive, said that it’s essentially stolen work and that it made him stop working on a new version of his work. And what is significantly worse and greatly disturbing, his words sound like he has fallen into depression over it or even has suicidal thoughts. Even while I have reasons to believe that the encoder in question is going to be an original work (i.e. not plagiarism; REing a format to ensure compatibility is also permitted by the law in many countries) the possible consequences are still deeply disturbing, to say the least.

Thus several questions arise: what should be the best course of action to resolve situations like this one? Was an opensource implementation even for a decoder a mistake and should it be removed entirely? Should the author better communicate his wishes that there should be no alternative implementations whatsoever in the first place and should the others honour it if the product becomes too popular for its own good? Even if the law permits it, what about the morals?

I can only be happy for the fact that I’m not involved in it at all. In either case it would be nice to know the answers, and even nicer if they never turn out to be useful.

P.S. Corporations are not people so do not try to project the situation to them. And if they feel offended their lawyers will tell you that (so far I think only N*llym*ser tried it).

P.P.S. If you think I also suffer from similar psychological issues, maybe you’re right; I’m not eligible to judge. At least I do not try to force my stuff onto others, I don’t even post anything at public places except in this very easy to ignore blog.

The original CfL codec

August 17th, 2023

As most of you don’t know and don’t care, modern advanced video codecs may use the special prediction mode called “chroma from luma” where, as it’s obvious from the name, the chroma components are reconstructed from the luma using some coefficients. And what do you know, I’ve found a codec that used this approach back in 1997.

So there’s a French company called Kalisto Entertainment and back in the day it developed a codec for the cutscenes in some of its games (at least Dark Earth and Nightmare Creatures). 15-bit RGB video is split into three components and each is coded separately using a simple LZ77-like method (i.e. it’s either RLE mode, or copy with an offset from the current or previous frame). The twist comes from the fact that those components are split into tiles (usually 20×20 ones) and each tile has a coding mode and two sets of scale/offset coefficients, so for each tile one of the RGB components is selected as the base one and the two others are coded as the differences from the scaled (and offset) base value.

So one component plane contains the base components for each tile (which may be different for each tile) and the other two contain the differences for the predicted non-base components (which, at least in theory, should be mostly zeroes and thus better compressible). So when some people wonder if it’s time for video codecs to perform optimal component decorrelation on per-frame basis, here’s a practical codec from the last century that did it per-tile.
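The per-tile prediction described above can be sketched as follows. This is only an illustration of the idea: the fixed-point precision (shift), the wrap-around to 5 bits and all the names are assumptions, not the actual bitstream details of the codec.

```python
MASK = 0x1F  # 5-bit components of 15-bit RGB

def predict(base, scale, offset, shift=4):
    # scaled (and offset) base value; fixed-point precision is an assumption
    return (((base * scale) >> shift) + offset) & MASK

def encode_tile(pixels, base_idx, coefs):
    """Split a tile into a base plane and two difference planes."""
    others = [c for c in range(3) if c != base_idx]
    base_plane, diffs = [], ([], [])
    for px in pixels:
        base = px[base_idx]
        base_plane.append(base)
        for plane, comp, (scale, offset) in zip(diffs, others, coefs):
            plane.append((px[comp] - predict(base, scale, offset)) & MASK)
    return base_plane, diffs

def decode_tile(base_plane, diffs, base_idx, coefs):
    """Rebuild (r, g, b) pixels from the base plane and the differences."""
    others = [c for c in range(3) if c != base_idx]
    pixels = []
    for i, base in enumerate(base_plane):
        px = [0, 0, 0]
        px[base_idx] = base
        for plane, comp, (scale, offset) in zip(diffs, others, coefs):
            px[comp] = (predict(base, scale, offset) + plane[i]) & MASK
        pixels.append(tuple(px))
    return pixels
```

When the coefficients fit the tile well, both difference planes come out as runs of zeroes, which is what makes them cheap for the RLE/LZ77-like coder that follows.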

On good russians

August 7th, 2023

As I mentioned in one of my previous posts on this topic, I consider russians to be rather a viral mindset than a nation: they don’t have clearly defined territory (as they consider they whole world to be it), they have no own culture (it’s either stolen or for-display set pieces that get no relation or acceptance with russians; well, you can argue that their widespread prison subculture is their own but how does that make things any better?), they lack human qualities like compassion required in this age (i.e. a millennium or two ago russians would be no different from many other nations but the times have changed), and the worst of all—they try to convert everything they come in contact with into russia (either by conquering the territories and committing cultural and actual genocide or by demanding that the other countries do everything their way because it’s too hard for russians to learn other country language and customs). And yet there are naïve people believing that there are “good” russians even that’s an oxymoron. Usually that comes from a belief that if russian say they’re against war and current government that implies they’re against other things as well. Here I’ll try to stratify russians by their empathy and activity:

  • plankton—like the namesake those russians have almost no will of their own and are merely flowing with the currents. If you ask them about their position, it will be more “for all good and against all bad” and they never take interest in politics. They’re always supporting the government but when it changes they’ll support the new government equally half-heartedly (just remember what happened in Rostov during the laughable coup attempt). The sad thing is that they’ll readily resort to violence if they’re permitted by the authorities. They have no compassion (there’s enough evidence how they’ll cheer to the war crimes their army commits and to the spectacle of the same war criminals dying as long as it’s a good show). Slave-owners and dictators may call them good but since I’m neither I can’t;
  • moths—those are almost the same as the previous category but they can have their own opinion not fully in line with the mainstream one and even—gasp—be against the government. The only problem is that they don’t act on their words, saying that they’re mere moths and can’t do anything. Of course they’ve never tried to find it out if they can actually do anything or not. For those I have only mild despise because they don’t deserve a strong emotion;
  • sell-outs—not originally russians but who became ones usually by being seduced with russian money. You might’ve heard about russian actors like Steven Seagal or Gérard Depardieu. There’s nothing wrong with coming to another country in pursuit of better work opportunities. But there’s a difference between selling your skills and selling your dignity and calling such people good is like calling a native advertisement a good article;
  • chameleons—those russians actually have a position, the problem is that it changes depending on circumstances and who asks about it. To give a concrete example, in the first days of February 24th some propagandists from russian television said to their friends that they wear their half-swastika symbols but mean it as a hidden support for Zelensky. Obviously they may voice support for anything but in reality they’re concerned only about their own well-being. Somebody not familiar with that feature may call them good, not knowing that the words they heard were empty, I call them disgusting;
  • white coats—that is a semi-official term for those who put themselves over the others like in the famous scene from Monty Python and the Holy Grail: —Must be a king. —Why? —He hasn’t got shit all over him (I vaguely remember a russian joke about somebody first spraying shit over others and then appearing in a clean white coat, that must be the origin). Such people pose themselves as morally superior to others, impartial judges and so on—forgetting they have no ground for that or trying to proactively evade the possible blame. As with the previous category, somebody naïve enough to believe their words without checking their background may believe they’re good but they should know better;
  • armchair Napoleons—those do not even try to hide their ambitions. In the “worst” case they want russian army to be better equipped and fighting (forgetting that russian traditions like stealing and corruption prevent it), in the “best” case they want russia to conquer all of the world. Or if it all fails they’re fine with russia nuking the rest of the world. You need to be a psychopath to call them good (again, search elsewhere);
  • and finally russians with human faces. Those pretend to be actual humans and often serve as an example of “good russians” in the West. In reality though sooner or later they show their real russian face. They might be against the current government but what are they going to do? There’s a post by some random russian that sums it up the best: “This regime will fall, Navalny will become a president and restore the country. And then we’ll get back at you, Ukrainians!” You can dismiss it as being just a single deviant voice, but in reality prominent figures from russian “opposition” demonstrated the same behaviour and chauvinism as the officials (like spreading the false claims about blown up Kakhovka HPP or not understanding why not everybody would like to interact with russians in general). To me it seems that they try to maintain the usual russian imperialism by keeping up ties with other countries (so when russia has more strength it can come there “in order to protect oppressed russian-speaking citizens”). So they’re about as good as telemarketers are your friends.

If you think this does not apply to somebody specific, try to get the honest answers for the following questions: are you against the current war? are you against the war just because it inconveniences your life (with sanctions and possible partial mobilisation)? do you think only the russian government is responsible for starting the war? do you think that russia should be held responsible for the war crimes it committed (e.g. paying reparations)? do you think that Crimea belongs to russia? do you think that russia should withdraw its forces from all occupied areas (Abkhazia, Armenia, Belarus, Ossetia, Syria) as well? do you think that something substantial should be done about russia in order not to make this scenario repeat again? do you consider the idea of dissolution of russia in order to make national republics acceptable? do you understand that russian writers often followed imperialistic agenda and thus other nations have reasons to ban their works? and finally, do you agree that russians are not superior to other nations? Hint: not all of those yes-no questions have “yes” as the right answer so you need to think before answering them.

russians usually give themselves away by starting to cry that Crimea is russian, always has been and giving it to Ukraine was a historical mistake as the existence of Ukraine itself. On the other hand, if somebody passes the test perfectly then probably you’re not dealing with a russian at all.

Meanwhile the only real good russians are mentioned in reports like this one:

NihAV: giving up on hardware acceleration

August 3rd, 2023

After several attempts to add hardware-accelerated decoding support to NihAV I’m giving up, the reason being the sorry state of it in general.

I’m aware of two major APIs for hardware-accelerated video decoding for Linux, those are VDPAU and VA-API. Plus there are some specific toolkits e.g. from Intel but from what I remember those are even more complicated.

So, VDPAU has only bare-bones documentation with no actual explanation of what is expected for each codec in order to decode it. VA-API turned out to be even worse: it points to 01.org for documentation, which no longer exists (and redirects to some Intel page blurbing how great they are at open source). And web.archive.org shows that the page essentially contained a link to the libva and libva-utils repositories plus some references to the projects that have VA-API support implemented. “…so shut up and go away” was not written but implied.

At least VA-API has three crates implementing its bindings in Rust, and not just one that hasn’t been updated in four years like VDPAU, but how usable are those? There’s FeV that seems to support JPEG decoding only (and has a strict warning against AMD GPUs), there’s libva-sys that is a pile of auto-generated bindings and there’s cros-libva. The latter seems to be the cleanest one and most actively developed (too actively developed for my taste as it changes base APIs every couple of months). Unfortunately it’s still not exactly clear how to use it for H.264 decoding (and the cros-codecs crate provides an equally confusing API). And the final straw is that it seems to be intended for single-thread use only by design, which means it’s not possible to use it with my library (e.g. my player uses separate threads for audio and video decoding, so I can’t use the standard decoder interface for hardware-accelerated decoding without some ugly hacks).

Oh well, I’ll work on improving my own H.264 decoder performance—while it’s not much fun either at least it’s clear what I can do with it and how it can be done.

P.S. This reminds me of the situation with ALSA. From what I heard it’s one of the worst documented subsystems in Linux with too flexible interface, to the point that it took help from ALSA developers to make at least MPlayer support ALSA output. The most probable reason is that it’s common to smoke weed in Czechia (where ALSA was developed), but what is the excuse for the other libraries?

Why I work on NihAV

July 30th, 2023

I started NihAV as a more or less toy project to play with different concepts and try new stuff like finding out how vector quantisation works or attempting to write an encoder. Having enough experience with libavcodec and libavformat, I did not want to touch them again (and still don’t) and there was a hope that rust-av would provide a viable albeit limited alternative for multimedia playback (it still hasn’t). In theory I’ve achieved my original goals: NihAV supports decoding a lot of exotic formats (some of which are not handled by any other open-source project), it even has some encoders and its own transcoder tool, and there are even two players (one for audio files, another one can also play videos). So I could relax and do something else entirely and yet I’m working on adding new features to NihAV that take a lot of effort and do not bring me joy. Why?
