Archive for the ‘Useless Rants’ Category

Democracy and open source

Wednesday, May 7th, 2025

Since I have nothing better to do, I decided to finally write a rant about the parallels between democracy and the open-source movement.

I believe democracy is a flawed form of government—because people are flawed. There are two main problems here: first, people are mostly ignorant and don’t think about the decisions implemented by their representatives; second, people often don’t vote the way they actually want. Yes, there’s a difference here—in the former case people have no idea what their representatives do (nor care about the fine details of the legislation passed) and vote for somebody for such profound reasons as “I voted this way last time” (or even “we’ve always been voting for this party”), “at least it’s not X”, “just for fun”; in the latter case people may know what they want from a candidate and yet keep voting for the wrong person consciously. The reasons in that case may be even sadder: “I want to vote for a winner” (i.e. voting for the candidate who’s most likely to win because that candidate is most likely to win), “that guy controls the main business in our town so we have to vote for him”, “that guy gives out freebies” or “all media claim that’s the best possible candidate ever”. Side note: I’ve heard all the excuses listed here, and not just in the recent news about USian elections. Essentially it boils down to two things: people either lack control (over information, or even over their own income; the mostly forgotten distributism movement had a lot to say about that) or don’t care at all (but still vote for some reason). And it seems to me that the open-source movement is a lot like this.

Initially Free Software (free not as in “free beer” but rather as in Stallman’s speeches) targeted an audience that understood it, namely the programmers and hackers who valued the software freedoms and could exercise all of them (yes, including modifying the source code and compiling the library/program). Nowadays though there are many users who do not actually care about the software they use as long as it serves their needs (and if not, they’ll either look for an alternative or start pestering the developer to fix it for them, instead of doing it themselves). As with democracy, people are so used to its presence that they don’t realise why it matters and don’t care if something happens to it.

The second aspect is the lack of control. I develop software the old way: I make it useful for me and provide the sources for the zero curious people who may do with it whatever the AGPL license permits. But most developers have to play by the rules of infrastructure providers in order to get their software noticed. In an exaggerated form: if your project is not on GitHub and does not have at least a thousand stars, it does not exist.

And if you do get your software popular and included into some kind of distribution or package repository, that means kowtowing to the distribution (or package repository) maintainers. As one of the core libav developers I could observe the interactions between that project and certain Linux distributions. All I can say is that with the recent USian Securing Open Source Software Act and the European Cyber Resilience Act the maintainers essentially get the same treatment as the developers got from them (but at least it’s more formal).

If you wonder why either matters in the first place, the answer is freedom. Without democracy you have no way to affect what’s forced onto you; without open source you have no chance of getting software for your needs instead of the vendor’s needs (which are usually diametrically opposed to yours—like having full control over your hardware, your wallet and all the personal information they can siphon out of you and sell to the highest bidder).

If you wonder what can be done about it, there are two obvious solutions. People may actually start to think about what they’re doing (or choosing) and collect all available information beforehand. The other, not so utterly improbable, solution involves global (thermo)nuclear war—no people means no problems (or at least the survivors will be more occupied with surviving than with competing over who has the latest iPhone model). The chances of the latter are rather good, so let’s do nothing and wait.

My Rust experience after eight years

Saturday, March 29th, 2025

Soon it will be eight years since I (re)started NihAV development in Rust. And for this round anniversary I’d like to present my impressions of and thoughts on the language—from the perspective of its applicability to my experimental multimedia framework.

(more…)

A bit of class theory

Saturday, March 15th, 2025

There is a Ukrainian word “жлоб” (zhlob) of unknown origin, probably coming from Polish żłób via Yiddish (and likely donating that meaning back, since in Ukrainian it means only a person, while in Polish that is not the main meaning and is not connected to it).

It is hard to give a proper translation for that word. It may mean a man as large as an ox, as strong as an ox, and with comparable intellect too. One dictionary claims it’s a synonym for a wealthy peasant who employs others. More often though it means a vulgar person with very limited intellectual needs, and a miser. Kinda like an average person, but without the positive traits and with the deficiencies magnified. I’m pretty sure you can recognise such characters even without a strict definition.

And there’s the living Ukrainian classic Les’ Poderv’iansky, often known simply as The Artist (he started his career as a painter but got much more famous as a writer and poet; he has given us many catchphrases, the best-formulated version of the Ukrainian national idea, and I’ve heard people quote his adaptation of Hamlet in full—people who would not normally read classics or memorise poetry).

Once he formulated the main reason why Karl Marx was wrong: people differ not by class but rather by mindset. So a poor zhlob shares interests with a rich zhlob, and they can understand each other better (despite one being a worker and the other a factory owner) than either would understand, say, a teacher.

Here’s an excerpt from his essay (published in the essay collection “Жлобологія” pp. 245-250 in 2013 if you care):

It would be nice to turn things around so zhlobs can’t get power. Unfortunately that’s utopia.

Sensible people, who are sadly too few, don’t have any influence, so zhlob ideology prospers. The prevalent class is the zhlob, who preserves his values. If this continues, this country will be completely fucked.

That’s because zhlobs are not fit for anything good but like awards nevertheless. […] They require proofs of their own importance. They like various tchotchkes, posts and titles. Their hidden dream is to become counts and dukes.

Of course this was written in Ukraine back in the days before people revolted and threw out a certain zhlob whom other zhlobs had voted in as president, but the principles remain the same and can be applied to other countries. So if you ever wondered why rednecks vote for a hereditary billionaire believing him to be “our guy”, while hating another candidate despite her being “closer” to them—now you know the answer.

NihAV: hardware-accelerated playback revisited

Monday, March 10th, 2025

Recently I made the mistake of upgrading one of my laptops to systemd 24.04. I could talk about various quality-of-life improvements there (like brightness control no longer working, so I have to invoke xrandr instead) but that rant is so useless it does not deserve to be written. What is worth talking about is hardware acceleration. Previously my player on that laptop had rather unreliable playback with VA-API decoding (i.e. it worked, but some videos made it too laggy); now the situation has improved—it’s reliably unusable. Additionally, it seems to consume only slightly less CPU than with software decoding. So I finally looked at ways to speed it up (spoiler alert: and failed).

Conceptually it’s simple: after you decode a VA-API picture you need to obtain its internal parameters with vaExportSurfaceHandle(), create an EGL image using eglCreateImage() (or eglCreateImageKHR()), bind it to an OpenGL texture with glEGLImageTargetTexture2DOES(), and you’re done. It was not so simple in practice though.
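
For the record, here is a minimal Rust sketch of that sequence as I understand it (not the actual NihAV code). It covers only the EGL/GL half, assumes the dma-buf parameters (fd, per-plane DRM fourcc, dimensions, offset, pitch) have already been pulled out of the VADRMPRIMESurfaceDescriptor filled in by vaExportSurfaceHandle(), and uses extern declarations in place of the function pointers a real player would load at run time.

```rust
use std::os::raw::{c_uint, c_void};
use std::ptr;

type EGLDisplay = *mut c_void;
type EGLContext = *mut c_void;
type EGLImage = *mut c_void;
type EGLAttrib = isize;
type GLenum = c_uint;

const EGL_NO_CONTEXT: EGLContext = ptr::null_mut();
const EGL_LINUX_DMA_BUF_EXT: c_uint = 0x3270;
const EGL_HEIGHT: EGLAttrib = 0x3056;
const EGL_WIDTH: EGLAttrib = 0x3057;
const EGL_LINUX_DRM_FOURCC_EXT: EGLAttrib = 0x3271;
const EGL_DMA_BUF_PLANE0_FD_EXT: EGLAttrib = 0x3272;
const EGL_DMA_BUF_PLANE0_OFFSET_EXT: EGLAttrib = 0x3273;
const EGL_DMA_BUF_PLANE0_PITCH_EXT: EGLAttrib = 0x3274;
const EGL_NONE: EGLAttrib = 0x3038;
const GL_TEXTURE_2D: GLenum = 0x0DE1;

extern "C" {
    // Stand-ins for pointers obtained via eglGetProcAddress()/SDL_GL_GetProcAddress().
    fn eglCreateImage(dpy: EGLDisplay, ctx: EGLContext, target: c_uint,
                      buffer: *mut c_void, attrib_list: *const EGLAttrib) -> EGLImage;
    fn glEGLImageTargetTexture2DOES(target: GLenum, image: *mut c_void);
}

/// Wraps one exported plane of a decoded VA-API surface into the currently
/// bound GL_TEXTURE_2D; returns the EGLImage so it can be destroyed later.
unsafe fn import_plane(dpy: EGLDisplay, fd: i32, drm_fourcc: u32,
                       width: i32, height: i32,
                       offset: u32, pitch: u32) -> EGLImage {
    let attribs: [EGLAttrib; 13] = [
        EGL_WIDTH,                     width as EGLAttrib,
        EGL_HEIGHT,                    height as EGLAttrib,
        EGL_LINUX_DRM_FOURCC_EXT,      drm_fourcc as EGLAttrib,
        EGL_DMA_BUF_PLANE0_FD_EXT,     fd as EGLAttrib,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, offset as EGLAttrib,
        EGL_DMA_BUF_PLANE0_PITCH_EXT,  pitch as EGLAttrib,
        EGL_NONE,
    ];
    // For dma-buf import the context must be EGL_NO_CONTEXT and the client
    // buffer NULL; the buffer itself is described entirely by the attributes.
    let image = eglCreateImage(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
                               ptr::null_mut(), attribs.as_ptr());
    assert!(!image.is_null(), "eglCreateImage() failed");
    // Attach the image as the storage of the texture currently bound to
    // GL_TEXTURE_2D (no pixel copy is performed).
    glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, image);
    image
}
```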

Exporting the descriptors is easy—I just modified my fork of the VA-API wrapper to do that and it even seemed to produce correct output. The problems started with the OpenGL integration. My player uses SDL2, so I spent a lot of time trying to make it work. First of all, it’s not clear how to obtain a proper OpenGL context for the calls; then there’s the problem of it being finicky and not liking multi-threaded execution. And of course you have to load all the functions mentioned above manually (because SDL2 offers only a small subset of all possible OpenGL functions—not surprising, considering how many of them are optionally supported extensions or missing in certain profiles).
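
In case it helps anybody, here is roughly how that manual loading looks with the sdl2 crate; this is a generic sketch under those assumptions, not my player’s code.

```rust
use std::ffi::c_void;

// Signature of the extension entry point we want to fetch by hand.
type GlEglImageTargetTexture2DOes = unsafe extern "C" fn(target: u32, image: *mut c_void);

fn main() {
    let sdl = sdl2::init().expect("SDL init failed");
    let video = sdl.video().expect("no video subsystem");
    let window = video.window("test", 640, 480)
        .opengl()
        .build()
        .expect("cannot create window");
    // The context must stay alive for as long as GL calls are being made.
    let _gl_ctx = window.gl_create_context().expect("cannot create GL context");

    // SDL2 only hands out raw pointers; casting them to the right type is on you.
    let ptr = video.gl_get_proc_address("glEGLImageTargetTexture2DOES");
    assert!(!ptr.is_null(), "extension entry point not found");
    let _gl_egl_image_target_texture2d_oes: GlEglImageTargetTexture2DOes =
        unsafe { std::mem::transmute(ptr) };
    // ...the rest of the interop calls would follow here.
}
```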

Anyway, I threw away most of my player functionality, leaving just loading an input file, decoding it and trying to display only the first frame. It ended with a segfault.

It is probably the fault of the (poorly documented) SDL2 wrapper, which apparently cannot provide a sane OpenGL context. So a call to eglGetCurrentDisplay() returns either NULL or a pointer that looks like a small negative value; the same happens with eglCreateImage() (fun thing: eglGetError() returns the same value, -424 if somebody is curious); and at the glEGLImageTargetTexture2DOES() call it finally segfaults.

At that point I actually tried searching for alternative crates that would allow me to create an OpenGL window and call those functions—and found none fitting the purpose. They all either provide a rather limited OpenGL subset which is enough for drawing a triangle with shaders (and probably expect you to locate and load the library with the additional functions by yourself) or simply provide OpenGL bindings, leaving even window creation to you.

In other words, not my kind of fun. I’ll simply disable hardware acceleration for all cases there until I find a better alternative.

P.S. Vulkan is not really an option either. My hardware is too old for the ANV driver (and for the HASVK driver too).

A bit on USA

Sunday, February 2nd, 2025

As usual, I don’t have to say anything myself when it has already been put into words much better by somebody else.

At the banquet, last winter, of that organization which calls itself the Ends of the Earth Club, the chairman, a retired regular army officer of high grade, proclaimed in a loud voice, and with fervency,

“We are of the Anglo-Saxon race, and when the Anglo-Saxon wants a thing he just takes it.”

That utterance was applauded to the echo. There were perhaps seventy-five civilians present and twenty-five military and naval men. It took those people nearly two minutes to work off their stormy admiration of that great sentiment; and meanwhile the inspired prophet who had discharged it—from his liver, or his intestines, or his esophagus, or wherever he had bred it—stood there glowing and beaming and smiling, and issuing rays of happiness from every pore—rays that were so intense that they were visible, and made him look like the old-time picture in the almanac of the man who stands discharging signs of the zodiac in every direction, and so absorbed in happiness, so steeped in happiness, that he smiles and smiles, and has plainly forgotten that he is painfully and dangerously ruptured and exposed amidships, and needs sewing up right away.

The soldier man’s great utterance, interpreted by the expression which he put into it, meant, in plain English—

“The English and the Americans are thieves, highwaymen, pirates, and we are proud to be of the combination.”

[…]

The initial welcome of that strange sentiment was not an unwary betrayal, to be repented of upon reflection; and this was shown by the fact that whenever, during the rest of the evening, a speaker found that he was becoming uninteresting and wearisome, he only needed to inject that great Anglo-Saxon moral into the midst of his platitudes to start up that glad storm again. After all, it was only the human race on exhibition. It has always been a peculiarity of the human race that it keeps two sets of morals in stock—the private and real, and the public and artificial.

And here’s a link to the full writing in case you haven’t read it already.

Professional metric benders

Tuesday, January 28th, 2025

Today on “things that Kostya cannot change so he rants about them instead”: something different from the usual rants about politics or open-source politics.

There are several groups of people whose occupation is (in theory) to evaluate certain things. So (again, in theory) you can call them the metric for those things. In practice though they rather do the opposite and try to make things conform to the valuations they give, or at least to make the public perceive those things in a way that confirms the original claims (and truth be damned!).

Of course some would see nothing wrong with that, while others would even try to tell you that they’re always right because they cannot be wrong and thus only their opinion is the true one. Well, I’ll present three examples and you can see for yourself.

Let’s start with the most prominent example, namely lawyers. In an ideal world, lawyers are a part of the judicial process, making sure that the side they support is represented fairly—and that means that judging is done according to the laws, without glaring mistakes or prejudices. In practice though lawyers tend to get associated with paid justice, meaning that quite often the outcome of a trial or litigation depends on the pay grades of the lawyers involved instead of the actual known facts (or even laws). Which sometimes leads to fun systems such as the British one with two mandatory kinds of lawyers (barristers and solicitors) and the USian one—resembling quantum dynamics—where you could call lawyers the elementary particles responsible for any interaction between entities (except that quantum dynamics is easier to comprehend).

Then there’s another often disliked group of people called philosophers. In theory philosophy is a way to explain the world or some of its aspects. So one would expect a philosopher to be a thinker who studies the world (or a part of it) and draws some conclusions about how it works and what implications that has for everything else. For instance, science may study human morals as a thing emerging in collectives and affecting interactions between members of those collectives, while philosophy may ponder how morality defines the human itself and what should be considered the ideal moral. But modern philosophers seem to work in reverse: first they start with a conviction (quite often a petty one that benefits them directly) and work up from there to build a system that provides an excuse for their beliefs. Of course this is unlikely to be a modern trend, but history has preserved enough examples of real philosophers for any epoch and from different countries as well—which is hard to say about the modern world.

And finally, art critics. One would naïvely expect them to be people with certain tastes who appraise certain kinds of art (paintings, sculptures, books, movies, video games etc etc) and tell the public their opinion about it. You may like them or not, agree with them or not, but in either case such reviews should not merely give an abstract score but also provide an explanation of what was done right, what could be improved, and what hidden qualities may make the work even better than your first impression suggested. There’s a reason why people still remember and quote Roger Ebert (of the Chicago Sun-Times) or Scorpia (of Computer Gaming World). But the majority of modern art critics seem to start from the premise of having to praise the reviewed product (often, apparently, out of fear for their salary and other benefits—disappointed owners of a badly-reviewed product may stop advertising in your media or stop providing early access to the next products they release, and so on) and construct the review leading to that goal without mentioning the actual reasons. What makes it worse is that often it’s accompanied not by the notion “if you love X and Y then this is definitely a thing for you, otherwise you may want to skip it” but rather “if you don’t like it you’re a dumb bad person”. Thanks, I still remember a bit of the Soviet Union—enough to reject that at a visceral level.

So there you have it. Of course this effect is nothing new, but I felt that for some reason I needed to say it, so here it is.

Call for a new container format!

Friday, January 17th, 2025

Sometimes I remember that Matroska exists, and today I also remembered how it came into existence. Its author proudly admits mixing all the buzzwords of the Web 2.0 era like XML, Semantic Web etc etc and coming up with that format. Since we’re in the Web 3.0 era, we should have something more modern.

That is why I’m calling for a modern multimedia container format to supplant the outdated formats of old. It should encompass all the features that make the modern Web great:

  • binary JSON as the backbone of the format;
  • central repository for the downloadable descriptions of the parts of the format (but not the codecs themselves! Think of it as the MXF specification if it helps);
  • blockchain (as well as clusterchain and framechain);
  • metaverse integration;
  • decentralised storage (so that the container may refer to some data in the cloud as well as on the local disk; even MOV could do something like this);
  • and of course AI!

Some of you may ask where AI can be applied in this scenario. The answer is obvious—transforming input data for better compression (let alone generating metadata or enabling better integration with other Web 3.0 products). A good model should be able to achieve the same savings as Matroska does by e.g. shaving common header bytes off each frame, but without a special mapping. An excellent model may generate the content from the embedded description instead of transmitting AV2 video. And of course the central repository will contain the descriptions of models and parameters to be used (in addition to the descriptions of better representations of the container parts layout). The possibilities are limitless!

Proposals should be sent to the Alliance for Open Media, I have worse things to deal with.

On the sorry state of opensource multimedia

Wednesday, December 25th, 2024

I’ve been wanting to write this post for a long time, with a focus on the difference between a hobby project and a product, and about NihAV only. But a recent FFdrama made me re-think both the structure and the conclusions.

Apparently there’s another surge of developers’ discontent in jbmpeg over getting the mushroom treatment (not for the first time and probably not for the last). IMO they need to realise the project is as free and democratic as the Soviet Union, and you simply need to agree to the things proposed by the General Secretary (definitely not the leader)—that would save the time and nerves of everybody involved. As I wrote countless times before, I do not fear for the future of that project, as it can keep up such an existence indefinitely, and here I’ll try to present my reasons why.

First of all, a revolution à la libav won’t work—Michael has learned the lesson and he won’t be kicked out again (not that it really worked in 2011, but now there are no chances of that at all).

Second, if you split off and form an alternative, it does not have many chances to replace the original. And if you decide to write anything from scratch, your chances are next to zero. The rest of this post is dedicated to answering why.

Recently I re-read The Mythical Man-Month, which tells not only about the author’s experience designing RedHat OS/360 but also presents more general observations and ideas. And right at the beginning he talks about the difference between a program, a programming product, and a programming systems product. Essentially, a program is something a programmer writes that works for him on his system; a programming product is a program with documentation and support; and a programming systems product is one that works as a component in a larger system. And moving from one stage to another requires an effort several times larger than the previous one (I’m simplifying a lot and probably misremembering something—so you’d better read the original book, it’s a worthwhile read anyway).

Here we have a similar situation: writing a tool just to do things for you is straightforward—even I have managed to do it with NihAV; making it into a product requires offering much wider support for different platform configurations (for example, my videoplayer has VA-API hardware decoding enabled by default, but it’s not available, say, on Windows, and you need to switch that feature off there in order to build it) and different features (e.g. nihav-encoder works for testing encoding per se, but lacks the ability to encode input into a good intermediate format supported by other players and encoders). And it gets even worse if you try to make it into a library ready to be used by others—besides the usual things like documentation you’re expected to guarantee some API stability and a certain level of quality. So while I may not care that my app panics/crashes in certain circumstances, it’s significantly less forgivable for a library. And of course achieving such a quality level requires a lot of unexciting work on small details. Debugging is even worse.
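
To give an idea of the kind of configuration work I mean, the hardware decoding path ends up wrapped in something like the following (the “vaapi” feature name and the module split are hypothetical, not NihAV’s actual layout), and every such switch is one more thing to document, test and support.

```rust
// Hypothetical sketch of platform-dependent feature gating; the feature name
// and module layout are illustrative only.
#[cfg(feature = "vaapi")]
mod hwdec {
    /// Try to decode a frame with VA-API; the caller falls back to the
    /// software decoder when this returns None.
    pub fn decode_frame(_packet: &[u8]) -> Option<Vec<u8>> {
        // ...calls into the VA-API wrapper would live here...
        None
    }
}

#[cfg(not(feature = "vaapi"))]
mod hwdec {
    /// Stub used on platforms without VA-API (e.g. Windows builds).
    pub fn decode_frame(_packet: &[u8]) -> Option<Vec<u8>> {
        None
    }
}
```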

Suppose you decide to create a fork and work from that. You are still in a much worse position—you may have the same codebase, but there are no killer features you can offer and you don’t have the recognition. libav managed to succeed for a while since it was supported by some distribution maintainers—and even then users complained because the de facto brand name was replaced with some unknown thing. And I guesstimate that 40% of current jbmpeg developers contribute to it in order to upstream the changes they make while using it in their employer’s product or pipeline. So how can you convince those companies to use your fork instead? And that’s not taking the patent situation into account, which makes substantial support from any large company for your project rather improbable.

Good thing I never intended NihAV to be a competitor, but what about other projects? rust-av died because of lack of interest (Luca claims that he started it mostly to learn Rust and see how performant it can get—mission accomplished, no further development required). librempeg fares better, but I doubt that Paul wants to deal with all the demands that other parties make for the honour of his stuff being included into their distribution (or being used without even a credit).

Another thing that needs to be mentioned is that multimedia is no longer an attractive field. Back when I started to dabble in it, it was rather exciting: there were many different formats around—in active use as well—and people wanted to play them with something other than the proprietary players. There were libraries and players supporting only a specific subset of formats, like avifile or libquicktime or a DVD-only player. Nowadays it’s usually a combination of H.26x+AAC in MP4 or VP9/AV1+Opus in WebMKV, all formats have specifications (unless you lack the Swiss Francs to pay for the ones from ISO), and new formats are not introduced that often either. Of course, we might have H.267 standardised soon, but who uses even H.266? When was the last time you heard AV2 development news? The codec was supposed to be released a couple of years ago—did I miss it along with AV3? Do you remember the Ghost audio codec from Xiph? Of course Fraunhofer will keep extending the AAC patent lifetime by inventing new formats and naming them something like OMGWTFBBQ-AAC, but who really cares?

That is why I believe that no matter how dysfunctional jbmpeg is, it will keep existing in this undead state indefinitely, as it’s good enough for most users and there’s no compelling reason (e.g. new popular formats or radically different ways to process data) to switch to anything else. The only winning move is not to play.

To NIH or not to NIH

Sunday, December 22nd, 2024

Paul of librempeg fame informs me about his achievements occasionally (and in my turn I try to remind the world from time to time that this project exists and may provide functionality hardly found elsewhere, like various filters or console formats support). His recent work was implementing an inflate routine specifically for multimedia needs. This made me think whether it makes sense to have a custom implementation of a deflate decompressor and packer in a multimedia project when zlib exists—and I think it makes perfect sense. Of course in NihAV I NIHed it because the project concept demands it and it was a nice exercise, but it makes sense in more serious projects too, and below I’ll try to explain the reasons.

Zeroth of all, a quick reminder about flexibility. RFC 1951 merely specifies the format, so implementations can produce different bitstreams that all decompress to the same data. Back when I worked on it I mentioned how you can compress data in different ways, dedicating more time to achieve better compression. And there are more tricks not mentioned there, like parallel compression.

Now, the first reason to have your own implementation is that you can adapt it to your custom needs. As Paul demonstrated, inflating data into a custom format might be beneficial. If all you need in 99% of cases is unpacking data from one buffer into another, or into a frame buffer (which has padding, so you have to output, say, 27 bytes, then skip 5 bytes, output 27 bytes and so on), you can do without all the additional functions and not bother with a sequence of calls to partially inflate data.
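
A rough sketch of what such a specialised output might look like (the names are mine, not librempeg’s API): the inflate core simply calls write() with whatever bytes it has produced and the sink takes care of the line padding.

```rust
/// Scatters a linear stream of decompressed bytes into a frame buffer whose
/// lines are padded (e.g. 27 visible bytes followed by 5 bytes of padding).
struct PaddedFrameSink<'a> {
    buf:    &'a mut [u8],
    width:  usize, // visible bytes per line
    stride: usize, // full line length including padding
    x:      usize,
    y:      usize,
}

impl<'a> PaddedFrameSink<'a> {
    fn new(buf: &'a mut [u8], width: usize, stride: usize) -> Self {
        Self { buf, width, stride, x: 0, y: 0 }
    }

    /// Accepts the next chunk of decompressed bytes, skipping the padding
    /// automatically at the end of every line.
    fn write(&mut self, mut src: &[u8]) {
        while !src.is_empty() {
            let line_left = self.width - self.x;
            let copy = line_left.min(src.len());
            let dst = self.y * self.stride + self.x;
            self.buf[dst..dst + copy].copy_from_slice(&src[..copy]);
            src = &src[copy..];
            self.x += copy;
            if self.x == self.width {
                self.x = 0;
                self.y += 1;
            }
        }
    }
}
```

Of course a real implementation would also have to resolve LZ77 back-references against the visible bytes it has already written rather than against the padded buffer, but that is exactly the point: you control both sides.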

Then there’s the question of consistency. You cannot be sure that a new version of zlib will produce the same output as its previous version (I vaguely remember a small scandal when GitHub release archives were re-generated using a different deflate implementation, resulting in a lot of headache because the old archive hashes were no longer valid; there’s also the story of some Linux distros replacing zlib with zlib-ng and getting failed tests as a result; and even the “no compression” format may change, apparently). The case of liblzma is probably a good demonstration of other reasons why it’s not always wise to rely on third-party components.

And finally, you can not merely adapt the interface to your needs, you can tune it to handle your data better too. There’s a reason why compressors targeting e.g. genome data exist. So when you compress image data, it may be beneficial to search for matches around the position right above the current line first, and the presets for the compression levels may be tuned to a different set of trade-offs. After all, deflate is often used in screen-capture codecs where real-time performance is more important than the compression ratio. But who can imagine people tinkering with the encoder trying to improve its performance in a multimedia project?
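
As a toy illustration of that image-oriented trick (not taken from any actual encoder), the matcher can probe the position exactly one line above before falling back to its usual hash-chain search:

```rust
/// Probe the "one line above" candidate first: for image data the byte at the
/// same column of the previous line is a very likely match. Returns
/// (distance, length) if a usable match was found. `stride` is the length of
/// one line in bytes; deflate limits distances to 32768 and lengths to 258.
fn match_above(data: &[u8], pos: usize, stride: usize) -> Option<(usize, usize)> {
    if pos < stride || stride > 32768 {
        return None;
    }
    let cand = pos - stride;
    let limit = 258.min(data.len() - pos);
    let mut len = 0;
    while len < limit && data[cand + len] == data[pos + len] {
        len += 1;
    }
    // deflate cannot code matches shorter than three bytes
    if len >= 3 { Some((stride, len)) } else { None }
}
```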

I hope this helped to convince you that there are circumstances where NIHing something may prove worthwhile. As a rule of thumb, if it’s easier to implement something yourself than to re-use an existing library, then maybe you should do so. That is the right way; down the other way lies left-pad.

Some words about Alien Incident CDA2

Wednesday, December 11th, 2024

Some of you might’ve heard of a Finnish adventure game called Alien Incident. Apparently it had a CD release with the intro and final cutscenes re-done in a new animation format. Of course that’s enough to draw my attention.

It turned out that it has several features making it one of the most advanced game formats: multiple soft subtitles and per-frame checksums. The file header is followed by a text block with subtitles in several languages, which are rendered by the player over the video (and the font itself is RLE-compressed). Then there’s a table of 32-bit checksums, one per frame, and only after that comes the frame size table followed by the actual frame data.

Video is coded in 32×40 tiles with one of several coding modes: read a full tile, update marked pixels in the tile, update 2×2 blocks in the tile, or update 4×4 blocks in the tile. That data is further compressed with a slightly modified LZSS. Additionally, frames carry audio data and some additional commands such as fading or setting the frame duration. By default the player operates on a 70 or 75 Hz clock and each frame is displayed for several ticks (usually 5–8); a special command in the frame data tells the player to display each frame for N ticks until further notice.
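
If I were sketching a decoder for it, the frame-level bits would look roughly like this (the names are mine and the actual opcode values are not reproduced; the modes and the tick arithmetic follow the description above):

```rust
/// Tile update modes as described above; the actual opcode values from the
/// bitstream are deliberately not reproduced here.
enum TileMode {
    FullTile,     // read a complete 32x40 tile
    MaskedPixels, // update only the pixels marked by a mask
    Blocks2x2,    // update selected 2x2 blocks
    Blocks4x4,    // update selected 4x4 blocks
}

/// Frame duration implied by the tick-based timing: e.g. 4 ticks on a 75 Hz
/// clock give the 18.75 fps mentioned below for the finale.
fn frame_duration_secs(ticks: u32, clock_hz: u32) -> f64 {
    f64::from(ticks) / f64::from(clock_hz)
}
```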

And now it is time for a rant. There is an issue uncovered by decoding these files. Both files are long (the finale is over 11 minutes at 18.75 fps) and have lots of palette changes (because of both scene changes and fading)—and those two things helped uncover a problem with FFmpeg. Its AVI demuxer is the culprit: it scans the index, finds palette change chunks (which I explicitly mark with the AVIIF_NO_TIME flag), adds them to the video stream statistics (despite the flag), concludes that the video stream has significantly more frames than the audio stream, and switches to non-interleaved mode; in that mode it disregards the actual index contents and treats palette change chunks as video data. Personally I don’t care, because I have my own set of tools for decoding, transcoding or playing videos that does not have this problem, but considering that virtually every other piece of software uses libavformat for handling input data, this may pose a problem for everybody else (I can afford not to care, but somebody else would have to change perfectly valid output just to work around a third-party bug). This is a fun case demonstrating that monopoly is evil, even if it’s a monopoly of open-source software.
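
For reference, this is roughly how such a palette change chunk is represented in the classic ‘idx1’ index (the AVIIF_NO_TIME value comes from the AVI specification; the structure and helper here are a simplified sketch rather than my muxer’s actual code):

```rust
/// One entry of the AVI 'idx1' index (all fields are little-endian on disk).
struct IndexEntry {
    chunk_id: [u8; 4], // e.g. b"00pc" for a palette change chunk of stream 0
    flags: u32,
    offset: u32,       // position of the chunk relative to the 'movi' list
    size: u32,         // chunk payload size
}

/// AVIIF_NO_TIME: the chunk does not affect timing and should not be counted
/// as a frame by demuxers.
const AVIIF_NO_TIME: u32 = 0x0000_0100;

fn palette_change_entry(stream_no: u8, offset: u32, size: u32) -> IndexEntry {
    IndexEntry {
        chunk_id: [b'0' + stream_no / 10, b'0' + stream_no % 10, b'p', b'c'],
        flags: AVIIF_NO_TIME,
        offset,
        size,
    }
}
```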

P.S. It’s probably a good occasion to remind you that librempeg exists, and both it and you can probably benefit from you paying some attention to it.