Archive for the ‘Useless Rants’ Category
Woes of implementing MPEG-4 ASP decoder
Friday, October 11th, 2024
So, for the last month or even more (it feels like an eternity anyway) I was mostly trying to force myself to write an MPEG-4 ASP decoder. Why? Because I still have some content around that I’d like to play with my own player. Of course I have not implemented a lot of the features (nor am I going to do that) but even what I had to deal with made me write this rant.
Looking at Winamp codebase
Friday, October 4th, 2024
Breaking news from the Slowpoke News Channel™: a source code base for Winamp has been released (just last month). So it’s a good occasion to talk about it and what interesting (for me) things can be found in the third-party libraries.
I think I used the software a bit back in the day when MP3 was still the rage (and there were CDs sold proudly featuring MP3 at 256kbps) and I still was using Windows (so around 1998-1999). I even briefly tried K-Jofol player (does anybody remember that?) but I didn’t see much point in that. Around that time I finally switched to Linux as my main OS and started using XMMS and XMMS2 (I actually met one of its developers at FOSDEM once—and saw a llama or two when I visited a zoo but that’s beside the point). Also there was a plugin for XMMS2 that added VQF support (again, nowadays hardly anybody remembers the format but it was an interesting alternative; luckily Vitor Sessak reverse engineered it eventually). But with time I switched to MPlayer for playing music and nowadays I use my own player with my own decoders for the formats I care about (including MP3).
But I wanted to talk about the code, not about how little I care about the program.
First fun thing is that the source code release looks like somebody was lazy and thinking something similar to “let’s just drop what we have around and tell people not to do much with it—it’ll create some hype for us”.
Second fun thing is that it fails to live up to the name. As should be obvious, the name comes from AMP—one of the earliest practical MP3 decoders (the alternatives I can remember from those times were dist10 or the rather slow Fraunh*fer decoder). And of course WinAMP uses mpg123 for decoding instead (I vaguely remember that they switched the decoding engine to the disappointment of some users but I had no reason to care even back then).
But the main thing is that they’ve managed to do what Baidu failed to do—they’ve made VP5 decoder and VP6 codec open-source. Of course it may be removed later but for now the repository contains the library with the traditional Truemotion structure that has VP5 decoder as well as VP6 decoder and VP6 encoder. So those who wanted an open-source VP6 encoder—grab it while it’s still there (and I still have my own implementations for either of those things).
Out of curiosity I looked at the encoder and I was not impressed. It has more features (like two-pass encoding) and more refined rate control but it does not look that much better. I wonder what Peter Ross could say about it, being a developer of a popular and well-respected encoder for a codec with rather similar structure.
Overall, the code base looks like a mess with no clear structure, with most libraries shoved into one directory without any further attempt to separate them by purpose. But it does not matter as it was not intended for large collaborative efforts and two or three programmers could live with it.
Still, let’s see if something good comes from this source release.
On over- and under-engineered codecs
Tuesday, September 10th, 2024
Since my last post got many comments (more than one counts as many here) about various codecs, I feel I need to clarify my views on what I see as an over-engineered codec as well as an under-engineered one.
First of all, let’s define what does not make a codec an over-engineered one. The sheer number of features alone does not qualify: the codec may need those to fulfill its role—e.g. Bink Video had many different coding modes but this was necessary for coding mixed content (e.g. sharp text and smooth 3D models in the same picture); and the features may have been accumulating over time—just look at those H.26x codecs that went a long way, adding features at each revision to improve compression in certain use cases. Similarly it’s not the codec complexity per se either: simple methods can’t always give you the compression you may need. So what is it then?
Engineering in essence is a craft of solving a certain class of problems using practical approaches. Yes, it’s a bit vague but it shows the attitude, like in the famous joke about three professionals in a burning hotel: an engineer sees a fire extinguisher and uses it to put out fire with the minimum effort, a physicist sees a fire extinguisher, isolates a burning patch with it and studies a process of burning for a while, a mathematician sees a fire extinguisher, says that there’s a solution and goes to sleep.
Anyway, I can define an over- or under-engineered codec by its design effectiveness i.e. the amount of features and complexity introduced in relation to the achieved result as well as the target goal. Of course there’s rarely a perfect codec so I’ll use a simpler metric: a codec with several useless features (i.e. those that can be thrown out without hurting compression ratio or speed) will be called over-engineered and a codec which can be drastically improved without changing its overall design will be called under-engineered. For example, an RLE scheme that allows run/copy length of zero can be somewhat improved but it’s fine per se (and the decoder for it may be a bit faster this way); an RLE scheme that uses zero value as an escape with real operation length in the following byte is under-engineered—now you can code longer runs but if you add a constant to it you can code even longer runs and not waste half of the length on coding what can be coded without an escape value already; and an RLE scheme that allows coding the same run or copy in three different ways is over-engineered.
And that’s exactly why XCF is the most over-engineered format I’ve ever seen. Among other things it has three ways to encode source offset with two bytes: signed X/Y offsets, signed 16-bit offset from the current position or an unsigned 16-bit offset from the start. And the videos come in either 320×200 or 320×240 size, so unless you have some weird wrap-around video you don’t need all those addressing modes (and actually no video I’ve tried had some of those modes used). Also since the data is not compressed further you can’t claim it improves compression. Speaking of which, I suspect that wasting additional bits on coding all those modes for every block in every frame negates any potential compression gains from specific modes. There are other decisions of dubious usefulness there: implicit MV offsets (so short MVs are now in range -4,-4..11,11 for 8×8 blocks and -6,-6..9,9 for 4×4 sub-blocks), randomly chosen data sources for each mode, a dedicated mode 37 that is absolutely the same as mode 24 (fill plus masked update) and so on.
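To illustrate how much the three addressing modes overlap, here is a sketch of the three two-byte interpretations listed above. It is not the actual XCF bitstream: the byte order, the order of the X and Y bytes, the 320-pixel stride and all function names are assumptions chosen for the example.

```python
import struct

WIDTH = 320  # assumed frame stride in pixels (the 320x200 case)


def src_from_xy(cur: int, raw: bytes) -> int:
    """Two signed bytes read as (dx, dy) relative to the current pixel."""
    dx, dy = struct.unpack("bb", raw)
    return cur + dy * WIDTH + dx


def src_from_relative(cur: int, raw: bytes) -> int:
    """Signed 16-bit linear offset from the current position."""
    (off,) = struct.unpack("<h", raw)
    return cur + off


def src_from_absolute(cur: int, raw: bytes) -> int:
    """Unsigned 16-bit linear offset from the start of the frame."""
    (off,) = struct.unpack("<H", raw)
    return off


if __name__ == "__main__":
    cur = 10 * WIDTH + 20      # current pixel (20, 10)
    target = 8 * WIDTH + 17    # source pixel (17, 8): two rows up, three left
    # The same source position expressed in all three modes.
    print(src_from_xy(cur, struct.pack("bb", -3, -2)) == target)
    print(src_from_relative(cur, struct.pack("<h", target - cur)) == target)
    print(src_from_absolute(cur, struct.pack("<H", target)) == target)
```

A 320×200 frame has 64000 pixels, so in this sketch the unsigned form alone already reaches every position of such a frame, and each extra mode only adds per-block signalling on top.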
Of course there are more over-engineered codecs out there, I pointed at Indeo 4 as a good candidate in the comments and probably lots of lossless audio codecs qualify too. But my intent was to show what is really an over-engineered codec and why I consider XCF to be the worst offender among game video codecs.
As for under-engineered codecs, for the reasons stated above it’s not merely a simple codec, it’s a codec where a passerby can point out a thing that can be improved without changing the rest of the codec. IMO the most fitting example is Sonic—an experimental lossy/lossless audio codec based on Bonk. Back in the day when we at libav discussed removing it, I actually tried evaluating it and ended with encoded files larger than the original. And I have a strong suspicion that simply reverting the coding method to the original Bonk or selecting some other sane method for residue coding would improve it greatly—there’s a reason why everybody uses Rice codes instead of Elias Gamma. Another example would be MP3—there’s a rumour that FhG wanted it to be purely DCT-based (as AAC) but out of consideration for a patent holder it had to keep QMF, making the resulting codec more complex but less effective.
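The remark about Rice codes versus Elias Gamma can be checked with a small bit-count comparison. This is only a sketch and not the residue coder of Sonic or Bonk: the sample residues and the Rice parameter are made up, and the signed-to-unsigned mapping is one common choice among several.

```python
def zigzag(v: int) -> int:
    """Map signed residues to non-negative integers: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * v if v >= 0 else -2 * v - 1


def rice_bits(n: int, k: int) -> int:
    """Rice code length for n >= 0: unary quotient, one stop bit, k remainder bits."""
    return (n >> k) + 1 + k


def elias_gamma_bits(n: int) -> int:
    """Elias gamma code length for n >= 1: floor(log2 n) zeros plus the binary value."""
    if n < 1:
        raise ValueError("Elias gamma codes positive integers only")
    return 2 * (n.bit_length() - 1) + 1


def total_rice(residues, k: int) -> int:
    """Total bits for all residues with a fixed Rice parameter k."""
    return sum(rice_bits(zigzag(v), k) for v in residues)


def total_gamma(residues) -> int:
    """Total bits with Elias gamma; +1 because gamma cannot code zero."""
    return sum(elias_gamma_bits(zigzag(v) + 1) for v in residues)


# Made-up residues of similar magnitude, as a predictor might leave behind.
SAMPLE = [-40, 37, -52, 45, 60, -33, 48, -55]

if __name__ == "__main__":
    print("Rice, k=6:  ", total_rice(SAMPLE, 6), "bits")
    print("Elias gamma:", total_gamma(SAMPLE), "bits")
```

The design point is that Rice coding spends a fixed k bits on the low part and only a short unary prefix on the rest, so it is cheap when residues cluster around a known magnitude, while gamma pays roughly two bits per magnitude bit for every value. For very small values (e.g. mostly zeros) with a badly chosen k the comparison can flip.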
P.S. The same principles are applicable to virtually everything, from e.g. multimedia containers to real-world devices like cars or computers, but I’ll leave exploring those to the others.
Self-fulfilling excuses
Saturday, August 31st, 2024
There’s this old but annoying (and sometimes infuriating) thing called self-fulfilling excuses. Essentially it’s when somebody uses a deferred reasoning “this thing will not work, because I’ll be stonewalling it until it fails, and then I can proudly claim that I was right”. Here I’ll give three examples of it.
First, there’s the tired “masks won’t work against spreading COVID-19”. From the facts I know even WHO finally admitted that the virus is airborne, so using a mask to protect yourself against it by filtering it out of the air you breathe sounds like a reasonable precaution (especially since virus size and mask filtering capabilities are known). And yet indeed masks were not an effective measure because of so many opinionated idiots who refused to wear them. Just this year alone two random people shouted at me because they saw me wearing a mask (I don’t know who loves wearing them but I keep doing that in public confined places like shops or public transport because my health is not good enough as it is already).
Second, there’s this recent drama about Rust in Linux kernel. I can’t tell whether Rust really belongs there or if it was a mistake admitting it there in the first place—let the technical points be discussed by the experts—but the recent resignation of one of the maintainers demonstrated that the main reason why Rust code can’t work in Linux kernel is that certain maintainers are against it. And you know what, if the rest of Rust kernel writers would leave and create their own kernel it may be for the best—I always supported having an alternative (this would be a good place to insert an advertisement for librempeg but Paul hasn’t provided one).
Third, there’s the situation with American help for Ukraine. Just yesterday russian plane sent bombs to my home city, killing civilians and destroying residential buildings. The most reasonable action would be to allow Ukraine to use the provided missiles to strike those planes right at their airbases, eliminating or reducing the threat both directly and indirectly (as russians would have to operate the remaining planes from farther airbases, making their further strikes less effective). Instead USA forbids using its missiles for that purpose. Among the reasons quoted were that there are too few missiles USA can give anyway (right, nobody knows how to allocate scarce resources) and that there’s no strategic advantage to that (link provided as a proof that I’m not the one making it up). And considering that russians are taking such threat into account and have started building plane shelters on the airbases closer to Ukrainian borders, that will turn out true.
Apropos, this is an appropriate place to spotlight this saying:
If you’re looking for any solutions, you won’t find them here. This is an innate thing to many humans so I don’t believe it can be eliminated without cardinal changes to human nature (and we don’t know what and how to change), and while theoretically the consequences can be mitigated by reasonable discussions and rational decisions, those are not common things either (and what I said above about human nature applies here as well). In other words, this world is imperfect and sucks a lot. I have more trite things to say but that’s enough for now.
On some German town names
Monday, August 19th, 2024
One of my hobbies is travelling around and during my travels I see names that I find amusing like Geilhausen which can be translated as “Gay/kewl housing” or Waldfischbach (which translates to “forest fish brook” and it makes me think about what special kinds of fishes live in the forest). But today I want to talk about more IT-related town names.
Travelling from here to Switzerland first you encounter Vimbuch (“das Buch” is “book” in German and the town is named after St. Find. Really!). The only problem is that they had to wait over seven centuries in order for people to understand that it’s not named after an editor manual.
Twenty kilometres from it you can find Urloffen (“offen” is “open” in German). One can only wonder what its inhabitants did for over eight centuries before hyperlinks were invented.
Another thirty kilometres to the south you can find Rust.
And if you continue following the road you may end in Switzerland and see Speicherstraße in canton St. Gallen (“der Speicher” is “storage” e.g. HDD in German; the road is not impressive at all even compared to the old PATA buses). But let’s not go there, it’s a neutral place. After all, there’s Speicher near Bitburg and you know it’s a very safe storage if they build a castle for a single bit.
But speaking of the programming languages, there’s a place called Perl in the remotest rural place of Saarland (a federal land often regarded to be exactly the remotest rural piece of Germany in the eyes of people not living there). The fun thing is not only the fact that it borders Schengen Area (yes, that town is right across the river) but also that the rail line continues to Apache in France (since French famously suck at spelling, they forgot a letter in the web server’s name).
And those are all the names I can remember immediately, not counting the mountain range Eifel, but who remembers a programming language named after it anyway?
Twenty years
Wednesday, August 14th, 2024
On August 14th 2004 Mike Melanson committed to FFmpeg a decoder for TSCC codec that I sent to him. So on that day my official story of contributing to open-source multimedia started.
A lot has changed since then and not always for the better. Back then there were plenty of formats to reverse engineer (with quite a demand for them as well), nowadays it feels like there’s one standard video codec (or two, if you see AV1 as different enough) and Fraunh*fer Alternative-to-Opus Audio Codec (or ***-AAC for short). And the complexity increased so that it’s increasingly hard to implement a codec alone.
From the software side things have changed as well, I’ve ranted about it enough. To put it simply, I’ve not been participating in FFmpeg (or libav) development for the last decade and I don’t regret it at all.
And yet I still have the curiosity about various multimedia formats so I keep investigating them and sharing the results with the public. I’m not sure I’ll find enough material to last for another twenty years but there are still enough obscure games out there. Occasionally my work comes in handy for somebody else and it makes me happy (not as happy as with my first contribution, that happiness lasted for a week—but I was much younger and very inexperienced back then). That’s both the curse and the blessing of being an expert on a very narrow topic: hardly anybody needs you but when they do they have no other option. So I’ll keep providing what I can as long as possible and let’s see how it ends…
Web of Bullshit
Tuesday, July 16th, 2024
I’d rather write about the current state of the world in general, how russians proved once again they don’t deserve to be called humans, how only an idiot would trust their word or believe they’re going to keep any agreement, how the general attitude of dealing with russia looks like somebody attempting to cure a disease so that the treatment does not cause any discomfort even if it allows that disease to progress until it’s too late to cure it… But I’ve written all about it previously so I’ll write on a related but less crucial topic.
I ranted about the state of Firefox less than two weeks ago. And what do you know, version 128 turned out to be even worse in its attitude to the users. One could wonder how it can get worse but apparently Firefox CTO decided to give a public justification of their decision. So their answer to the war with the annoying advertisements is making sacrifices of the users’ liberties in hope that the aggressor will be satisfied with that (it always worked fine in the real world as can be seen by World War II and the ongoing World War III).
The sad thing is that the advertising is responsible for the current web of bullshit, here’s a short review.
John Wanamaker allegedly said “I am convinced that about one-half the money I spend for advertising is wasted, but I have never been able to decide which half.” It’s hard to disagree with it (except that the share of effective advertising feels much lower these days) and that’s the root of the current problems.
Considering that a lot of the first domains in the Web belonged to the large companies no wonder ads were present there from the very beginning (a small example: one of the oldest pages on The Wayback Machine is for http://www.ads.digital.com). But the real boom of advertising started when lots of ordinary people started to frequent it and various companies felt that there is money to be made off that. Add rather unscrupulous website designers and you get the (first) dark ages of Internet: annoying Flash banners, pop-ups and pop-unders, blinking text and so on. There’s the first bullshit tendency for you—putting as many ads on a page as the browser can render. And coming with it the second bullshit tendency—inflating content for accommodating more ads. Well, if you give people a way to profit off advertisement placement somebody is going to abuse it to death.
And then somebody came up with the main bullshit idea: advertising can be targeted! Theoretically if you know enough about the person (or at least their actions and habits) you can offer that person only the relevant ads thus making the success rate close to 100%. In practice it does not work because people do not work like this at all (banner blindness exists, people usually get too scared when their “smart” device starts recommending them something they talked about in its presence, many people really want different things from what they believe they want and so on; and that’s not counting how the common pattern for recommendations is “you bought an electric stove recently, that means you want to buy another electric stove”). And this bullshit stimulated the growth of privacy violations and social networks. But I repeat myself.
So that’s all fine for the ad networks who can feed this bullshit to the entities placing those ads (as well as another one that those ads will be shown only to the target groups selected by them). Now it was time for people trying to earn money from displaying those ads (voluntarily or not) to learn that earning much from those ads is bullshit. Advertising on streaming platforms gets more and more aggressive but looks like for the content creators the main revenue source is subscriptions and donations but never the share from the ads provided by the platform (partner deals to place specific ads directly in the video may be a different case, you should know those MMOs and VPN services by heart now). Small blogs also seem to live off subscriptions and donations with an occasional native advertisement.
But of course there must be people who decide to automate the process as much as possible to get those vanishingly small amounts of money per ad click for millions of clicks. That’s how we get bullshit generated just to lure people to click on the ads (still talking about the Web and not, say, mobile games BTW) and even bot networks to click ads on bot-generated pages that were placed by the ad-bots. Some call it the Dead Internet Theory, I called it right in the title.
But it’s not all that bad, sometimes things get better: browsers learned to block pop-ups even without a separate plug-in, Flash was killed (maybe because a certain guy could not control it on his phones, or it made them look under-performing—in either case they both are dead now), there are certain legal restrictions for advertising in the Internet even in the USA let alone EU and there are ad-blockers. The main disadvantage is that major browsers are controlled by the companies depending on ad revenue (and A**le, where ads are merely a part of iExperience), Mozilla joining them recently. So it’s natural for them to try offering more data to the advertisers and restrict ad-blockers as much as possible (does anybody believe that things like Manifest V3 have any different intent?). We see the first step done by Mozilla already, crippling uBlock Origin looks like a matter of time. At least it should help Ladybird, Servo and maybe some Firefox forks to develop faster.
Suicide by thousand cuts
Friday, July 5th, 2024
Firefox has finally upgraded for me to the localhost version and apparently the developers (or rather some other “creative” people, I suspect) decided to make it more secure by being unusable.
The first thing I noticed is that the last tab refused to close and now you need to close the window—that’s annoying. Then I noticed that bookmarks have disappeared from the toolbar and no matter what you do you can only get an additional blank space shown—at least they’re still accessible through the menu. Then I noticed that downloads may run but they’re not reported—now that’s extremely annoying. And cherry on the top is that closed tabs cannot be restored and recent history remains blank—now that’s borderline unusable.
And apparently the reason is that I’m using the browser wrong. From what I read, they decided to “protect” user data by introducing a session password which you apparently need to enter at each session start. And considering that I power off (most of) my computers at night and usually launch the browser for a quick private session (usually to check news or search for something and not clutter my history with URLs from the search pages and bad results) that means unwanted annoyance many times a day. And of course since I had no reason to launch the browser in non-private mode for many months, the change went completely unnoticed (and when they got rid of XUL even I knew that in advance despite not following the news that much).
Unrequested changes (like changing GUI layout, adding Pocket and so on) build up annoyance and breaking things like this makes me consider using another browser. For now I see no real alternative (maybe one of its forks is good without me knowing it, or Servo or Ladybird will become usable for my needs), so I simply downgraded to version 126 for the time being and switched off auto-updating but I should use dillo and elinks more.
P.S. One of the reasons why I switched to my own video player was that the previous one I used also decided to “improve” user experience in suspiciously similar ways (by not doing what it did because you apparently don’t know what you’re doing and by interpreting things differently). I definitely don’t want to get into browser development (and I lack hardware for that too) but I need to consider that option…
Just a coincidence
Tuesday, June 25th, 2024
A couple of days ago I remember seeing a post that BaidUTube has started sending ads inside the video stream instead of requesting them separately. I immediately thought that re-encoding full videos would be costly and they probably would pull the same trick as Tw!tch (another company whose name shan’t be taken in vain) by inserting ad fragments into HLS or DASH playlist among the ones with (questionably) useful content.
Also a couple of days ago yt-dlp stopped downloading videos from BaidUTube in 720p for me, resorting to 360p. I don’t mind much but I got curious why. Apparently BaidUTube stopped providing full encoded videos except in format 18 (that’s H.264 in 360p) even for the old videos. The rest are audio- or video-only HLS or DASH streams.
Probably they’re just optimising the storage by getting rid of those unpopular formats and improving user experience while at it. In other words, see the post title.
P.S. I wonder if they’ll accidentally forget to mark ad segments in the playlist as such but I’ll probably see it when that happens.
P.P.S. I guess I should find another time wasting undemanding hobby. That reminds me I haven’t played OpenTTD for a long time…
The freedom* of choice
Tuesday, June 4th, 2024
Since the topic is no longer hot, I can rant on it as well.
Sometimes I get asked why I name the search company with the name starting with G (and part of Alphabet) Baidu consistently throughout my blog. There are several reasons for that, mostly it’s because since they use my work without acknowledging it I don’t see a reason to promote their name either, but more importantly, I feel the company would fit well into a totalitarian regime (on the top of course, they do not want to be mere servants). And recently they’ve proved that once again.
You should be aware of the theory of enshittification by now: at first a company caters to the users, then it shifts its focus to the suppliers and finally it starts to serve its own interests. I believe it is just a natural manifestation of shifting power balance but not the intents: companies want to have all money (control, whatever) without doing much work, users prefer to have everything as cheap as possible instead; so in order to get a hold on the market a company needs to build a user-base first, then it still has to submit to the suppliers’ wishes (since it still depends on them) until it finally gets an effective monopoly so neither the users nor the suppliers have any other option. Of course in reality there are many factors that still limit companies (sometimes EU regulations can be useful!) so it’s not as bad as it could be otherwise. But who knows, maybe we’ll see the cyberpunk future with large corporations becoming de facto states.
Anyway, back to the Internet search. Previously there was such a thing as the Internet—a gathering of different web sites and personal pages—and there was a need to find a piece of information or a web site of certain interest. Thus search services came into existence. Some were merely a catalogue of links for certain topics submitted by people, others crawled the Web in order to find new information (IMO AltaVista was the best one).
And then Internet matured and companies discovered that money can be made there. And that’s when we started to get annoying ads—large Flash banners, pop-ups, pop-unders and so on (I vaguely remember time before ads became that annoying but I hardly can believe in that myself). But the process has not stopped there, ad revenue meant that now the sites have a reason to attract users not merely to increase the visitors counter (yes, many sites had such widgets back in the day). That’s how we got another pillar of modern Web—SEO spam. Also with the technological progress we got large sites dedicated to organising user content (previously there were such things as GeoCities or Tripod but they were rather disorganised hosting services for random user homepages), including the worst of them—social networks. Eventually those sites tried to replace the whole Web—and it worked fine for most users who get their daily dose of news, recreation and social interaction from one or two of those sites.
So we have these megasites full of ads and generated nonsense or plagiarised content and Baidu had a reasonable idea of cutting out the middleman—if you stay on one site to browse mostly generated nonsense why can’t we provide it all instead of referring you to a different site and giving it the ad revenue? And if you think this idea is bad, there’s not much you can do about it—the very limited competition acts the same. Starting your own search service would require an insane amount of bandwidth and storage to do it right (even the large companies had their search quality declining for years because the content has exponential growth while storage space for even indexing it is limited, so you have to sacrifice something less popular). Mind you, if you limit the scope severely it may work just fine, it’s scaling to all Web content and for general audience that is rather impossible.
Now where does freedom* (yes, with marketing asterisk) of choice come into this picture?
I remember reading back in the day how The Party solved the problem of lacking resources to fulfil needs of people. They declared that the needs of the people are determined by the party (so if you think you should have other food beside bread, mashed eggplants and tinned sprats—well, that’s your own delusion that has nothing to do with your real needs). It feels that Web N.0 companies decided the same—for now mostly in the form of recommendations/suggestions but considering the growing limitations (like avoiding seeing ads on Baidu hosting using Baidu browser—at least they have not introduced mandatory quiz after the ads like reportedly one russian video hosting does) it may soon be about as good as in China (i.e. when you try to deviate from the prescribed path you’ll be gently returned to it and if you persist you’ll be punished—banning your Baidu account seems to be as bad as losing social credit score already). That’s the real freedom of choice—they’re free to choose an option for you and you’re free to choose to accept it (also known as Soviet choice).
Good thing is that most people don’t care and I can manage without. Bad thing is that it spreads elsewhere.
I’m talking mostly about various freedesktop.org projects, especially systemd and GNOME. In both cases the projects offered a certain merit (otherwise they would not stand out of their competition and not get support of IBM) but with time they became too large in their domain and force their choices on Linux users. For example, systemd may be a conceptually good init system but in reality it can work only with the components designed specifically for it (or do you have a better explanation for existence of things like systemd-timesyncd?). Similarly GNOME is very hostile to attempts to change GUI appearance, so when third-party developers failed to take a hint with plugins and themes breaking every minor release, GNOME developers had to explicitly introduce libadwaita and forbid any deviations from the light and dark themes hardcoded there. At least finding an alternative there is still possible.
Well, there you have it. I’m not the first to highlight the problems and I’m not proposing a universal solution to them either. But if you ever wondered why I restrict myself on many modern technologies and NIH my own multimedia framework, here’s your answer.