May « 2026 « Kostya's Boring Codec World

Archive for May, 2026

AI-coholism

Friday, May 29th, 2026

I hope I actually shan’t have more to write about this matter. But this thought needed to be written out (if only to unload it from my head).

For some reason the current state of affairs around “AI” reminds me a lot of alcohol. Mind you, it’s a useful substance (as a solvent or a disinfectant), some people may even use it for recreational purposes or to deal with stress, but there are always some people abusing it, often for no good reason, and giving it all a bad name. Also there were countries whose budget was largely dependent on alcohol over-consumption but that’s less important here.

The situation with “AI” assistants looks entirely familiar: some people use them responsibly and even achieve useful results with them, but it’s usually other people (a loud but hopefully still a minority) who abuse such tools to the point where they get high on imaginary power and behave like typical drunks: some boast how much they can consume (tokens or pints of beer), some vomit uncontrollably (you can see the results on GitHub, when it is still up of course), some suffer from neuro-toxic effects (and can’t move or code straight any longer without external support), some lose all internal restraints (there’s little difference between alcoholics raising a ruckus because their favourite liquor store is closed when they want to get a drink and AIcoholics trying to push slop to some project without caring what the change does and why the project in question doesn’t want to accept it). And occasionally you can see the entertaining stunts that would cause a normal person serious damage (like falling from a third floor or deleting a production database) but with the person in question not realising that at all.

Hopefully the situation with “AI” will mature and normalise so that the technology abusers will be shamed for their actions and become an exception instead of the current “sport fans right after the match” vibe.

Posted in Useless Rants | 2 Comments »

We have FLI at home

Friday, May 29th, 2026

Recently I’ve released na_eofdec, a tool for decoding exotic and/or obscure formats. That release included F16 format support, but recently I’ve REd another one (PC Animate Plus / 3D WorkShop animation) and there’s yet another one waiting in the queue (Reflections animation). What unites them all is that they all employ simple compression schemes (mostly RLE-based) and (besides F16) they all have a 3D modelling program associated with them. And I’ve REd these formats just by investigating the files themselves, they’re that simple.

F16 posed itself as “FLI but 16-bit” and it looks like its creators have failed to build an ecosystem out of it. I have encountered merely two demo samples at discmaster and nothing else. From the technical point of view it’s either uncompressed intra frames or the rather familiar FLI delta compression scheme with its number of skip/run opcodes per line, just with small variations.

PC Animate Plus is more interesting as it has 4-, 8- and 16-bit content compressed, at least two format versions and several compression schemes. Plus it has some additional chunks for complex operations and even metadata telling e.g. which Voc file to play along. Intra frames are RLE-compressed, inter frames usually employ FLI delta compression (with small changes of course) but there’s another mode consisting of offsets and chunks of data to update. Another interesting thing is that it does not replace old pixels but XORs them with new data instead (maybe it comes from an alternative universe where 80x86 had REP XORMOVSB instruction).
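To show what the XOR update means in practice, here is a minimal sketch in Rust. The function name and the "offset plus bytes" layout are assumptions made for illustration only, this is not the actual PC Animate Plus bitstream.

```rust
/// Applies one hypothetical "offset + data" update by XORing the new bytes
/// into the previous frame instead of overwriting the old pixels.
fn xor_update(frame: &mut [u8], offset: usize, data: &[u8]) -> Result<(), &'static str> {
    let end = offset.checked_add(data.len()).ok_or("offset overflow")?;
    // Refuse updates that would touch anything outside the frame buffer.
    let dst = frame.get_mut(offset..end).ok_or("update outside of the frame")?;
    for (pix, &delta) in dst.iter_mut().zip(data.iter()) {
        *pix ^= delta;
    }
    Ok(())
}
```

A side effect of XORing is that applying the same update twice restores the previous frame contents.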

And there’s Reflections animation. While I haven’t written a decoder for it yet, I can describe it already. I’m aware of three samples with the rather uncommon 320×256 resolution and big-endian format. The first frame is uncompressed, the rest seem to be simply “skip N 32-bit words, update M 32-bit words”. ~~Writing a decoder for it should not be that hard…~~ An update from the next day: it’s simple RLE with opcodes being skip/repeat/copy but the actual data is stored in a 4-pixel column format, so e.g. copying 8 pixel quads will result in a 4×8 rectangle.

Individually the formats are nothing to write about but together they form a group of FLI clones that is of some interest. Now that I’m done with MVS it’s either extending QuickTime support in NihAV or REing obscure formats, hopefully it will give me more material for my writing.

Posted in Various Video Codecs | No Comments »

FFhistory: first slop

Wednesday, May 27th, 2026

I keep observing the world with its “AI” evangelists suspiciously reminiscent of annoying religious missionaries (yes, I’m pretty sure I’ve heard the news from that newer part of widely circulated book that’s just under two millennia old, thank you very much) and the feats of token-wasting (name changed from “vibe-coding” to keep up with the times) like two FFmpeg rewrite attempts in Rust, probably made just to spite the Nigel (name changed to protect the guilty) formerly responsible for FFaccount, since slop in any other language would be as ~~smelly~~ secure. Since I don’t use any of those three projects, I’d rather talk about the time when FFmpeg almost got its first organic slop.

People submitting sub-par patches are no news (there was e.g. a mediocre H.264 encoder rejected for not being good for anything really, as x264 is tough competition after all; or the MS Video-1 encoder initially rejected for the same reason but later merged because it’s a feature), but this one is special because it had all the signs of the modern “AI” slop while being produced organically more than a decade ago: doing something tangentially related to the original goal (check), being lots of incomprehensible code (check), a lot of effort wasted on it (guess for yourselves).

This happened when a guy from the group Programmers Doing Awesome Things (name changed to protect the guilty) was taking part in Baidu Summer of Code (name changed to reflect company values) with his project being support for a certain audio format. What we got instead was a large library doing something more generic; in theory it could be used to decode the audio format in question but I think nobody has found out how to do that. The reaction was more “uhm, thanks” and while that student was not failed (at least that’s what a quick search tells me), the library was never merged and probably it’s been completely lost in time by now. My memory is not as bad as it was back then (yes, it’s even worse) so I can’t remember if there were actual attempts to make something out of it afterwards or all hope was abandoned outright. At least it gave us all a distinct memory and a short-lived meme of “nicknamePDAT” being used by various developers for a while.

I often think about it when I see these new projects with whatever insane amount of tokens wasted on them. They seem to include everything and then some more. For example, one of them (name withheld since I believe they don’t deserve any advertising) supports a handful of formats and compensates for that by adding a lot of features that (theoretically) would make it do anything, from game streaming to mastering IMF for broadcasting, with only a GUI missing. Another one (name withheld for the reason stated above) does not have those features but it compensates for it with the plethora of formats being supported. So if you ever thought that FFmpeg definitely needs its own vector font rendering (for e.g. SVG and PDF support because of course they’re at least planned to be supported) or that it’s not usable without 3D scene rendering capabilities then this slop is definitely for you! Also it’s fun to watch how it undoes its own progress by trying to make its “AI” developer plagiarise less (so now it’s all based on the “AI”-generated specifications that nobody can see).

You know what could really improve those projects? Actually having a point. I know that the main goal there is to make money off it (and it even works for some FFmpeg developers, so it may work here), but in order to achieve that it needs to offer potential users a solution for their problems (again, like FFmpeg started with an open-source implementation of decoding and encoding popular formats based on H.261-H.263 and grew up from there into something that most people use to decode or convert their multimedia content). And a pile of code that does everything and nothing at the same time is not it. Actually I encountered one of those projects by searching for a crate with libxvid bindings (and got only that thing in the search results, which doesn’t support even what my decoder does let alone the stuff I’d rather use libxvid for).

There was a joke about one hardware company (name not given since I forgot it) that its motto was “ready! shoot! aim!”. With modern tools people are so excited that they can shoot a lot, with minimum readying time, that they have forgotten about aiming entirely. So I’ll stand aside while the rest have fun shooting bystanders and themselves and keep doing what nobody else cares about.

Posted in FFhistory, Useless Rants | 6 Comments »

na_eofdec initial release

Monday, May 11th, 2026

Since I got lucky during the weekend with some formats, I got enough of them to release na_eofdec. This is a tool similar to na_game_tool but oriented at generic exotic and obscure formats (or Amiga ones, put them into whatever category you like). So if you’re familiar with that one (why?!) you should have no trouble with the new one either (or at least it should be the same trouble).

The motivation behind it is about the same as with the other tool: decode whatever formats I find interesting enough to implement decoders for but not interesting enough to have them supported in the main NihAV base. Also it serves as a playground for various other things (like the MOV muxer in this case, which served as the base for a more versatile muxer in NihAV).

Anyway, it is released in hope (but no expectations and definitely no guarantees) that it will be useful for some purpose for others. The release is available at its own sub-page at nihav.org (and there’s a link to it in the appropriate section of this blog too).

Posted in NihAV | 2 Comments »

NihAV: QT support enhancements

Friday, May 8th, 2026

When I have enough inspiration, I improve NihAV. When I don’t (which is the more common state for me), I RE codecs or write blog posts, so here’s one.

First of all, I’ve started adding non-raw encoders for some common QT formats. It’s not that there are no open-source encoders for them, but I do them mostly to find out how it is done and maybe learn something new in the process. For instance, RLE encoding combines skips, runs and pixel copies; this raises the question of optimal encoding as sometimes it may be cheaper to encode a whole area as new pixels instead of a mix of copy+skip+copy. So I’ve implemented a greedy approach (i.e. code the longest skip or run and fall back to encoding raw if those two fail) as well as a slow but optimal one. It’s a variation of trellis coding: just calculate the encoding cost with each mode (skip/run/raw) to all next possible positions and if it’s lower than the existing one, use that mode; at the end simply trace back the decisions that gave the least cost and encode them in the right order.
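The optimal search described above can be sketched roughly like this. It is not the NihAV code: the cost model (one opcode byte plus payload), the `MAX_LEN` limit and all names are assumptions picked just to keep the example small.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Skip, // pixels unchanged from the previous frame
    Run,  // one pixel value repeated
    Raw,  // pixels stored as is
}

// Hypothetical cost model in bytes: one opcode byte plus the payload.
fn op_cost(mode: Mode, len: usize) -> usize {
    match mode {
        Mode::Skip => 1,
        Mode::Run => 2,
        Mode::Raw => 1 + len,
    }
}

const MAX_LEN: usize = 127; // assumed longest length a single opcode can carry

/// Finds the cheapest sequence of (mode, length) operations coding `cur`
/// against `prev` under the cost model above.
fn optimal_ops(prev: &[u8], cur: &[u8]) -> Vec<(Mode, usize)> {
    assert_eq!(prev.len(), cur.len());
    let n = cur.len();
    // cost[i] is the cheapest known way to code the first i pixels,
    // step[i] remembers the operation that ended at position i.
    let mut cost = vec![usize::MAX; n + 1];
    let mut step = vec![(Mode::Raw, 0usize); n + 1];
    cost[0] = 0;
    for pos in 0..n {
        let limit = MAX_LEN.min(n - pos);
        let skip_len = (0..limit).take_while(|&i| cur[pos + i] == prev[pos + i]).count();
        let run_len = (0..limit).take_while(|&i| cur[pos + i] == cur[pos]).count();
        for (mode, max_len) in [(Mode::Skip, skip_len), (Mode::Run, run_len), (Mode::Raw, limit)] {
            for len in 1..=max_len {
                let new_cost = cost[pos] + op_cost(mode, len);
                if new_cost < cost[pos + len] {
                    cost[pos + len] = new_cost;
                    step[pos + len] = (mode, len);
                }
            }
        }
    }
    // Trace the decisions back from the end and put them in coding order.
    let mut ops = Vec::new();
    let mut pos = n;
    while pos > 0 {
        let (mode, len) = step[pos];
        ops.push((mode, len));
        pos -= len;
    }
    ops.reverse();
    ops
}

fn total_cost(ops: &[(Mode, usize)]) -> usize {
    ops.iter().map(|&(mode, len)| op_cost(mode, len)).sum()
}
```

With this assumed cost model the pixels 1, 2, 2, 3 (all different from the previous frame) come out as a single raw operation costing 5 bytes, while coding them as raw+run+raw would cost 6.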

Then I also added an RPZA encoder. This is essentially the first texture codec, predating GPUs and their need for texture compression; its main compression mode is encoding a 4×4 block with four colours where two colours are linearly interpolated from two explicitly transmitted colours. There is no apparent way to do it fast, so I ended up with an extremely simplified scheme: first I calculate the maximum difference between components and pick the one with the largest difference (or code the block as single-colour if it’s small enough) to decide what values to pick, then I calculate explicit colours from an average of input pixels close to the minimum and maximum ends of that range. I also have a refinement step running a vector quantisation loop to adjust the ends but it’s rarely needed in practice.
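A rough sketch of that simplified endpoint selection follows. It works on 8-bit RGB instead of the 15-bit colours RPZA uses as far as I know, and the flatness threshold, the "closest quarter of the range" rule and the plain-thirds interpolation are all assumptions for illustration, not what the actual encoder does.

```rust
type Rgb = [u8; 3];

#[derive(Debug, PartialEq)]
enum BlockChoice {
    Single(Rgb),       // block is flat enough for one colour
    TwoEnds(Rgb, Rgb), // two explicit colours, the rest is interpolated
}

// Averages a non-empty set of pixels.
fn average<'a>(pixels: impl Iterator<Item = &'a Rgb>) -> Rgb {
    let mut sum = [0u32; 3];
    let mut count = 0u32;
    for p in pixels {
        for c in 0..3 {
            sum[c] += u32::from(p[c]);
        }
        count += 1;
    }
    [(sum[0] / count) as u8, (sum[1] / count) as u8, (sum[2] / count) as u8]
}

fn pick_endpoints(block: &[Rgb; 16], flat_threshold: u8) -> BlockChoice {
    // Find the component with the largest difference between its extremes.
    let (mut comp, mut lo, mut hi) = (0, 0u8, 0u8);
    for c in 0..3 {
        let min = block.iter().map(|p| p[c]).min().unwrap();
        let max = block.iter().map(|p| p[c]).max().unwrap();
        if c == 0 || max - min > hi - lo {
            comp = c;
            lo = min;
            hi = max;
        }
    }
    if hi - lo <= flat_threshold {
        return BlockChoice::Single(average(block.iter()));
    }
    // Average the pixels lying in the lowest and the highest quarter of the range.
    let quarter = (hi - lo) / 4;
    let low = average(block.iter().filter(|p| p[comp] <= lo + quarter));
    let high = average(block.iter().filter(|p| p[comp] >= hi - quarter));
    BlockChoice::TwoEnds(low, high)
}

// Builds the four-colour palette; plain thirds are used here for simplicity.
fn palette(a: Rgb, b: Rgb) -> [Rgb; 4] {
    let mix = |x: u8, y: u8| ((2 * u32::from(x) + u32::from(y)) / 3) as u8;
    let c1 = [mix(a[0], b[0]), mix(a[1], b[1]), mix(a[2], b[2])];
    let c2 = [mix(b[0], a[0]), mix(b[1], a[1]), mix(b[2], a[2])];
    [a, c1, c2, b]
}
```

The remaining encoder work (picking the nearest palette entry for each pixel and the optional refinement loop) is left out to keep the sketch short.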

There are still more encoders to implement (SMC, SVQ1, IMA ADPCM and MACE) but none of them is interesting besides SVQ1, so probably I’ll write about it when/if I ever get to implementing an encoder for it (it does not matter that The Multimedia Mike did that over fifteen years ago; NIH is there in the project name for a reason).

Now, surprisingly enough I’ve improved decoding support as well. The original QuickTime had the SIVQ codec which is a straightforward 256-entry codebook for 2×2 RGB24 tiles followed by codebook indices. I had read its binary specification some time ago and recently I was able to locate (probably the only existing) sample for it, which is a good reason to write a decoder for it. It was a well-spent five minutes of my time. Maybe in the future I’ll also do something about Pixar codecs (Ghidra works better with the raw m68k version of the decoder than with the 16-bit Windows 3.x version of the same).
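Such a codebook scheme can be decoded in a handful of lines, as in the sketch below. Only the "256 entries of 2×2 RGB24 tiles followed by indices" part comes from the description above; the byte order inside a tile, the raster order of tiles and the function name are assumptions and may not match real SIVQ data.

```rust
const TILE_BYTES: usize = 2 * 2 * 3; // one 2x2 RGB24 tile
const CODEBOOK_BYTES: usize = 256 * TILE_BYTES;

/// Decodes a frame assumed to be laid out as a 256-entry codebook followed by
/// one index per 2x2 tile, tiles going in raster order (an assumption).
fn decode_vq_frame(src: &[u8], width: usize, height: usize) -> Result<Vec<u8>, &'static str> {
    if width % 2 != 0 || height % 2 != 0 {
        return Err("dimensions must be even");
    }
    let tiles_per_row = width / 2;
    let tiles = tiles_per_row * (height / 2);
    if src.len() < CODEBOOK_BYTES + tiles {
        return Err("input is too short");
    }
    let (codebook, indices) = src.split_at(CODEBOOK_BYTES);
    let stride = width * 3;
    let mut frame = vec![0u8; stride * height];
    for (tile_no, &idx) in indices[..tiles].iter().enumerate() {
        let tile = &codebook[usize::from(idx) * TILE_BYTES..][..TILE_BYTES];
        let x = (tile_no % tiles_per_row) * 2;
        let y = (tile_no / tiles_per_row) * 2;
        // Each tile row is two RGB24 pixels, i.e. six bytes.
        for row in 0..2 {
            let dst = (y + row) * stride + x * 3;
            frame[dst..dst + 6].copy_from_slice(&tile[row * 6..][..6]);
        }
    }
    Ok(frame)
}
```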

And finally I’ve improved the support for multi-descriptor MOV files. I mentioned it some time ago and I got bitten by it again recently. For example, alice_lo_m.mov from samples.mplayerhq.hu got just the first frame decoded for me and many QuickTime 1.5 sample videos (with its developers) gave an error on the last frame. For the former it’s because the first frame is JPEG and the rest of them are SVQ1, while the latter samples are coded with Cinepak but the last frame may be a special one encoded with RPZA. And there was another file fully encoded with RPZA, but with the majority of it being 160×120 while the last dozen frames or so were 320×240. So I finally got annoyed enough to implement multiple streams per track so at least the frames get marshalled to the correct decoder, even if it leads to the partial streams being rather unusable. Maybe one day I’ll write a tool which will walk through MOV and render all tracks in correct sequence (taking edit lists into account), scaling and adjusting playback rate as needed, producing a raw MOV file that can be played without special hacks; or maybe I don’t hate myself that much.

That’s it for now, don’t expect anything soon (MVS description may appear but who’s waiting for that?).

Posted in NihAV, QuickTime, Various Video Codecs | No Comments »

Quickly about AC2

Saturday, May 2nd, 2026

I’ve finally had a look at this codec and it’s not particularly remarkable (and I did it mostly because it beats documenting minute details of MVS).

Anyway, there’s nothing much interesting about it. Like its successor, it employs parametric bit allocation to read bits from the input block. The main difference is that there are only two channels allowed and there’s a single block per channel too. A more curious thing is that there are two revisions of this codec, with revision A having simpler bit allocation while revision B has additional tables (dependent on sample rate of course) to adjust how many bits per band will be eventually read. Also unlike its successor there are no tricks to allocate a fractional amount of bits per band, it’s always an integer amount of bits.

From the coding side it seems to be a more or less straightforward MDCT with the only interesting trick being the splitting of frame data into sums and differences (not of channels but of the subsequent samples in the frequency domain) and coding them separately.

Overall, it’s a simple perceptual codec (and a half, considering revision B) that did not work that badly. And considering the claims given here, I suppose it was essentially equivalent to MPEG-1 Layer II. At least it’s more interesting than a bit of audio coding bolted to the (patented of course) DRC and named AC4.

Posted in Audio | No Comments »
