Author Archive

NihAV Game Tool: the official release

Sunday, July 21st, 2024

I’m finally not too ashamed to present a side project I’ve been wasting my time on.

The rationale behind it is simple: I sometimes write throwaway decoders in order to check if I understood a format properly or because I really want to see the decoded content. Usually they’re written in C with the same code (for dumping output as a PPM image sequence or reading a 16-bit value, say) copied over and over again. So I thought I’d borrow bits from NihAV and finally make a framework that handles output creation and provides various utilities for handling input (e.g. reading integers of different sizes and endianness). It’s still better than doing nothing and it may be marginally useful to somebody else.
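For illustration, here is roughly the kind of boilerplate that keeps getting copied around, as a minimal Rust sketch (the function names are mine and not the actual na_game_tool API):

    use std::io::{self, Read, Write};

    // read a 16-bit little-endian value
    fn read_u16le(src: &mut impl Read) -> io::Result<u16> {
        let mut buf = [0u8; 2];
        src.read_exact(&mut buf)?;
        Ok(u16::from_le_bytes(buf))
    }

    // read a 16-bit big-endian value
    fn read_u16be(src: &mut impl Read) -> io::Result<u16> {
        let mut buf = [0u8; 2];
        src.read_exact(&mut buf)?;
        Ok(u16::from_be_bytes(buf))
    }

    // dump one RGB24 frame as a binary PPM image
    fn write_ppm(dst: &mut impl Write, width: usize, height: usize, rgb: &[u8]) -> io::Result<()> {
        write!(dst, "P6\n{} {}\n255\n", width, height)?;
        dst.write_all(&rgb[..width * height * 3])
    }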

Not all of the included decoders are completely my own work: some come from ScummVM via the documentation I created for them on The Wiki (sometimes with improvements, for example ScummVM does not handle compressed sections in PMV), and the AV format comes from the Lord of the Rings engine re-implementation (again, via The Wiki).

I should also mention a nasty surprise. Apparently when AVI streams are sufficiently de-synchronised (e.g. when the format pre-buffers a second of audio before sending any video frames), the libavformat AVI demuxer (which is used by a lot of multimedia tools nowadays) switches to a different mode and treats palette change chunks as normal video data (which works wonders on a stream of course; and apparently nobody has hit this problem since I introduced palette change support there about twenty years ago; also my own player is unaffected). So for such formats I had to introduce manual audio buffering (it’s not nice but the alternatives are worse).
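The workaround boils down to something like the following sketch (not the actual code): instead of writing the whole pre-buffered second of audio up front, queue it and release roughly one video frame’s worth of samples next to each video frame so the AVI stays interleaved.

    // hypothetical helper: hold decoded audio and dole it out per video frame
    struct AudioQueue {
        samples: Vec<i16>, // queued 16-bit samples (mono assumed for simplicity)
        sample_rate: u32,
        fps: u32,
    }

    impl AudioQueue {
        fn queue(&mut self, samples: &[i16]) {
            self.samples.extend_from_slice(samples);
        }
        // take about one frame's worth of audio to write alongside the next
        // video frame (rounding is ignored for brevity)
        fn take_frame_worth(&mut self) -> Vec<i16> {
            let per_frame = (self.sample_rate / self.fps) as usize;
            let n = per_frame.min(self.samples.len());
            self.samples.drain(..n).collect()
        }
    }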

And some words about the releases and release schedule. I don’t want to bother my friend who hosts the public NihAV repositories to add another one, and I want to get involved with the usual platforms even less. As a result I’ll simply dump source tarballs with a brief changelog on the site. Releases should happen irregularly, when I accumulate, say, another dozen formats or have other features implemented (like detecting formats by regex instead of just by extension, or OpenDML AVI support for being able to output annoyingly large files).

But for now the source code along with some formal Git history is available at a NihAV special page. Grab it while it’s not that stale.

Web of Bullshit

Tuesday, July 16th, 2024

I’d rather write about the current state of the world in general, how russians proved once again they don’t deserve to be called humans, how only an idiot would trust their word or believe they’re going to keep any agreement, how the general attitude of dealing with russia looks like somebody attempting to cure a disease so that the treatment does not cause any discomfort even if that allows the disease to progress until it’s too late to cure it… But I’ve written all about it previously so I’ll write on a related but less crucial topic.

I ranted about the state of Firefox less than two weeks ago. And what do you know, version 128 proved to be even worse in its attitude to the users. One could wonder how it can get worse, but apparently the Firefox CTO decided to give a public justification of their decision. So their answer to the war with annoying advertisements is to sacrifice the users’ liberties in the hope that the aggressor will be satisfied with that (it always worked fine in the real world, as can be seen from World War II and the ongoing World War III).

The sad thing is that advertising is responsible for the current web of bullshit; here’s a short review.

John Wanamaker allegedly said “I am convinced that about one-half the money I spend for advertising is wasted, but I have never been able to decide which half.” It’s hard to disagree with it (except that the share of effective advertising feels much lower these days) and that’s the root of the current problems.

Considering that many of the first domains on the Web belonged to large companies, it is no wonder that ads were present there from the very beginning (a small example: one of the oldest pages on The Wayback Machine is for http://www.ads.digital.com). But the real boom of advertising started when lots of ordinary people started to frequent the Web and various companies felt that there was money to be made off them. Add rather unscrupulous website designers and you get the (first) dark ages of the Internet: annoying Flash banners, pop-ups and pop-unders, blinking text and so on. There’s the first bullshit tendency for you—putting as many ads on a page as the browser can render. And coming with it the second bullshit tendency—inflating content to accommodate more ads. Well, if you give people a way to profit off advertisement placement, somebody is going to abuse it to death.

And then somebody came up with the main bullshit idea: advertising can be targeted! Theoretically, if you know enough about a person (or at least their actions and habits) you can offer that person only the relevant ads, thus making the success rate close to 100%. In practice it does not work because people do not work like this at all (banner blindness exists, people usually get too scared when their “smart” device starts recommending them something they talked about in its presence, many people really want different things from what they believe they want and so on; and that’s not counting how the common pattern for recommendations is “you bought an electric stove recently, that means you want to buy another electric stove”). And this bullshit stimulated the growth of privacy violations and social networks. But I repeat myself.

So that’s all fine for the ad networks, who can feed this bullshit to the entities placing those ads (along with another bit of bullshit: that those ads will be shown only to the target groups they selected). Then it was time for the people trying to earn money from displaying those ads (voluntarily or not) to learn that earning much from them is bullshit too. Advertising on streaming platforms gets more and more aggressive, but it looks like for content creators the main revenue source is subscriptions and donations and never their share of the ads served by the platform (partner deals to place specific ads directly in the video may be a different case, you should know those MMOs and VPN services by heart by now). Small blogs also seem to live off subscriptions and donations with an occasional native advertisement.

But of course there must be people who decide to automate the process as much as possible to get those vanishingly small amounts of money per ad click for millions of clicks. That’s how we get bullshit generated just to lure people to click on the ads (still talking about the Web and not, say, mobile games BTW) and even bot networks to click ads on bot-generated pages that were placed by the ad-bots. Some call it the Dead Internet Theory, I called it right in the title.


But it’s not all that bad; sometimes things get better: browsers learned to block pop-ups even without a separate plug-in, Flash was killed (maybe because a certain guy could not control it on his phones, or it made them look under-performing—in either case they both are dead now), there are certain legal restrictions on advertising on the Internet even in the USA, let alone the EU, and there are ad-blockers. The main disadvantage is that major browsers are controlled by companies depending on ad revenue (and A**le, where ads are merely a part of the iExperience), with Mozilla joining them recently. So it’s natural for them to try offering more data to the advertisers and to restrict ad-blockers as much as possible (does anybody believe that things like Manifest V3 have any different intent?). We have already seen Mozilla take the first step; crippling uBlock Origin looks like a matter of time. At least it should help Ladybird, Servo and maybe some Firefox forks to develop faster.

A side project for NihAV

Sunday, July 7th, 2024

Since I still have nothing much to do (and messing with browsers is a bad idea unless the situation is desperate), I decided to make a NihAV-lite project. So announcing na_game_tool.

This is going to be a simple tool to convert various game and image formats (and related ones) into an image sequence, WAV or raw AVI (which can then be played or processed with anything conventional). I’ve begun work on it already but the release will happen only when I implement at least all the planned features (namely writing image sequences in BMP format, AVI output and porting the two dozen half-baked decoders I wrote to test whether I understood a format).

Why a new project? Because I have nothing better to do, it still may be marginally useful to somebody (e.g. me), I can do some stuff that does not fit into NihAV (for example, decode the 3DO version of TrueMotion video split into four files), and I don’t have to bother about other stuff that fits the demuxer-decoder paradigm poorly and requires inventing ways to convey format-specific information from the demuxer to the decoder. In my case I simply feed the input name to the input plugin and it returns frames of decoded audio or video data. Some hypothetical Antons might ask how to deal with formats that use a variable delay in milliseconds between frames instead (and I’ve implemented one such format already). To which I can reply that one can fit a square peg into a round hole in two ways—by taking a larger hole or by applying more force. The former corresponds to having, say, a fixed 1000fps rate and sending mostly the same frames just to keep the rate constant; the latter is forcing a constant framerate and hoping for the best. I chose the latter.

The design is rather simple: there’s a list of input plugins and output plugins. An input plugin takes the input name, opens whatever files it needs, outputs information about the streams and then the decoded data. An output plugin takes the output name, creates whatever files it needs, accepts the stream information and then receives and writes frames.
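In Rust terms that could look something like the sketch below (the trait and type names are my guesses for illustration, not the actual na_game_tool interfaces):

    // a decoded frame handed from an input plugin to an output plugin
    enum Frame {
        Video { stream_no: usize, data: Vec<u8> },
        Audio { stream_no: usize, samples: Vec<i16> },
    }

    struct StreamInfo { /* dimensions or sample rate, framerate and so on */ }

    trait InputPlugin {
        // open the input and report what streams it contains
        fn open(&mut self, name: &str) -> std::io::Result<Vec<StreamInfo>>;
        // return the next decoded frame, or None at the end
        fn get_frame(&mut self) -> Option<Frame>;
    }

    trait OutputPlugin {
        // create the output file(s) for the given streams
        fn create(&mut self, name: &str, streams: &[StreamInfo]) -> std::io::Result<()>;
        // write one decoded frame
        fn write_frame(&mut self, frame: &Frame) -> std::io::Result<()>;
    }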

Probably there’s a better alternative with librempeg but you’d better go read about it on Paul’s blog.

Suicide by a thousand cuts

Friday, July 5th, 2024

Firefox has finally upgraded itself for me to the localhost version (i.e. 127) and apparently the developers (or rather some other “creative” people, I suspect) decided to make it more secure by making it unusable.

The first thing I noticed is that the last tab refuses to close and now you need to close the window—that’s annoying. Then I noticed that bookmarks have disappeared from the toolbar and no matter what you do you can only get an additional blank space shown—at least they’re still accessible through the menu. Then I noticed that downloads may run but they’re not reported—now that’s extremely annoying. And the cherry on top is that closed tabs cannot be restored and recent history remains blank—now that’s borderline unusable.

And apparently the reason is that I’m using the browser wrong. From what I read, they decided to “protect” user data by introducing a session password which you apparently need to enter at each session start. And considering that I power off (most of) my computers at night and usually launch the browser for a quick private session (usually to check news or search for something without cluttering my history with URLs from the search pages and bad results), that means unwanted annoyance many times a day. And of course since I had no reason to launch the browser in non-private mode for many months, the change went completely unnoticed (whereas when they got rid of XUL even I knew about it in advance despite not following the news that much).

Unrequested changes (like changing the GUI layout, adding Pocket and so on) build up annoyance, and breaking things like this makes me consider using another browser. For now I see no real alternative (maybe one of its forks is good without me knowing it, or Servo or Ladybird will become usable for my needs), so I simply downgraded to version 126 for the time being and switched off auto-updating, but I should use dillo and elinks more.

P.S. One of the reasons why I switched to my own video player was that the previous one I used also decided to “improve” the user experience in suspiciously similar ways (by no longer doing what it used to do because you apparently don’t know what you’re doing, and by interpreting things differently). I definitely don’t want to get into browser development (and I lack the hardware for that too) but I need to consider that option…

REing another simple codec

Saturday, June 29th, 2024

Since I was bored I tried to (ab)use discmaster.textfiles.com to search for interesting (i.e. unsupported) samples once again. The main problem is that if it cannot decode the contents it does not recognise the format. So e.g. AVI files without a video track (yes, such files exist) and those using some unrecognised codec will both be marked as aviAudio format, and if the audio stream is absent or unknown as well, the file gets demoted to unknown.

So I tried to search for AVI and MOV files both by extension and by this audio-only type, and here are the categories of the results:

  • actual audio-only files (that’s expected);
  • completely different formats (there’s an alternative AVI format, and MOV is a very popular extension as well);
  • improperly extracted files (rather common with MOV on hybrid Macintosh/PC CDs where the resource fork often gets ignored);
  • damaged files (happens with some CDs and is very common with the AOL file library collection—often the AVI data starts somewhere in the middle of the file);
  • too old or poorly mastered files (for example, one AVI file lacks padding to 16-bit boundaries between chunks; some MOV files can’t be decoded even though they look correct);
  • one Escape 130 file that could’ve been supported if the libavformat AVI demuxer did not feed garbage to the decoder (it’s not just my demuxer that can handle it, old MPlayer 2 plays it fine with its own demuxer);
  • some TrueMotion 1 files that were not recognised because of the tmot FOURCC;
  • files with some special features of the known codecs (I’ve seen some MOV files containing QDraw codec with JPEG frames);
  • files with the codecs I can decode (like IPMA) but the popular software can’t;
  • files with the known codecs (some documented by me) that nobody bothered to implement (especially Motion Pixels 1 and 2);
  • and finally some AVIs with savi FOURCC and a single file with DKRT FOURCC.

Those “SuperAVI” files turned out to be rebranded Cinepak which I managed to recognise right away; the remaining file turned out to be a bit baffling. After extracting the frames I figured out that it is raw YV12 video, but for some reason it had 64 bytes of something before the image data and 440 bytes after. It can be located on the TNG Klingon Language Disc but it does not look like the software there can decode it anyway.
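In case anybody wants to look at that lone file, unpacking such a frame should be as trivial as the following sketch (assuming the layout described above: 64 mystery bytes, planar YV12, then the 440-byte tail):

    // split one frame into Y, U and V planes (returned in that order)
    fn split_yv12(frame: &[u8], width: usize, height: usize) -> (&[u8], &[u8], &[u8]) {
        let y_size = width * height;
        let c_size = y_size / 4;
        let data = &frame[64..64 + y_size + 2 * c_size];
        let (y, rest) = data.split_at(y_size);
        let (v, u) = rest.split_at(c_size); // YV12 stores the V plane before U
        (y, u, v)
    }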

Overall, nothing hard or interesting (if you don’t count the questions about the origins of that file, that is).

Just a coincidence

Tuesday, June 25th, 2024

A couple of days ago I remember seeing a post that BaidUTube has started sending ads inside the video stream instead of requesting them separately. I immediately thought that re-encoding full videos would be costly and that they would probably pull the same trick as Tw!tch (another company whose name shan’t be taken in vain) by inserting ad fragments into the HLS or DASH playlist among the ones with (questionably) useful content.

Also a couple of days ago yt-dlp stopped downloading videos from BaidUTube in 720p for me, resorting to 360p. I don’t mind much but I got curious why. Apparently BaidUTube stopped providing fully-muxed encoded videos except in format 18 (that’s H.264 in 360p), even for old videos. The rest are audio- or video-only HLS or DASH streams.

Probably they’re just optimising the storage by getting rid of those unpopular formats and improving user experience while at it. In other words, see the post title.

P.S. I wonder if they’ll accidentally forget to mark ad segments in the playlist as such but I’ll probably see it when that happens.

P.P.S. I guess I should find another undemanding time-wasting hobby. That reminds me I haven’t played OpenTTD for a long time…

A look at an obscure animation system

Tuesday, June 25th, 2024

Since I have nothing better to do, I looked at a thing I’ve encountered. There’s a system developed by some Japanese developer going by the nickname “Y.SAK” that consists of compressed bitmaps (in whatever files) and a scripting system using them for displaying animations (that’s .bca files) or even complex scripts (that’s .bac files, don’t confuse the two) that may react to the mouse, set or test variables and even invoke programs.

Of course the only part I was really interested in was those compressed bitmaps. They have a 48-byte header starting with ‘CS’ and containing the author’s copyright, then the header part of a DIB file follows (including the palette) and finally the compressed data. Apparently there are two supported compression methods—RLE and LZSS. The latter is the familiar code used in many compressors for various things, but the RLE is surprisingly interesting. Its opcode contains a copy/run flag in the top bit and either a 7-bit copy value or a 3-bit run length plus a 4-bit run value index. Maximum run length/index values mean you need to read the following byte for the real value of each. But that’s not all: note that I wrote “run value index”. There’s a table of possible run values sent before the actual compressed data and that index tells which 4-byte entry from it should be repeated for the run. Nothing revolutionary of course, but still a rather curious scheme I don’t remember seeing mentioned anywhere.
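A decoder for that RLE would look roughly like this (the exact bit layout, flag polarity and run semantics are my guesses, so treat it as a sketch rather than a specification):

    // dst receives the unpacked data; run_table holds the 4-byte run values
    // sent before the compressed data
    fn decode_rle(src: &[u8], run_table: &[[u8; 4]], dst: &mut Vec<u8>) {
        let mut pos = 0;
        while pos < src.len() {
            let op = src[pos];
            pos += 1;
            if op & 0x80 != 0 {
                // copy: the low seven bits are taken as the number of literal bytes
                let len = (op & 0x7F) as usize;
                dst.extend_from_slice(&src[pos..pos + len]);
                pos += len;
            } else {
                // run: 3-bit length plus 4-bit index into the run value table;
                // the maximum field value means "read the real value from the next byte"
                let mut len = ((op >> 4) & 0x7) as usize;
                let mut idx = (op & 0xF) as usize;
                if len == 7  { len = src[pos] as usize; pos += 1; }
                if idx == 15 { idx = src[pos] as usize; pos += 1; }
                for _ in 0..len {
                    dst.extend_from_slice(&run_table[idx]); // repeat the 4-byte entry
                }
            }
        }
    }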

And that’s why I keep digging for this old stuff.

REing non-Duck VP X1

Thursday, June 13th, 2024

While I’m still looking for a way to encode video files with large inter-frame differences in TrueMotion, I distract myself with other things.

Occasionally I look at the dexvert list of unsupported formats to see if there are any new discoveries documented there among video formats. This time it was something called VPX1.

I managed to locate the sample files (multi-megabyte ones starting with “VPX1 video interflow packing exalter video/audio codec written by…”, so there’s no doubt about it) and an accompanying program for playing them (fittingly named encode.exe). The executable turned out to be rather unusable since it invokes DPMI to switch to 32-bit mode and I could not make Ghidra decompile parts of the file as 386 assembly instead of 16-bit code (and I did not want to bother decompiling it as a raw binary either). Luckily the format was easy to figure out even without the binary specification.

Essentially the format is a plain chunked format complicated by the fact that half of the chunks do not have a size field (for the palette chunk it’s always 768 bytes, for the tile type chunk it’s width*height/128 bytes). The header seems to contain the video dimensions (always 320×240?), FPS and audio sampling rate. Then various chunks follow: COLS (palette), SOUN (PCM audio), CODE (tile types) and VIDE (tile colours). Since a CODE chunk is always followed by a VIDE chunk and there seems to be a correlation between the number of non-zero entries in the former and the size of the latter, I decided that it’s most likely a tile map plus the colours for it—and it turned out to be so.

Initially I thought it was a simple bit map (600 bytes for a 320×240 image can describe a bit map for 4×4 tiles) but there was no correlation between the number of bits set and the number of bytes in the tile colours chunk. I looked harder at the tile types and noticed that they form a sane 20×30 picture, so it must be 16×8 tiles. After studying the data some more I noticed that nibbles make more sense, and indeed only nibbles 0, 1, 2 and 4 were encountered in the tile types. So it’s most likely 8×8 tiles. After gathering statistics on the nibbles and comparing them to the tile colours chunk size I concluded that type 2 corresponds to 32 colours, type 4 corresponds to 1 colour and type 1 corresponds to 16 colours. Then it was easy to presume that type 4 is a single-colour tile, type 1 is a downscaled tile and type 2 is a tile with doubling in one dimension. It turned out that a type 2 tile repeats each pixel twice and also uses interlacing (probably so the video can be decoded downscaled on really slow machines). And that was it.
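Put into code, rendering one 8×8 tile could look roughly like this (type 0 is assumed to mean an unchanged tile and the exact scan order for type 2 is my guess):

    // colours is an iterator over palette indices taken from the VIDE chunk
    fn draw_tile(tile_type: u8, colours: &mut impl Iterator<Item = u8>, tile: &mut [[u8; 8]; 8]) {
        match tile_type {
            0 => {} // skip: keep the previous tile contents
            4 => {  // a single colour fills the whole tile
                let c = colours.next().unwrap();
                for row in tile.iter_mut() { row.fill(c); }
            }
            1 => {  // 16 colours: a 4x4 tile upscaled 2x in both directions
                for y in 0..4 {
                    for x in 0..4 {
                        let c = colours.next().unwrap();
                        for dy in 0..2 { for dx in 0..2 {
                            tile[y * 2 + dy][x * 2 + dx] = c;
                        }}
                    }
                }
            }
            2 => {  // 32 colours: each pixel doubled horizontally, rows sent interlaced
                for field in 0..2 {
                    for y in (field..8).step_by(2) {
                        for x in 0..4 {
                            let c = colours.next().unwrap();
                            tile[y][x * 2]     = c;
                            tile[y][x * 2 + 1] = c;
                        }
                    }
                }
            }
            _ => {} // no other tile types were seen in the samples
        }
    }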

Overall, it is a simple format but it’s somewhat curious too.

P.S. There’s also a DLT format in the same game which has a similarly lengthy text header, some table (probably with line offsets for the next image start) and paletted data in skip/copy form (the palette is not present in the file). It’s a 16-bit number of 32-bit words to skip/zero, followed by a 16-bit number of 32-bit words to copy, followed by the 32-bit words to be copied; repeat until the end. The width is presumed to be 640 pixels.
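Unpacking that should amount to something like the following sketch (little-endian fields assumed, no bounds checking for brevity):

    // src is the compressed payload, dst a 640-pixel-wide paletted image buffer
    fn unpack_dlt(src: &[u8], dst: &mut [u8]) {
        let mut s = 0; // position in the compressed data
        let mut d = 0; // position in the output image
        while s + 4 <= src.len() {
            let skip = u16::from_le_bytes([src[s],     src[s + 1]]) as usize * 4;
            let copy = u16::from_le_bytes([src[s + 2], src[s + 3]]) as usize * 4;
            s += 4;
            dst[d..d + skip].fill(0); // skipped words are treated as zeroes
            d += skip;
            dst[d..d + copy].copy_from_slice(&src[s..s + copy]);
            d += copy;
            s += copy;
        }
    }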

P.P.S. I wonder if it deserves support via a stand-alone library named libvpx1 or libvpx, and whether that name would be acceptable for Linux distributions.

Duck Control 1: update

Monday, June 10th, 2024

I’ve been working on the TM encoder now and then, and finally I have some things to say about it.

First of all, the general state of things: the encoder works and produces valid output for both methods 1 and 3 (the encoding is still not perfect but hopefully that can be fixed); it still lacks audio encoding (I need to add WAV reading support to the encoder and extend my decoder to test the output).

Second, I also decided to add an auto-selection option which allows the encoder to decide whether to use method 1 or method 3 for a frame. It simply picks one depending on the percentage of the most common pair and the total number of unique pairs present. It does not seem to have any practical use but it may be handy for testing decoders that expect only one coding method to be present in the stream.

And now let’s move on to the most interesting thing in this format (at least to me): codebook generation. TrueMotion (1 and 2X) is a rare example of a codec using Tunstall coding (the only other known codec to do so is CRI P256), which is essentially an inverse of Huffman coding where a fixed-length code corresponds to a variable-length sequence of symbols.

The original codebook construction goes something like this: add all symbols to the codebook, then, while space allows, replace the most probable entry with new strings using that old entry as a prefix. E.g. for the {0 1 2} alphabet (with 0 being the most probable symbol) and a size 8 codebook you initially have just the same {0 1 2}, then {00 01 02 1 2} and finally {000 001 002 01 02 1 2} (and you can add another code there to make it full).
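As a toy illustration, here’s that textbook construction in Rust (assuming fixed, independent symbol probabilities; this is not the encoder’s actual code):

    // returns up to codebook_size symbol sequences built the Tunstall way
    fn build_tunstall(probs: &[f64], codebook_size: usize) -> Vec<Vec<u8>> {
        // start with all single symbols
        let mut book: Vec<(f64, Vec<u8>)> = probs.iter().enumerate()
            .map(|(sym, &p)| (p, vec![sym as u8])).collect();
        // each step removes one entry and adds one per alphabet symbol
        while book.len() + probs.len() - 1 <= codebook_size {
            // pick the most probable entry...
            let idx = (0..book.len())
                .max_by(|&a, &b| book[a].0.partial_cmp(&book[b].0).unwrap())
                .unwrap();
            let (p, prefix) = book.swap_remove(idx);
            // ...and replace it with all of its one-symbol extensions
            for (sym, &sp) in probs.iter().enumerate() {
                let mut seq = prefix.clone();
                seq.push(sym as u8);
                book.push((p * sp, seq));
            }
        }
        book.into_iter().map(|(_, seq)| seq).collect()
    }

For probabilities like 0.6/0.3/0.1 and codebook size 8 this stops at the same seven entries as in the example above.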

Of course it’s rather impractical in this form, as not all sequences will be encountered in the data and you still need to code shorter sequences (e.g. how would you code exactly four zeroes with the above codebook?). Thus I decided to do it a bit differently: I only add new sequences without deleting old ones and I also keep (limited) statistics on the sequences encountered (from two to twelve symbols long). First I add all encountered pairs of symbols, then I select the most commonly occurring sequence and add all its known children (i.e. those with an additional pair of symbols at the end), mark it as an ineligible candidate for the following searches and repeat the process until the codebook is full. If somebody cares about implementation details, I used a trie for holding this information as it’s easy to implement and understand; during the update process I keep a list of trie nodes for the previously encountered sequences up to the maximum depth so I can update all those sub-sequence statistics in one pass over the input.

Does it make a difference? Indeed it does. I took the original LOGO.DUK (the only video with a different codebook), decoded it and re-compressed it using the default codebook all the other videos are using as well as using the one generated specifically for it. Here are the results:

  • original .duk size—2818868 bytes;
  • re-compressed file size—2838062 bytes;
  • re-compressed with file-specific codebook—2578010 bytes.

That’s using the same method 3 as the original file. With method 1, the file sizes with the standard and custom codebooks are 2622758 and 2490058 bytes respectively.

As you can see, the difference is noticeable. Of course it requires two passes over the input and many megabytes of memory to store the sequence statistics, but the results may be worth it. In theory the compression could be improved even further if you knew how to generate a codebook that allows splitting the frame data into unique chunks, but that sounds a lot like an NP-hard problem to me.

Anyway, I got what I wanted from it so it just requires some bugfixing, audio encoding support, polishing and documenting. After that I can dump its source code for all zero users and forget about Duck codecs until something even more exotic manages to re-surface.

Some words on IBM PhotoMotion

Thursday, June 6th, 2024

After a recent rant about search systems I decided to try to find any information about the format (I just happened to recollect that it’s supposed to exist). I don’t know if anybody else was luckier, but for me the search results were mentions in lists of FOURCCs, some passing references in two papers and that’s all. Now they will probably start returning more results from the multimedia.cx domain though 😉

So what should we do when generic search engines fail? Resort to the specialised ones, of course. Thanks to the content search feature of discmaster.textfiles.com I was finally able to locate a CD which uses PhotoMotion technology, with both video files and the official player (aptly named P7.EXE, I couldn’t have given it a better name myself). Even better, the video files were encoded as both AVI and MM so I could check what output to expect.

Of course Peter’s decoder can’t handle them properly because of the larger header (26 bytes instead of the usual 22 or 24 bytes) and uncompressed intra frames. But it was easy to write a simple stand-alone decoder for it and validate that both the PhotoMotion and game samples are decoded fine.

This is no major achievement of course, but at least it answers the question of what that format is all about. So even if there’s still no information about an alleged VfW decoder, now we know what to expect from it.