Archive for the 'NihAV' Category
Woes of implementing MPEG-4 ASP decoder
Friday, October 11th, 2024
So, for the last month or even more (it feels like an eternity anyway) I was mostly trying to force myself to write an MPEG-4 ASP decoder. Why? Because I still have some content around that I’d like to play with my own player. Of course I have not implemented a lot of the features (nor am I going to do that) but even what I had to deal with made me write this rant.
Announcing na_game_tool 0.2.0
Saturday, September 14th, 2024
As hinted in my previous posts, I’ve released na_game_tool version 0.2.0. Of course it is available at its dedicated page (and now there’s a link to that page in the “Multimedia Projects” link category). Here I’d like to talk about what’s changed in it and in NihAV.
Probably the most important part is support for a dozen new video formats—that’s the main task of the tool after all. I’m not going to duplicate the changelog already presented on its page, so those zero people who care may look there. At the same time it’s worth noting that I’ve moved support for three obscure formats out of NihAV (i.e. adapted the code from there and removed it from the nihav-game crate). After all, I added support for some game formats there in the first place because I had no other place for that, and I did it mainly to test whether I understood their workings right. Now that na_game_tool exists, it is a much more appropriate place for such experiments. Maybe I’ll move more formats in the future, but not those I care about (like Smacker or VMD), and I might still add new ones like game-specific AVI codecs (like IPMA or KMVC).
Now I want to talk about less important technical details. One of the improvements I’m proud of is better detection using a regular expression on the file name. It is described in a post about Nova Logic FMV formats and it solves a rather annoying problem: certain formats have different flavours that are impossible to detect from the file contents, but the naming helps distinguish them.
Also I’ve fixed all the spotted bugs in AVI output and added OpenDML support for writing large AVIs. This was needed because some Arxel Tribe games have 24-bit videos at 800×450 resolution running over a minute, so the decoded raw AVIs exceed 2GB in certain cases (and the intro and advertisement videos from Ur-Quan Masters went over the 1GB mark, ending up as ~1.4GB AVIs). Additionally I’ve added BMP output to the image sequence writer (I don’t think anybody will use it but maybe it’ll come in handy one day for analysing frame palettes).
Having said that, I’m putting the project on pause. As mentioned in the post dedicated to the first release of the tool, my intent is to make releases when it accumulates support for at least a dozen new formats (the ones moved from NihAV do not really count)—and I don’t expect to encounter enough games with undiscovered video formats in the near future. Not that I’d refuse to work on them if it’s not too annoying (see my previous post on Psygnosis games for an example of the annoying kind), but it may take a year or more to collect enough formats (plus some time to reverse engineer them), so I’d rather concentrate on other things like documenting already implemented formats.
Looking at Nova Logic FMV formats
Friday, August 23rd, 2024
So there’s this company known mostly for helicopter simulators, and those games use the KDV format for animation. That alone was enough to make me look at the formats, and what have I found?
- the original Comanche game used a simple FMV format based on PCX RLE. It was enough to just look at the file with a hex viewer to figure out all of it;
- Armored Fist used a KDV file with a different format, which I still figured out by looking at it with a hex viewer and correcting my mistakes once image reconstruction somewhat started to work. The main lucky guess was that it uses 2-bit block modes for 4×4 blocks; then came figuring out that the block modes are 1/2/4/16-colour modes with patterns;
- Werewolf vs. Comanche has almost the same format, but for some reason Armored Fist reads only 3 colours for a 4-colour block (the first colour is explicitly zero), so this one was more natural to RE first;
- Comanche Gold tweaked the format again, with the main change being that mode 3 means skip/fill/paint the new block in the same way as the previous block, using a new pattern (for patterned modes) but keeping the old colours. For this and the following format I actually had to look into the binary specification;
- and finally the Ukrainian plane simulators (F-16/MiG-29) use a 24-bit version of the format above (adding cached colours, more about it below).
The main principle remained the same for all but the first KDV format: split the picture into 4×4 blocks, use a two-bit mode for each block, pack four modes into one byte followed by the block mode data, the modes being skip, fill, 2-colour pattern, 4-colour pattern and raw, with colour 0 being used for pixels that should be left unchanged (or to signal a special mode extension). The details changed though: in the first two flavours raw mode was signalled by the “fill with 0” mode, while in the last two flavours “2-colour pattern with the same/zero colours” signals a raw block and “fill with 0” means skip. An interesting feature of the 24-bit format is that it introduced a colour cache: if the first colour byte has its top bit set, it is really an index into the cache (the low 7 bits of that byte, to be precise), otherwise it is the first byte of a 24-bit colour value which should also be added to the cache (the cache is organised as a circular buffer, so the 128th value replaces the 0th, the 129th value replaces the 1st and so on). And it still uses 0,0,0 as the “leave unchanged” value (or as a special flag for raw mode when the 2-colour pattern uses two zero triplets).
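Just to illustrate the common part, here is a rough sketch of decoding one group of four blocks in the palette-based flavours (the mode numbering, bit order and pattern layout are my guesses for illustration purposes, not the actual na_game_tool code):

    // Sketch of decoding one group of four 4x4 blocks (palette-based flavours).
    // Mode numbering, bit order and pattern layout are illustrative guesses.
    fn decode_block_group(src: &mut &[u8], out: &mut Vec<[u8; 16]>) {
        let modes = read_u8(src);
        for i in 0..4 {
            let mut blk = [0u8; 16]; // colour 0 means "leave the pixel unchanged"
            match (modes >> (i * 2)) & 3 {
                0 => {}                           // skip the block
                1 => blk = [read_u8(src); 16],    // fill with a single colour
                2 => {                            // 2-colour pattern: one bit per pixel
                    let clr = [read_u8(src), read_u8(src)];
                    let pattern = read_u16le(src);
                    for (j, px) in blk.iter_mut().enumerate() {
                        *px = clr[((pattern >> j) & 1) as usize];
                    }
                }
                _ => {                            // 4-colour pattern: two bits per pixel
                    let clr = [read_u8(src), read_u8(src), read_u8(src), read_u8(src)];
                    let pattern = read_u32le(src);
                    for (j, px) in blk.iter_mut().enumerate() {
                        *px = clr[((pattern >> (j * 2)) & 3) as usize];
                    }
                }
            }
            out.push(blk);
        }
    }
    fn read_u8(src: &mut &[u8]) -> u8 { let v = src[0]; *src = &src[1..]; v }
    fn read_u16le(src: &mut &[u8]) -> u16 { u16::from(read_u8(src)) | (u16::from(read_u8(src)) << 8) }
    fn read_u32le(src: &mut &[u8]) -> u32 { u32::from(read_u16le(src)) | (u32::from(read_u16le(src)) << 16) }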
I’ve added support for them all to na_game_tool (a couple more formats and I can make a 0.2 release before switching to a completely different activity) and I want to say some words about format detection.
As mentioned above, the main problem with the format is distinguishing the flavours. It’s very simple for the last two—Comanche Gold uses a larger header, and the 24-bit format uses a larger header as well, lacks palette chunks and stores frame data in K24i chunks (instead of the KDVi chunks used by the rest). The problem is distinguishing the first two formats without resorting to an ugly hack of trying to decode a file both ways and seeing which one sticks. That’s where a new feature of na_game_tool comes into play—detecting a format by regular expression. Since I know that most Werewolf vs. Comanche videos start with the prefix “w_” I could simply add a rule for it, just like this:
    DetectConditions {
        demux_name: "kdv-wc",
        extensions: ".kdv",
        regex:      "bumper.kdv,w_*.kdv",
        conditions: &[
            CheckItem { offs: 0, cond: &CC::Str(b"xVDK") },
            CheckItem { offs: 4, cond: &CC::Eq(Arg::U32LE(0x14)) },
        ],
    },
So files with this extension, regex and magic will be detected as this flavour of KDV, while files not matching it (but with the same extension and header) will be classified as files from Armored Fist. It’s not something useful for a general multimedia converter but it’s useful for game files—another case is the CRH format that has two flavours for two different games (differing by the hardcoded height and the number of bits for image offsets), where you also have to rely on the file names to tell which is which.
Of course I have some other changes for the upcoming na_game_tool release (maybe in September, provided I can find enough formats to add by that time) but they’re equally unexciting. Keep staying out of the loop.
Re-revisiting Actimagine VX
Friday, August 2nd, 2024
As Paul said, I probably should do something more useful (and he should do something better than still messing with the jbmpeg fork) but we’re all better at giving good advice than at following it.
Recently I was contacted by a guy who asked for my help with the VX format, as he needed the video part extracted from it and there were no tools available. He also provided some samples that helped clarify things I had got wrong previously, for example the seek table and framerate field as well as stereo audio support.
Also, since I bothered to extract the video part (the only way to determine its length is to parse it, so now I have a stripped-down version of the video decoder doing that), I decided to finally do something about the audio part. So after fixing some things (e.g. missing filter reconstruction and stereo support) my decoder can now decode VX files with recognisable audio (not perfect, but whoever can do better is encouraged to do so). The sad thing is that I could not use my old setup (a console emulator with a GDB stub) for debugging, so I’m not likely to ever improve it.
At least it was a fun diversion and it helped improve my knowledge of the format a bit.
P.S. Meanwhile I’m still working on adding new formats for na_game_tool, hopefully I’ll have something to write about another game format soon.
NihAV Game Tool: the official release
Sunday, July 21st, 2024
I’m finally proud (well, not too ashamed) to present a side project I’ve been wasting my time on.
The rationale behind it is simple: I sometimes write throwaway decoders in order to check if I understood a format properly or if I really want to see the decoded content. Usually they’re written in C with the same code (usually for dumping output as a PPM image sequence or reading 16-bit values) copied over and over again. So I thought to borrow bits from NihAV and finally make a framework handling output creation plus various utilities for handling input (e.g. reading integers of different sizes and endianness). It’s still better than doing nothing and it may be marginally useful to somebody else.
Not all of the included decoders are completely my own work: some come from ScummVM via the documentation I created for them on The Wiki (but sometimes with improvements; for example, ScummVM doesn’t handle compressed sections in PMV) and the AV format support comes from the Lord of the Rings engine re-implementation (again, via The Wiki).
I should also mention a nasty surprise. Apparently when AVI streams are sufficiently de-synchronised (e.g. when the format pre-buffers a second of audio before sending any video frames) the libavformat AVI demuxer (which is used by a lot of multimedia tools nowadays) switches to a different mode and treats palette change chunks as normal video data (which does wonders to the stream of course; apparently nobody has hit this problem since the time I introduced palette change support there about twenty years ago; also my own player is unaffected). So for such formats I had to introduce manual audio buffering (it’s not nice but the alternatives are worse).
And some words about releases and the release schedule. I don’t want to bother my friend who hosts the public NihAV repositories to add another one, and I want to get involved with the usual platforms even less. As a result I’ll simply dump source tarballs with a brief changelog on the site. Releases should happen irregularly, when I accumulate, say, another dozen formats or have other features implemented (like detecting a format by regex instead of just by extension, or OpenDML AVI support for being able to output annoyingly large files).
But for now the source code along with some formal Git history is available at a NihAV special page. Grab it while it’s not that stale.
A side project for NihAV
Sunday, July 7th, 2024
Since I still have nothing much to do (and messing with browsers is a bad idea unless the situation is desperate), I decided to make a NihAV-lite project. So announcing na_game_tool.
This is going to be a simple tool to convert various game and image formats (and related ones) into an image sequence, WAV or raw AVI (which then can be played or processed with anything conventional). I’ve begun work on it already but the release will happen only when I implement all the planned features (which means writing image sequences in BMP format, AVI output and porting the two dozen half-baked decoders I wrote to test if I understood the formats).
Why a new project? Because I have nothing better to do, it still may be marginally useful for somebody (e.g. me), I can do some stuff not fitting into NihAV (for example, decode the 3DO version of TrueMotion video split into four files), and I don’t have to bother with other stuff that fits the demuxer-decoder paradigm poorly and requires inventing ways to convey format-specific information from the demuxer to the decoder. In my case I simply feed the input name to the input plugin and it returns frames of decoded audio or video data. Some hypothetical Antons might ask how to deal with formats that use a variable delay in milliseconds between frames instead (and I’ve implemented one such format already). To which I can reply that one can fit a square peg into a round hole in two ways—by taking a larger hole or by applying more force. The former corresponds to having, say, a fixed 1000fps rate and sending mostly the same frames just to keep the rate constant; the latter is forcing a constant framerate and hoping for the best. I chose the latter.
The design is rather simple: there’s a list of input plugins and a list of output plugins. An input plugin takes the input name, opens whatever files it needs, outputs information about the streams and then the decoded data. An output plugin takes the output name, creates whatever files it needs, accepts the stream information and then receives and writes frames.
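In other words, the two sides boil down to something like this (a hypothetical sketch with made-up names and types, not the actual na_game_tool API):

    // Hypothetical sketch of the two plugin interfaces described above.
    struct StreamInfo {
        is_video: bool,
        width: usize,
        height: usize,
        sample_rate: u32,
        channels: u8,
    }
    enum Frame {
        Video(Vec<u8>),   // e.g. paletted or RGB picture data
        Audio(Vec<i16>),  // PCM samples
    }
    trait InputPlugin {
        fn open(&mut self, name: &str) -> Result<Vec<StreamInfo>, String>;
        fn get_frame(&mut self) -> Option<(usize, Frame)>; // (stream index, frame data)
    }
    trait OutputPlugin {
        fn create(&mut self, name: &str, streams: &[StreamInfo]) -> Result<(), String>;
        fn write_frame(&mut self, stream_no: usize, frame: &Frame) -> Result<(), String>;
    }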
Probably there’s a better alternative with librempeg but you’d better go read about it on Paul’s blog.
ARMovie: trying codec wrappers
Tuesday, May 7th, 2024
I’ve managed to locate two ARMovie samples with video formats 600 and 602 (which are M$ Video 1 and Cinepak correspondingly), so I got curious whether I could decode them. The problem is that ARMovie stores video data clumped together in large(r) chunks, so you need to parse a frame somehow in order to determine its length.
Luckily in this case the wrapped codec data has a 16-byte header (the first 32-bit word is either the frame size or a special code for e.g. a palette or a skipped frame, followed by some unknown value and the real codec width and height), so packetising the data was not that hard. The only problem was that the Video 1 stream was 156×128 while the container declared it as 160×128, but after editing the header it worked fine.
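The packetising itself boils down to walking the chunk and reading those wrapper headers, roughly like this (the exact field interpretation, endianness and the special-code check are my guesses here, so treat it as a sketch rather than the actual code):

    // Rough sketch of splitting one ARMovie video chunk into wrapped codec frames.
    fn split_wrapped_frames(mut chunk: &[u8]) -> Vec<&[u8]> {
        let mut frames = Vec::new();
        while chunk.len() >= 16 {
            // bytes 0..4: frame size or a special code, 4..8: unknown value,
            // 8..12 and 12..16: the real codec width and height
            let word0 = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            // treat implausibly large values as special codes (palette, skipped frame)
            let size = if word0 < 0x0100_0000 { word0 as usize } else { 0 };
            if chunk.len() < 16 + size {
                break;
            }
            frames.push(&chunk[16..16 + size]);
            chunk = &chunk[16 + size..];
        }
        frames
    }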
Supporting such wrappers actually poses more of a design question for NihAV—how to link those rather container-specific packetisers to a demuxer. Making the demuxer parse all possible formats is rather stupid, and my current solution of simply registering a global packetiser and hoping there’s no need for another one is a bit hacky. I should probably make it namespaced so that the code first probes e.g. the “armovie/cinepak” packetiser before the “cinepak” one, but it’s additional effort for no apparent gain. Speaking of which, I should probably change the code to feed the stream provided by the packetiser to the decoder (instead of always using the one from the demuxer) but since I’m lazy I’m not going to do that either.
Anyway, I’m not going to spend more time on ARMovie unless new samples for the formats I don’t support show up (besides the newer Eidos Escape codecs which are supported elsewhere already). There are other formats left to look at. For example, I’ve made good progress with the Adorage animation format.
ARMovie: towards NihAV support
Saturday, April 27th, 2024
Since I had nothing better to do I’ve documented various ARMovie codecs in The Wiki. Essentially what’s left is to document the newer Moving Blocks variants (and implement a decoder for the one with samples) and the early Eidos Escape codecs (no idea where to find samples for those though). Also there’s Iota Software codec 500 which has no samples but looks like it’s used primarily in the ACE Film format, so figuring out the nuances of the LZW compression there may help with yet another obscure format.
From the technical side the most annoying thing is that multiple frames are stored in a single chunk without any kind of size, separator or frame end marker being present. The original player simply expected decoders to report the number of bytes consumed after decoding a frame. Which means writing packetisers that parse the stream in the same way as the decoders do and report the frame size—nothing too complex but still annoying (and I had to augment the NAPacketiser API to take reported stream details into account, otherwise it has no way to know the frame dimensions). The worst one was the Moving Blocks parser. The main problem is that unlike most of the formats this one has frames not aligned to 16 bits—while video chunks are—so it has to deal with a possible padding byte at the end of a chunk.
Another annoying thing I haven’t really dealt with is detecting all those various sound formats from the format string.
I don’t know if I’ll bother adding support for the various raw video formats, but in either case I don’t regret looking at ARMovie. Even simple codecs turned out to be not so simple, sometimes with interesting peculiarities like runs of colour pairs and even four-colour pattern painting in what should’ve been a simple RLE codec.
NihAV: nothing left to do
Saturday, November 11th, 2023
If anybody read my previous posts, he might’ve picked up the notion of me complaining that there’s nothing left to do for NihAV, and it really is a problem I have.
Since the (re)start of the project in 2017 it grew from a small package that could only read bits and bytes to a collection of crates supporting various multimedia formats and a set of tools to use them. I had two principal goals: to experiment with the framework design and learn how various multimedia concepts are implemented, and also (ideally) to make an independent converter and player so I don’t have to rely on external projects for most of my multimedia needs.
HW accel for NihAV player: fully done
Saturday, October 21st, 2023
As mentioned in the previous post, I’ve managed to make hardware acceleration work with my video player and there was only some polishing left to be done. Now that part is complete as well.
The worst part was forking the cros-libva crate. Again, I could do without that but it was too annoying. For starters, it had rather useless dependencies for error handling, covering cases that either are too unlikely to happen (e.g. destroying some buffer/config failed) or are reported rather unhelpfully (i.e. it may return a detailed error when opening a device has failed, but for the rest of the operations it’s a rather unhelpful “VA-API error N” with an optional error explanation if libva bothered to provide it). I’ve switched it to enums because e.g. VAError::UnsupportedEntrypoint is easier to handle and understand when you actually care about the returned error codes.
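Roughly speaking, the difference is this (made-up variants for illustration, not the actual definitions from my fork):

    // Named enum variants are easier to act upon than a bare "VA-API error N".
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum VAError {
        UnsupportedEntrypoint,
        UnsupportedProfile,
        AllocationFailed,
        Other(i32),
    }
    fn handle_open_error(err: VAError) {
        match err {
            // fall back to software decoding instead of showing a cryptic number
            VAError::UnsupportedEntrypoint | VAError::UnsupportedProfile => {
                println!("hardware decoding is not available for this codec");
            }
            other => println!("VA-API failure: {:?}", other),
        }
    }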
The other annoying part was all the bindgen-produced enumerations (and flags). For example, surface allocation is done with:
    display.create_surfaces(
        bindings::constants::VA_RT_FORMAT_YUV420,
        None,
        width,
        height,
        Some(UsageHint::USAGE_HINT_DECODER),
        1)
In my slightly cleaned version it now looks like this:
    display.create_surfaces(
        RTFormat::YUV420,
        None,
        width,
        height,
        Some(UsageHint::Decoder.into()),
        1)
In addition to less typing it gives a better argument type check: in some places you use both VA_RT_FORMAT_ and VA_FOURCC_ values and they are quite easy to mix up (because they describe about the same thing and are stored as 32-bit integers). VAFourcc and RTFormat are distinct enough even if they get cast back to u32 internally.
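The idea can be illustrated like this (a simplified sketch with placeholder values, not the actual fork code):

    // With distinct types the compiler rejects passing a fourcc where a
    // render target format is expected, even though both end up as u32.
    #[derive(Clone, Copy)]
    enum RTFormat { YUV420, YUV422 }
    #[derive(Clone, Copy)]
    enum VAFourcc { NV12, YV12 }
    impl From<RTFormat> for u32 {
        fn from(fmt: RTFormat) -> u32 {
            match fmt { RTFormat::YUV420 => 0x1, RTFormat::YUV422 => 0x2 } // placeholder values
        }
    }
    impl From<VAFourcc> for u32 {
        fn from(fcc: VAFourcc) -> u32 {
            match fcc {
                VAFourcc::NV12 => u32::from_le_bytes(*b"NV12"),
                VAFourcc::YV12 => u32::from_le_bytes(*b"YV12"),
            }
        }
    }
    fn request_surface_format(fmt: RTFormat) -> u32 {
        fmt.into() // the cast back to the raw constant happens at the last moment
    }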
And finally, I don’t like the libva init info being printed every time a new display is created (which happens every time a new H.264 file is played in my case), so I added a version of the function that does not print it at all.
But if you wonder why fork it instead of improving the upstream: besides the obvious considerations (I forked off version 0.0.3 while they’re working on 0.0.5 already, with many underlying things being different), there’s also CONTRIBUTING.md that outright tells you to sign a Contributor License Agreement (no thanks), which would also require using their account (which was so inconvenient for me that I moved away from it over a year ago). At least the license does not forbid creating your own fork—which I did, mentioning the original authorship and source in two or three places and preserving the original 3-clause BSD license.
But enough about that, there’s another fun thing left to be discussed. After I completed the work I also tried it on my other laptop (also with an Intel® “I can’t believe it’s not GPU”, half a decade newer but still with slim chances of getting hardware-accelerated decoding via the Vulkan API on Linux in the near future). Surprisingly the decoding was slower than the software decoder again, but for a different reason this time.
Apparently accessing decoded surfaces is slow and it’s better to leave processing and displaying them to the GPU as well (or offload them into main memory in advance), but that would require too many changes in my player/decoder design. Also Rust could not optimise the chroma deinterleaving code (in the NV12 to planar YUV conversion) and loads/stores data byte-by-byte, which is extremely slow on my newer laptop. Thus I quickly wrote some simple SSE assembly to deinterleave data reading 32 bytes at once and it works many times faster. So it’s good enough and I’m drawing a line.
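For the curious, the idea looks roughly like this (a Rust sketch using SSE2 intrinsics instead of the actual hand-written assembly, x86_64 only):

    use std::arch::x86_64::*;
    // Deinterleave one row of NV12 chroma (UVUVUV...) into separate U and V
    // planes, processing 32 interleaved bytes per iteration.
    #[target_feature(enable = "sse2")]
    unsafe fn deinterleave_chroma(src: &[u8], dst_u: &mut [u8], dst_v: &mut [u8]) {
        let mask = _mm_set1_epi16(0x00FF);
        for (i, pair) in src.chunks_exact(32).enumerate() {
            let a = _mm_loadu_si128(pair.as_ptr() as *const __m128i);
            let b = _mm_loadu_si128(pair.as_ptr().add(16) as *const __m128i);
            // even bytes are U, odd bytes are V
            let u = _mm_packus_epi16(_mm_and_si128(a, mask), _mm_and_si128(b, mask));
            let v = _mm_packus_epi16(_mm_srli_epi16::<8>(a), _mm_srli_epi16::<8>(b));
            _mm_storeu_si128(dst_u.as_mut_ptr().add(i * 16) as *mut __m128i, u);
            _mm_storeu_si128(dst_v.as_mut_ptr().add(i * 16) as *mut __m128i, v);
        }
    }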
So while this has been a rather useful experience, it was not that fun and I’d rather not return to it. I should probably go and reverse engineer some obscure codec instead, I haven’t done that for quite a while.