While I’m still looking for a solution on encoding video files with large differences with TrueMotion, I distract myself with other things.
Occasionally I look at dexvert unsupported formats to see if there’s any new discovery documented there in video formats. This time it was something called VPX1.
I managed to locate the sample files (multi-megabyte ones starting with “VPX1 video interflow packing exalter video/audio codec written by…” so there’s no doubt about it) and an accompanying program for playing them (fittingly named encode.exe). The executable turned out to be rather unusable since it invokes DPMI to switch to 32-bit mode and I could not make Ghidra decompile parts of the file as 386 assembly instead of 16-bit code (and I did not want to bother decompiling it as a raw binary either). Luckily the format was easy to figure out even without the binary specification.
Essentially the format is a plain chunk format complicated by the fact that half of the chunks do not have a size field (for the palette chunk it’s always 768 bytes, for the tile type chunk it’s width*height/128 bytes). The header seems to contain video dimensions (always 320×240?), FPS and audio sampling rate. Then various chunks follow: COLS (palette), SOUN (PCM audio), CODE (tile types) and VIDE (tile colours). Since CODE is always followed by a VIDE chunk and there seems to be a correlation between the number of non-zero entries in the former and the size of the latter, I decided that it’s most likely a tile map and colours for it, and it turned out to be so.
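To make the sizes of the chunks without a size field concrete, here is a minimal Python sketch. The function names are made up for illustration, and since the layout of the size field in the remaining chunks is not described here, the sketch simply returns None for them.

```python
# Sizes of the chunks that carry no size field, as quoted above.
# All names here are hypothetical; only the numbers come from the text.
PALETTE_SIZE = 768  # always 768 bytes for the palette (COLS) chunk


def tile_type_chunk_size(width: int, height: int) -> int:
    """Size in bytes of a CODE chunk: width*height/128."""
    return width * height // 128


def implicit_chunk_size(tag: bytes, width: int, height: int):
    """Return the implied payload size for chunks lacking a size field.

    For SOUN and VIDE the size is presumably stored in the chunk itself,
    so None is returned (the field layout is not covered by this sketch).
    """
    if tag == b"COLS":
        return PALETTE_SIZE
    if tag == b"CODE":
        return tile_type_chunk_size(width, height)
    return None
```

For the 320×240 samples, tile_type_chunk_size(320, 240) gives 600 bytes.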
Initially I thought it was a simple bit map (600 bytes for a 320×240 image can describe a bit map for 4×4 tiles) but there was no correlation between the number of bits set and the number of bytes in the tile colours chunk. I looked harder at the tile types and noticed that they form a sane 20×30 picture, so it must be 16×8 tiles. After studying the data some more I noticed that nibbles make more sense, and indeed only nibbles 0, 1, 2 and 4 were encountered in the tile types. So it’s most likely 8×8 tiles (a 40×30 grid). After gathering statistics on the nibbles and comparing them to the tile colours chunk size I concluded that type 2 corresponds to 32 colours, type 4 corresponds to 1 colour and type 1 corresponds to 16 colours. Then it was easy to presume that type 4 is a single-colour tile, type 1 is a downscaled tile and type 2 is a tile type with doubling in one dimension. It turned out that a type 2 tile repeats each pixel twice and also uses interlacing (probably so that video can be decoded downscaled on really slow machines). And that was it.
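The size check that led to those conclusions can be sketched in a few lines of Python. The names are made up, the nibble order (high nibble first) is a guess that does not affect the sum, and type 0 is assumed to consume no colour data at all.

```python
# Bytes of colour data consumed per tile type, as inferred in the text.
# Type 0 is ASSUMED to take no colour data (e.g. a tile left unchanged).
COLOUR_BYTES = {0: 0, 1: 16, 2: 32, 4: 1}


def tile_types(code_chunk: bytes):
    """Yield one tile type per nibble (high nibble first is an assumption)."""
    for byte in code_chunk:
        yield byte >> 4
        yield byte & 0x0F


def expected_colour_bytes(code_chunk: bytes) -> int:
    """Predict the VIDE payload size from the tile types in a CODE chunk."""
    return sum(COLOUR_BYTES[t] for t in tile_types(code_chunk))
```

If the prediction matches the actual size of the following VIDE chunk for every frame, the type-to-size mapping is at least consistent with the data.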
Overall, it is a simple format but it’s somewhat curious too.
P.S. There’s also a DLT format in the same game which has a similarly lengthy text header, some table (probably with line offsets for the next image start) and paletted data in copy/skip format (the palette is not present in the file). It’s a 16-bit number of 32-bit words to skip/zero followed by a 16-bit number of 32-bit words to copy followed by the 32-bit words to be copied, repeated until the end. Width is presumed to be 640 pixels.
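A rough Python sketch of that copy/skip scheme follows. It assumes little-endian counts (a guess, though a likely one for a DOS game), treats skipped words as zeroes, and leaves out the text header and the table entirely; the function name and the out_size parameter are made up for illustration.

```python
import struct


def decode_skip_copy(data: bytes, out_size: int) -> bytes:
    """Decode a skip/copy stream into a zero-filled buffer of out_size bytes.

    Each record: 16-bit count of 32-bit words to skip, 16-bit count of
    32-bit words to copy, then the words to copy. Little-endian is ASSUMED.
    """
    out = bytearray(out_size)  # skipped words simply stay zero
    src = 0
    dst = 0
    while src + 4 <= len(data):
        skip_words, copy_words = struct.unpack_from("<HH", data, src)
        src += 4
        dst += skip_words * 4
        nbytes = copy_words * 4
        if src + nbytes > len(data) or dst + nbytes > out_size:
            raise ValueError("skip/copy record runs past the end of a buffer")
        out[dst:dst + nbytes] = data[src:src + nbytes]
        src += nbytes
        dst += nbytes
    return bytes(out)
```

With the presumed width of 640 pixels, out_size would be 640 times the image height, which the caller has to supply.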
P.P.S. I wonder if it deserves support via a stand-alone library named libvpx1 or libvpx, and whether such a name would be acceptable for Linux distributions.