Archive for the ‘Various Video Codecs’ Category

Looking again at LSVX

Tuesday, November 29th, 2022

Recently Paul B. Mahol asked me to take a look at LSVX codec (aka Lightning Strike Video Codec from Espre). Since the guy is working on Bink2 decoder for all of us, he deserves some respect. So here’s what I found.

Previously I took a glance on it, found that it’s based on tmndec 3.0, kinda the reference decoder for H.263 and concluded it is an ordinary H.263 rip-off and hadn’t looked further. It turned out to be more interesting though.

LSVX frames start with 5-byte header where the second byte tells the frame type (0x01, 0x08 and 0x09 are for normal frames, 0x05 is from skip frames; additionally if the first two bytes of the header are 0x78 0x01 then the header is followed by another eight bytes, usually with “lsvxv2.0” in them). Then almost normal H.263 stream follows—at least on a prolonged glance it seems to be the standard H.263 stream without any modifications of the headers (except maybe in motion vectors decoding). But there’s a catch! Key frames may have picture type 7 (reserved code in the standard) and then they’re coded with wavelets.

The wavelet coding is rather straightforward: you have a picture split into many bands, each of them is quantised and coded with a semi-adaptive binary coder. By that I mean that it uses models with fixed probabilities but it selects different context for different bits depending on previously decoded bits and in some cases the entier sets of probabilities may be switched mid-decoding. Beside that there are no special tricks like zero tree coding or fancy coefficient prediction.

Maybe I’ll write a decoder for it after all.

Looking at KQ6 Mac videos

Friday, November 25th, 2022

The terrorist country proves that it is recognized as one and keeps targeting civilians instead of fighting a war. So nothing new but it would still be nice to see its demise soon. Meanwhile I keep doing small things to distract myself from all this.

Since I have nothing better to do, I watch reviews of various games including the ones I know well. And one of those reviews mentioned that Macintosh version of King’s Quest VI: Heir Today Gone Tomorrow had peculiar intro. Actually every version of the game has something peculiar about its intro: DOS version uses Sierra’s own RLE-based SEQ format, Windows version uses standard MS Video 1 in AVI (it was my first sample with palette change messages), Amiga version is a reimplementation by Revolution Software on their Virtual Theatre engine altogether (maybe ScummVM will support it one day for all three and a half fans waiting for that). So, what’s with Macintosh version?

First of all, the files are QuickTime movies in the original Macintosh format where frame data is stored in the data fork and movie header is stored inside the resource fork. Since not all modern OSes support such files natively (or conveniently), I’ve hacked a support for such movies in MacBinary format that keeps all forks in one file. And what do we have inside?

Inside the files are video streams packed with Cinepak. One of the peculiarities is that they have palette specified in video header in the format different from the conventional MOV color atom format, let alone the fact it should not be present at all. I understand that for Cinepak and even more for Indeo 3 (I really should write an encoder for it one day) it was common to provide a palette so they rendered their output for 256-colour mode but in that mode Cinepak simply coded palette indices and here we have YUV420 output and a palette as a recommendation.

Then there’s a fun case with tracks in KQ6Movie. I understand that they split video and coded it in several tracks so they could use different palettes (and framerates as it turns out) for different segments. And those tracks are not in order. Tracks 0 and 1 seem to be the very beginning, track 2 corresponds to a scene somewhere in the middle and track 3 is the last intro scene. Other ten tracks are not in order either. Maybe there is some information hidden in the header telling the order but I’m too lazy to find it out (let alone implement).

All in all, this was unexpectedly weird.

A quick look at PDQ2

Friday, October 21st, 2022

So there’s this new service discmaster.textfiles.com for searching for files from some old disks and discs at archive.org and out of curiosity I took a look at what it offers.

While it’s physically impossible to check all hundreds of millions of files it indexes (even just videos), I took a look at one rather curious category called aviAudio. There are about six hundred files of such type found so I decided to check them to see what kind of beast that is.

Some files turned out to exactly AVI audio files—AVI files with single stream that turns out to be audio (I don’t know why but this happens). The rest of the files were AVIs with unrecognized video track (just what I was hoping for!). One set of the samples come from some game with IPMA codec, other sources mostly feature Motion Pixels video (which is horrible both to RE and to support so I’d rather not touch it), one sample was special “codec” for some russian erotic game (which means complete garbage even by russian game standards) that is really just obfuscated Indeo 5. And finally there was a single PDQ2 sample (from some Incredible Hulk game demo from 1997).

An interesting thing is that it’s a DOS game with built-in AVI demuxing and decoding of this codec. The codec turned out to be a lot like various lossless codecs (so its FOURCC probably stands for something like “pretty damn quick codec, version 2”). It splits video into 4×4 tiles and emits up to three streams stored one after another: tile opcode (copy previous tile, copy previous data with an arbitrary offset, skip tile, raw tile), variable offsets and tile pixels. Additionally frame data may be packed further with LZ77-based scheme. As I said, a rather familiar scheme for lossless video codecs and it makes me wonder whether it’s a codec developed specifically for the game(s) or it was borrowed from somewhere else.

Looking at true IVF

Thursday, October 13th, 2022

So while russia demonstrates once again what a hysterical führer and his similarly minded generals would do, I’m trying to do something to distract myself from the thoughts about the losses Ukraine suffered in last couple of days (and previous couple hundreds of days too).

Accidentally I’ve managed to find a sample in IVF format—the original streaming format for Indeo codecs. Despite the expectations it was not a renamed AVI file which made it even more interesting. So I took Ghidra and binary specification and started implementing it.

The format by itself is rather simple and the only interesting thing is that it transmits video data in passes: first it’s just bare minimum keyframes and all other frames as drop frames, then some inter frames are sent, then more data is sent and more. Unfortunately there’s no obvious field to tell where each part starts so I simply assemble frames in memory (which is not effective but still better than the original creating temporary files for assembling fragments).

Then there was a problem with decoding them: my existing decoder expected a frame with sequential structure while in this case fragments are transmitted out of order. Normally you’d get all four bands for luma plane and then a band for each chroma planes but in this case it may be one or two bands for luma plane, then chroma plane bands, then some zero padding and then the rest of the bands for luma (if they were transmitted at all). In the reference implementation the demuxer seems to parse and re-assemble the frame while I ended up writing a slightly different decoder for this scalable mode. It seems to work fine so now NihAV support for Indeo family of formats is even more complete than ever.

P.S. The Wiki had only the entry for Duck test samples format (that has the same extension) so somebody had to correct that. Which also makes me think about things like Duck AVC, AVS being an FMV game format or Chinese non-Duck AVC rip-off, IEEE AAC and so on. This confusion deserves to be written about but don’t expect me to do that.

German PEGs

Thursday, September 29th, 2022

Having said everything I could on the current political situation, I returned to looking at random codecs and I found one with a curious name.

The name is DPEG and it’s used in at least one random game I’ve never heard of. It turned out to be a rather simple tile-based codec with raw intra frames and inter frames that employ RLE and motion compensation.

So nothing interesting but then it struck me: wait a bit, I remember REing a codec with almost the same name, block-based RLE coding approach and also from a German company (a different one though).

And there’s this Fraunhofer society involved in MPEG Video and MPEG Audio (different branches though)…

So what’s the German fascination with naming codecs [A-Z]PEG and how many out of 22 possible codecs are really implemented?

A Quick Look at AVS3

Monday, August 15th, 2022

The war has shifted to a terrorist operation against Ukrainian civilians (with no change for my home city, it gets several strikes from russian territory every day regardless) and instead of threatening the world with nuclear war russia threatens the world with nuclear terrorism using the captured Ukrainian nuclear power plant. So here’s yet another attempt to distract myself from thoughts about it.

It seems that AVS3 has been standardised already (and nobody cares). So out of idle curiosity I’ve downloaded the spec from avs.org.cn (in Chinese of course, and it requires you to fill some information but accepted any garbage). So let’s look at this completely original format that has not borrowed anything neither from H.266 nor from AV1.
(more…)

A cursory glance at Daniel codecs

Sunday, July 10th, 2022

Recently Paul B. Mahol turned my attention to the fact there are codecs with human names so why not take a look at them?

The first disappointing thing is that you need to register in order to receive a link for downloading that software. Oh well, throwaway mail services exist and it’s not the first codec I’ve seen doing that (though I’ve forgotten its name already like I’ll forget about this codec). Then it’s the binary size. I remember thinking that Go2Somewhere single 15MB .dll is too much but here’s is even larger (because they’ve decided to bundle a bunch of other decoders and demuxers instead of just one codec).

In either case, what can I say about the codec? Nothing much really. They both seem to be DCT-based intermediate codecs that group blocks into slices. Daniel2 uses larger tiles in slices, probably to accommodate for the wider variety of supported chroma formats (unlike its predecessor it supports different colourspaces, chroma subsamplings and bitdepths). The claimed high (de)coding speed comes from the same approach as in VBLE (does anybody remember it? I still remember how Derek introduced its author to me at one of VDDs). Yes, they simply store coefficients in fixed amount of bits and transmit bit length for each tile to use.

The only really curious thing I’ve found is some combinatorial coding approach I’ve never seen anywhere else. Essentially it stores something like sum in a table and for each value only the number of table entries is transmitted. Actual value is decoded as (max(table[0], ..., table[len - 1]) + min(table[0], ..., table[len - 1]) + 1) / 2 and then the decoded value is subtracted from all table elements used in the calculation. I have no idea why it’s there and what it’s good for but it exists.

Overall, it was not a complete waste of time.

Looking at Aware MotionWavelets

Sunday, December 26th, 2021

I wanted to reverse-engineer and implement some wavelet codec just for the sake of it. And finally I’ve managed to do that.

Initially I wanted to finish Rududu Video codec (I’ve looked at it briefly and one of the funny things is that the opensource release of Rududu Image codec does not match the actual binary specification, even arithmetic coder is different), but it turns out there’re no samples in the usual place so I just picked something that has some samples already.

The codec turned out to employ some tricks so I had to resort to collecting debug information in order to understand band structure (all band dimensions are implicit, you need to know them and the order to decode it all successfully). Then it turned out that band data is coded in boustrophedon order instead of the usual raster scan. And finally there’s fun with scaling: vertical transform is the same as horizontal one but the output is scaled by 128. Beside that it’s rather unremarkable.

Anyway, I got slightly deeper knowledge about the inner workings of wavelet codecs and it should not bother me any longer. It’s time to slack off before doing something else.

Some words on QT Animation (SMC) codec

Tuesday, August 10th, 2021

A recent question about buggy SMC decoding led me deep into QuickTime specification to look at the codec missing opcode. And there are some noteworthy things here as well.

Back in the day there was the multimedia player for Unix called XAnim. Its last release was in 1999—before other opensource multimedia player projects have started! It was both feature-rich (e.g. it could step frames forward and backwards, something that not all current media players can do) and had an excellent codec support for the time.

Somehow its author reverse engineered (long before the era of decompilers too) a lot of codecs and somehow managed to obtain the sources for e.g. Indeo and while he could not provide them, he offered them for a wide variety of architectures—Alpha, MIPS, Sparc, PowerPC, x86. It was a treasure trove for formats and lots of the decoders were ported to other projects (even I did that for one or two codecs) and binary codecs were a great help in reverse-engineering efforts as well.

Now to SMC itself. Formally it’s QuickTime Animation codec but people call it after its FOURCC which is “smc “, probably after the author’s initials.

Opensource SMC decoders come from the same source (I based mine on the description in The Wiki but you can guess what that description is based on; and yes, back in the day e.g. MPlayer and Xine had their own decoders for various codecs before relying on libavcodec for everything). After looking at the binary specification I can say it looks exactly like it was reverse engineered from it directly (it has the same logic and data types but lacks sensible names). Anyway, the thing is that it does not handle opcode 0xF0 and I finally had an occasion to look at it.

I took QuickTime 6.3 binary specification for Windows (somehow the decoder ended in QuickTimeInternetExtras.qtx) and looked inside. It turns out that there are several decoding functions there (for different output formats) but they all do the same: handle 0xF0 opcode in exactly the same way as 0xE0 opcode (raw blocks), there are no differences there whatsoever.

That’s one mystery less, even if the answer is a bit disappointing. At least I could reminisce about good old times hardly anybody else remembers.

About upcoming AV2…

Friday, August 6th, 2021

So today I’ve seen an article titled AV2 Video Codec — Early Performance Evaluation of the Research which of course has drawn my attention.

Fun things are that it is a sponsored article and that it’s written by three engineers from ViCueSoft. This is strange, but so far it still looks more promising than the original AV1 feature review article with over 20 authors and too much marketing in it (my review of it is here; and to be fair it was followed by more serious paper with less authors but this one exists as well). Anyway, let’s see what is presented here.

I don’t care about the performance much so I just quote the phrase from the conclusion: “…rough approximation shows only 1.2x times encoding complexity increase and 1.4x time decoding”. I find the increase in decoding complexity being larger than the increase of encoding complexity a bit strange, normally you’d expect encoding difficulty rising faster because of the nature of the coding approach in modern codecs (normally an encoder needs to search for the best combination of encoding tools and their parameters and then apply the same steps as decoder does in order to have a coded frame in the same state as decoder would have it). Let’s look at the features then, it’s the most interesting part to me anyway.

  • distant weighted compound mode and dual interpolation filter are removed;
  • semi-decoupled partitioning is introduced—this feature allows splitting luma and chroma blocks and code their contents independently under certain level. The paper also says there’s Dual Tree feature in VVC that does the same;
  • quantiser step overhaul—instead of six tables in AV1 now you have just one simple formula for all quantiser step;
  • extending motion sample selection to work with compound blocks as well;
  • more partitioning modes to be more like HEVC;
  • multiple reference line selection for intra prediction—allows you to select not just neighbouring row/column for directional intra prediction. The same tool exists in VVC. And it also reminds me of X8 frames in WMV2/WMV9, that is the first case of intra prediction using more than one line known to me;
  • offset-based intra prediction refinement—adding some offset to the top/left intra predicted edge of the block to make it even smoother (the offset is calculated from the neighbouring blocks as well);
  • intra secondary transform—this tool tries to improve compression by applying a special secondary transform to the low-frequency coefficients. VVC has low-frequency non separable transform doing the same;
  • simplifications in intra mode signalling;
  • some improvements in motion prediction coding;
  • cross-component sample offset—another chroma-from-luma tool: for the whole CTU between deblocking and CDEF stages a DC offset is calculated from the luma values and applied to chroma values.

Essentially there are three kinds of improvements: simplification or generalisation of the existing feature (including complete removal of it—I approve either), picking the tool used by VVC/H.266 (that approach works but lacks originality) and an occasional improvement of an existing tool (too few and not too original). Of course nobody knows when AV2 will be declared finished and some things will surely have changed by then, but I don’t expect radical changes.

Once I said that I’ll review H.266 when AV2 is released but these guys has essentially done my work instead of me. Thanks!