Archive for January, 2022

VP7 encoding: general principles

Sunday, January 30th, 2022

It is not that hard to write a simple encoder (as I’m going to demonstrate), the problem is to make it good (and that’s where I’ll fail). Until that time I’m going to explain what I’m doing and how/why it should be done.

Starting work on VP7 encoder

Wednesday, January 26th, 2022

As I said in the previous post, currently I don’t have any features or decoders to add to NihAV (because Paul has not finished his work on Bink2 decoder yet) beside some encoders that nobody will use.

Thus I decided to work on something more advanced than VP6 that allows me to play with more advanced features (like EPZS motion estimation, per macroblock quantiser selection and such). For that I needed to pick some codec probably based on H.264 and there was not that much to pick from:

  • ITU H.264—first and foremost, I don’t have a properly working decoder for it (yet?); second, the format is too complex so just thinking about writing all those SPSes, PPSes and various lists discourages me from even considering to write an encoder for it;
  • RealVideo 3 or 4—tempting but that means I also need to write a RealMedia muxer and the format lacks dquant (in theory it’s supported, in practice it’s never happened). Maybe one day I’ll make my own NihAV-Really? encoder for RV3+Cooker but not today;
  • Sorenson SVQ3—same problems essentially;
  • VP8—Mike has done it over a decade ago;
  • VX—this is a custom game codec which is simplified (even quantiser is rather implicit).

The rough roadmap is the following:

  1. make intra-only encoder that encodes picture somehow;
  2. improve it to select the best whole macroblock prediction mode;
  3. add 4×4 prediction mode and make it select the best mode;
  4. add inter-frame support along with motion compensation;
  5. add EPZS-based motion estimation;
  6. introduce rough motion search for group of frames to determine good golden frame candidate and the macroblocks that should be coded with higher quality;
  7. actually code those macroblocks with higher quality using MB features;
  8. use trellis-based quantiser search for improved coding of frames;
  9. speed it up by using various heuristics instead of brute force search for coding parameters.

This should take some time…

Looking at SMUSH/INSANE formats

Thursday, January 6th, 2022

As some of you might know, I had an interest for various game formats for decades (and that’s one of the reasons that brought me into opensource multimedia). And those formats include videos from LucasArts games as well. Actually SMUSH is not an ordinary video format but rather a sub-engine where both audio and video are objects (background, sprites, main audio, sound effects) that should be composed into final audiovisual experience. INSANE is the next iteration of the engine that became simpler (coding full frames, only one object per frame, just one codec, 16-bit video instead of paletted one) but it shares a lot in common with its predecessor.

As expected, the main source of information about those come from ScummVM (and one of their developers made smushplay to play the files in stand-alone matter). There’s a personal story related to that: one Cyril Zorin meddled with some formats from LucasArts games and wanted to add INSANE support (for Grim Fandango but it’s the same for all other games using SNM format) in FFmpeg, sadly he could not stomach review process there (which is hard to blame him for) and abandoned it; some time later I picked it up, added support for SMUSH codecs 37 and 47 (the ones used in adventure games) and got it committed; years later Paul B. Mahol (of future Bink2 decoder fame) added VIMA audio support to it.

Yet there are more games out there and some of them use different codecs, for which details were not previously known. So I decided to finally reverse engineer them to see how the development went. My implementation in NihAV is far from being perfect (there are many issues with transparency and coordinates) but it can decode all files I could encounter with very few exceptions.

So, let’s look at the codecs used for image coding. Audio is rather boring: there’s very old PCM format in SAUD chunks, scaled PCM audio in IACT chunks and VIMA is IMA ADPCM with 2-7 bits per step.