vp6enc: slightly faster encoding mode

March 12th, 2022

As I mentioned before, I wanted to try applying the macroblock selection approach from the VP7 encoder in the VP6 encoder. Well, it was easy to implement: instead of preparing all macroblock modes and then checking which one is the best, it now tries macroblock modes one by one (starting with inter mode) and stops when the result is good enough. In this mode encoding seems to go a couple of percent faster, and the resulting size at the same quantiser can differ somewhat in both directions. You can try it yourself by using the fast encoder option.
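
The difference between the two strategies can be illustrated with a minimal sketch. This is not the actual NihAV code: the mode names, the cost callback and the `good_enough` threshold are all hypothetical and only show the control flow.

```rust
/// Hypothetical macroblock modes, listed in the order the fast path tries them.
#[derive(Clone, Copy, Debug, PartialEq)]
enum MbMode { Inter, InterFourMv, Intra }

const MODES: [MbMode; 3] = [MbMode::Inter, MbMode::InterFourMv, MbMode::Intra];

/// Old approach: evaluate every mode and keep the cheapest one.
fn select_exhaustive(cost: impl Fn(MbMode) -> u32) -> (MbMode, u32) {
    let mut best = (MODES[0], cost(MODES[0]));
    for &mode in &MODES[1..] {
        let c = cost(mode);
        if c < best.1 {
            best = (mode, c);
        }
    }
    best
}

/// Fast approach: try modes in order and stop as soon as one is good enough.
/// `good_enough` is an assumed tuning threshold, not a value from the encoder.
fn select_fast(cost: impl Fn(MbMode) -> u32, good_enough: u32) -> (MbMode, u32) {
    let mut best = (MODES[0], u32::MAX);
    for &mode in &MODES {
        let c = cost(mode);
        if c < best.1 {
            best = (mode, c);
        }
        if c <= good_enough {
            break; // skip the remaining, possibly expensive, modes
        }
    }
    best
}
```

With a low threshold the fast variant degenerates into the exhaustive one; with a high threshold it settles for the first acceptable mode, which is why the output can differ from the exhaustive choice.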

The encoder is still a failure though.

VP7 encoder release

March 3rd, 2022

While a certain country cosplays the Third Reich and conducts talvisota simultaneously—and tries to bomb my home city to debris, I still need some distraction…

borrowed from vp7.de

Since I’m rather bored with VP7 encoder I’ve decided to release it and move to something else. It should work about the same as VP6 encoder (i.e. poorly and nobody should care about it) but if you want to know what knobs you can turn just invoke nihav-encoder --query-encoder-options vp7 (but I guess the only useful options are to set bitrate/quantiser and keyframe interval).

Have fun!

Update from March 4: encoding with low quantisers should now work as well.

VP7 encoder: various bits

February 27th, 2022

As the world tries to avert attention from an insane dictator re-enacting 1939 (it gets funnier since I observe it from Germany), I should also do something to take my mind off constant worrying about my parents and other relatives in one of the Ukrainian cities under attack. Hence this significantly less unpleasant thing.

Now my encoder is conceptually done; all that is left to do is to fix a leftover bug or two, improve a thing or two, clean the code up and integrate it nicely with the rest of the nihav-duck crate by splitting off the parts it has in common with the VP6 encoder. Meanwhile I can talk about some things implemented since the last time and what wasn’t.

The Prayer

February 24th, 2022

I do not like to state my political views publicly but sadly this is the right occasion.

I’m not a religious man so I know only just one prayer, the main Ukrainian prayer:

Дякую тобі, Боже, що я не москаль.

(translation: “thank you, God, that I’m not a Russian”). We live in a sad world where I’m really grateful for that.

The problem with opensource encoders

February 20th, 2022

Disclaimer: this post is about the general situation with existing (and even more, with non-existing) opensource encoders (for both audio and video) and not about the flaws in those encoders.

When I was developing my toy(ish) VP6 encoder, I got questions about it and general encoding technologies from many people (as in “one, two, many” but still it’s above the expected amount of zero). And then I remembered the reasons why there was no opensource VP6 encoder before I wrote one.

The main problem with opensource encoders is the shortage of talented people and the lack of an environment to grow more of them. As a result, those who know how to write or tune encoders keep doing that or move on to some other stuff (nowadays most of those who remain active seem to be sucked into rav1e) and those who don’t know how to write encoders have a very hard time learning how it should be done.

Basic VP7 encoder: cutting corners

February 17th, 2022

I’ve more or less completed a basic failure of a VP7 encoder. Now it can encode inter-frames using various sizes of motion compensation (and the resulting file can be decoded too!). There’s still a lot of work to be done (rate control, MB features and multiple frame analysis) but there are some things that I can talk about as well.

As I wrote in the previous post, there are too many coding parameters to try, so if you want to have a reasonable encoding done in reasonable time you need to cut corners (or “employ heuristics” if you want to sound more scientific) in various ways. So here I want to present what has been done in my encoder to make it run fast.

Looking at Zig programming language

February 5th, 2022

Back when I wrote my rant about C++ and its bad influence on C (yeah, about three quarters of a year ago) I got recommendations to look at Zig, and I finally decided to download the 0.9.0 release and play with it. Long story short: it’s an interesting language with some good ideas but not the one I’d use.

VP7 encoding: general principles

January 30th, 2022

It is not that hard to write a simple encoder (as I’m going to demonstrate); the problem is to make it good (and that’s where I’ll fail). Until that time I’m going to explain what I’m doing and how/why it should be done.

Starting work on VP7 encoder

January 26th, 2022

As I said in the previous post, currently I don’t have any features or decoders to add to NihAV (because Paul has not finished his work on the Bink2 decoder yet) besides some encoders that nobody will use.

Thus I decided to work on something more advanced than VP6 that allows me to play with more advanced features (like EPZS motion estimation, per macroblock quantiser selection and such). For that I needed to pick some codec probably based on H.264 and there was not that much to pick from:

  • ITU H.264—first and foremost, I don’t have a properly working decoder for it (yet?); second, the format is too complex so just thinking about writing all those SPSes, PPSes and various lists discourages me from even considering to write an encoder for it;
  • RealVideo 3 or 4—tempting but that means I also need to write a RealMedia muxer and the format lacks dquant (in theory it’s supported, in practice it’s never happened). Maybe one day I’ll make my own NihAV-Really? encoder for RV3+Cooker but not today;
  • Sorenson SVQ3—same problems essentially;
  • VP8—Mike has done it over a decade ago;
  • VX—this is a custom game codec which is simplified (even quantiser is rather implicit).

The rough roadmap is the following:

  1. make intra-only encoder that encodes picture somehow;
  2. improve it to select the best whole macroblock prediction mode;
  3. add 4×4 prediction mode and make it select the best mode;
  4. add inter-frame support along with motion compensation;
  5. add EPZS-based motion estimation;
  6. introduce rough motion search for group of frames to determine good golden frame candidate and the macroblocks that should be coded with higher quality;
  7. actually code those macroblocks with higher quality using MB features;
  8. use trellis-based quantiser search for improved coding of frames;
  9. speed it up by using various heuristics instead of brute force search for coding parameters.
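
To give an idea of what step 5 involves, here is a heavily simplified sketch in the spirit of EPZS: test a few predicted motion vectors first, then refine the best one with a small diamond pattern, stopping early once the match is good enough. The `Frame` type, the 8×8 block size, the integer-pel search and the single fixed `threshold` are all assumptions for illustration; a real EPZS implementation uses several predictor sets and adaptive thresholds.

```rust
/// Greyscale frame with row-major pixels (hypothetical helper type).
struct Frame { w: usize, h: usize, data: Vec<u8> }

const BLK: usize = 8;

/// Sum of absolute differences between the block at (bx, by) in `cur` and the
/// block displaced by `mv` in `refp`. Returns None if the displaced block
/// falls outside the reference frame. Assumes the block fits inside `cur`.
fn sad(cur: &Frame, refp: &Frame, bx: usize, by: usize, mv: (i32, i32)) -> Option<u32> {
    let rx = bx as i32 + mv.0;
    let ry = by as i32 + mv.1;
    if rx < 0 || ry < 0 || rx as usize + BLK > refp.w || ry as usize + BLK > refp.h {
        return None;
    }
    let (rx, ry) = (rx as usize, ry as usize);
    let mut sum = 0u32;
    for y in 0..BLK {
        for x in 0..BLK {
            let a = cur.data[(by + y) * cur.w + bx + x] as i32;
            let b = refp.data[(ry + y) * refp.w + rx + x] as i32;
            sum += (a - b).unsigned_abs();
        }
    }
    Some(sum)
}

/// Predictor-first search with diamond refinement and early termination.
fn epzs_search(cur: &Frame, refp: &Frame, bx: usize, by: usize,
               predictors: &[(i32, i32)], threshold: u32) -> ((i32, i32), u32) {
    // Stage 1: zero vector plus candidate vectors (e.g. from neighbouring blocks).
    let mut best_mv = (0, 0);
    let mut best = sad(cur, refp, bx, by, best_mv).unwrap_or(u32::MAX);
    for &mv in predictors {
        if let Some(d) = sad(cur, refp, bx, by, mv) {
            if d < best { best = d; best_mv = mv; }
        }
    }
    // Stage 2: small diamond around the best vector until it stops improving
    // or the match is already good enough.
    const DIAMOND: [(i32, i32); 4] = [(-1, 0), (1, 0), (0, -1), (0, 1)];
    while best > threshold {
        let centre = best_mv;
        for &(dx, dy) in &DIAMOND {
            let mv = (centre.0 + dx, centre.1 + dy);
            if let Some(d) = sad(cur, refp, bx, by, mv) {
                if d < best { best = d; best_mv = mv; }
            }
        }
        if best_mv == centre { break; }
    }
    (best_mv, best)
}
```

When one of the predictors is already close to the true motion, the search ends after a handful of SAD computations instead of scanning a whole window, which is the main appeal of this family of methods.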

This should take some time…

Looking at SMUSH/INSANE formats

January 6th, 2022

As some of you might know, I have had an interest in various game formats for decades (and that’s one of the reasons that brought me into opensource multimedia). And those formats include videos from LucasArts games as well. Actually SMUSH is not an ordinary video format but rather a sub-engine where both audio and video are objects (background, sprites, main audio, sound effects) that should be composed into a final audiovisual experience. INSANE is the next iteration of the engine that became simpler (coding full frames, only one object per frame, just one codec, 16-bit video instead of paletted one) but it shares a lot in common with its predecessor.

As expected, the main source of information about those comes from ScummVM (and one of their developers made smushplay to play the files in a stand-alone manner). There’s a personal story related to that: one Cyril Zorin meddled with some formats from LucasArts games and wanted to add INSANE support (for Grim Fandango but it’s the same for all other games using the SNM format) to FFmpeg; sadly he could not stomach the review process there (which it is hard to blame him for) and abandoned it. Some time later I picked it up, added support for SMUSH codecs 37 and 47 (the ones used in adventure games) and got it committed; years later Paul B. Mahol (of future Bink2 decoder fame) added VIMA audio support to it.

Yet there are more games out there and some of them use different codecs, for which details were not previously known. So I decided to finally reverse engineer them to see how the development went. My implementation in NihAV is far from being perfect (there are many issues with transparency and coordinates) but it can decode all files I could encounter with very few exceptions.

So, let’s look at the codecs used for image coding. Audio is rather boring: there’s a very old PCM format in SAUD chunks, scaled PCM audio in IACT chunks, and VIMA is IMA ADPCM with 2-7 bits per step.
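
For readers unfamiliar with IMA ADPCM, here is a sketch of one decoding step for the common 4-bit variant. It does not model VIMA’s variable 2-7 bit codes or its bitstream layout, which are not described here; the `ImaState` type and function name are made up for the example, while the two tables are the standard IMA ones.

```rust
/// Standard IMA ADPCM step sizes.
const STEP_TABLE: [u16; 89] = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658,
    724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024,
    3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
];

/// Step index adjustment, selected by the magnitude bits of the code.
const INDEX_TABLE: [i32; 8] = [-1, -1, -1, -1, 2, 4, 6, 8];

/// Decoder state: last output sample and current position in STEP_TABLE.
struct ImaState { predictor: i32, index: usize }

/// Decodes one 4-bit code (sign bit plus three magnitude bits) into a sample.
fn decode_nibble(state: &mut ImaState, nibble: u8) -> i16 {
    let step = STEP_TABLE[state.index] as i32;
    // diff = (magnitude + 0.5) * step / 4, computed with shifts only.
    let mut diff = step >> 3;
    if nibble & 4 != 0 { diff += step; }
    if nibble & 2 != 0 { diff += step >> 1; }
    if nibble & 1 != 0 { diff += step >> 2; }
    if nibble & 8 != 0 { diff = -diff; }
    state.predictor = (state.predictor + diff).clamp(-32768, 32767);
    // Large codes grow the step size, small codes shrink it.
    let idx = state.index as i32 + INDEX_TABLE[(nibble & 7) as usize];
    state.index = idx.clamp(0, 88) as usize;
    state.predictor as i16
}
```

Using more or fewer bits per code changes how finely each difference is quantised, but the adaptive step idea stays the same.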
