Starting yet another failure of an encoder

As anybody could’ve guessed from the Cook encoder, I would not want to stop at that and would do some video encoder to accompany it. So here I declare that I’m starting work on a RealVideo 4 encoder (again, making it public should prevent me from chickening out).

I can salvage some parts from my VP7 encoder, but two things (besides the bitstream coding) make this one different enough: slices and B-frames. Historically RealMedia packets are limited to 64kB and they should not contain partial slices (grouping several slices or even frames in one packet is fine though), so the frame has to be split into slices during coding. And while Duck codecs re-invent B-frames so that they are still coded in sequence, RealVideo 4 has honest B-frames, and those require the frames to be reordered before encoding.
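To make these two constraints concrete, here is a minimal sketch in Python. It is not the planned encoder: the function names, the exact 65535-byte limit (the text above only says 64kB) and the decision to ignore packet header overhead are all my assumptions for illustration. `coding_order` holds B-frames back until the following I- or P-frame has been emitted, and `pack_slices` groups whole slices into packets without ever splitting one.

```python
from typing import List

# Assumed limit: the post only says "64kB"; header overhead is ignored here.
MAX_PACKET_SIZE = 65535


def coding_order(frame_types: str) -> List[int]:
    """Map display order to coding order.

    frame_types is a string of 'I', 'P' and 'B' in display order.
    Each B-frame is delayed until the next I/P frame (its future
    reference) has been placed in the output.
    """
    order: List[int] = []
    pending_b: List[int] = []
    for idx, ftype in enumerate(frame_types):
        if ftype == "B":
            pending_b.append(idx)
        else:
            order.append(idx)        # reference frame goes first
            order.extend(pending_b)  # then the B-frames that precede it
            pending_b.clear()
    # Trailing B-frames have no future reference; this sketch simply
    # appends them (a real encoder would likely change their type).
    order.extend(pending_b)
    return order


def pack_slices(slice_sizes: List[int], limit: int = MAX_PACKET_SIZE) -> List[List[int]]:
    """Group slice indices into packets without splitting any slice."""
    packets: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, size in enumerate(slice_sizes):
        if size > limit:
            # The encoder would have to re-code the frame with more slices.
            raise ValueError(f"slice {idx} ({size} bytes) exceeds the packet limit")
        if current and used + size > limit:
            packets.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        packets.append(current)
    return packets
```

For example, `coding_order("IBBPBBP")` gives `[0, 3, 1, 2, 6, 4, 5]`, and `pack_slices([30000, 30000, 30000, 5000])` gives `[[0, 1], [2, 3]]`. The `ValueError` branch is the reason the frame has to be split during coding rather than afterwards: an oversized slice cannot be fixed at the packetising stage.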

So while the core is pretty straightforward (try different coding modes for each macroblock, pick the best one, write bitstream), it gives me enough opportunity to try different aspects of H.264 encoding that I had no reason to care about previously. Maybe I’ll try to see if automatic frame type selection makes sense, maybe I’ll experiment with more advanced motion search algorithms, maybe I’ll try better heuristics for e.g. quantiser selection.
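The "try different modes, pick the best one" core can be sketched as follows. This is only an illustration under my own assumptions: the cost is the common Lagrangian form (distortion plus lambda times bits), which the text above does not commit to, the mode names are made up, and the distortion is measured against the bare prediction instead of the full reconstruction a real encoder would use.

```python
from typing import Dict, List, Tuple


def ssd(a: List[int], b: List[int]) -> int:
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_best_mode(
    src: List[int],
    candidates: Dict[str, Tuple[List[int], int]],
    lam: float,
) -> Tuple[str, float]:
    """Return (mode, cost) with the lowest cost = SSD + lam * bits.

    candidates maps a hypothetical mode name to a pair of
    (predicted block, estimated number of bits to code that mode).
    """
    best_mode = ""
    best_cost = float("inf")
    for mode, (pred, bits) in candidates.items():
        cost = ssd(src, pred) + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Toy 4-sample "macroblock" with three candidate modes.
src_block = [10, 12, 14, 16]
modes = {
    "intra_dc": ([13, 13, 13, 13], 4),
    "inter": ([10, 12, 14, 17], 12),
    "skip": ([9, 11, 13, 15], 1),
}
```

With `lam = 2` the cheap `skip` mode wins (cost 6), while with `lam = 0` only distortion counts and `inter` wins (cost 1). That dependence on lambda is also where quantiser selection heuristics would plug in.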

There should be a lot to keep me occupied (but I expect to spend even more time evading that task, either for lack of inspiration or because the sheer amount of work to do demotivates me).

4 Responses to “Starting yet another failure of an encoder”

  1. Anon says:

    If all existing codecs suck, maybe you can amuse yourself by designing one that doesn’t. Remember one-CD MPEG-4 ASP movie encodes? They were awful, of worse quality than VHS. (My personal opinion). Surely it’s possible now to design a better codec which can still play on a Pentium III, for example. After all, MPEG codecs are not the fastest in software.

    I know that this suggestion is only slightly less ridiculous than a suggestion to try to prove Fermat’s last theorem in your spare time. Just throwing out an idea.

  2. Kostya says:

    Nah, I’ve decided not to design my own formats. And I suspect a somewhat smarter (i.e. not just transmitting codebook indices) vector quantisation codec may fit the requirements, the same way as Cinepak did on all those consoles. It’s just the encoding would be extremely slow even on modern hardware (and most codecs are targeted to be more symmetric).

  3. Anon says:

    > And I suspect a somewhat smarter (i.e. not just transmitting codebook indices) vector quantisation codec may fit the requirements, the same way as Cinepak did on all those consoles.

    Well, a VQ-only codec will also suffer from blocking, probably worse than DCT.
    Also, Cinepak didn’t even have motion compensation. DCT+deblocking is unfortunately good on movie content.

    > It’s just the encoding would be extremely slow even on modern hardware (and most codecs are targeted to be more symmetric).

    I think asymmetry is getting progressively worse anyway. H.264 (software) encoding is practical, H.265 is less so, and H.266/AV1 just isn’t. Yet decoding didn’t get 100 times slower from H.264 to H.266 like encoding.

  4. Kostya says:

    That’s why I said “smarter”: it may employ motion compensation and do vector quantisation of DCT coefficients (maybe even with escape coding of rare but very distinct blocks in order to both preserve quality and keep codebooks cleaner). In either case I lack the qualities for designing a good format.

    The problem with modern codecs is combinatorial explosion: adding a new feature or coding tool increases the search space proportionally to the number of modes it introduces, and even with heuristics the total number of variants to try grows by at least several percent. What I meant was inherently asymmetric methods, i.e. restoring a block from a codebook is trivial compared to performing any kind of vector quantisation and searching for a matching block. Most of the operations in modern codecs (like block transform or coding) cost the same or about the same in the encoder and the decoder; the encoder spends most of its time selecting which coding tool and mode to apply.
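The asymmetry discussed in the last comment can be shown with a toy vector quantiser. This is a minimal sketch, not any real codec: the function names and the brute-force squared-error search are assumptions made for illustration, and codebook training (which is more expensive still) is left out entirely.

```python
from typing import List, Sequence, Tuple

Block = Tuple[int, ...]


def vq_decode(indices: Sequence[int], codebook: Sequence[Block]) -> List[Block]:
    """Decoding is one table lookup per block."""
    return [codebook[i] for i in indices]


def vq_encode(blocks: Sequence[Block], codebook: Sequence[Block]) -> List[int]:
    """Encoding compares every block against every codebook entry."""
    indices: List[int] = []
    for block in blocks:
        best = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(block, codebook[i])),
        )
        indices.append(best)
    return indices


codebook = [(0, 0, 0, 0), (10, 10, 10, 10), (0, 10, 0, 10)]
blocks = [(1, 9, 0, 11), (9, 9, 10, 12)]
indices = vq_encode(blocks, codebook)   # [2, 1]
restored = vq_decode(indices, codebook)
```

The decoder does work proportional to the number of blocks, while this encoder does work proportional to blocks times codebook size times block size, and that is before any codebook has to be built.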