A Codec Family Proposal

There are enough general-purpose standardised codecs; there’s even the VPx family for those who want more. But there are not enough niche codecs with free/open specifications.

One such niche codec would be an intermediate codec. It’s suitable for capturing and quick editing of video material. The main requirements are a modest compression ratio and fast processing (scalability is a plus too). Maybe SMPTE VC-5 will be the answer, maybe Ogg Chloe, maybe something completely different. Let’s discuss it some other time.

Another niche codec that desperately needs an open standard is a screen video codec. Such a codec may also be used for recording webcasts, presentations and the like. And here I’d like to discuss a whole family of such codecs based on the same coding principles.

It makes sense to make the codec fast by employing multithreading where possible. That’s why a frame should be divided into tiles that are neither too large nor too small, maybe 192×128 pixels or so.
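To make the tiling idea concrete, here is a minimal sketch in Python. It assumes an 8-bit single-plane frame stored as `bytes`, and it uses zlib purely as a stand-in for a real tile coder; the function names (`tile_rects`, `encode_tile`, `encode_frame`) are made up for illustration and are not part of any proposed bitstream.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

TILE_W, TILE_H = 192, 128  # tile size suggested in the text


def tile_rects(width, height, tile_w=TILE_W, tile_h=TILE_H):
    """Return (x, y, w, h) rectangles in raster order; edge tiles may be smaller."""
    return [
        (x, y, min(tile_w, width - x), min(tile_h, height - y))
        for y in range(0, height, tile_h)
        for x in range(0, width, tile_w)
    ]


def encode_tile(frame, stride, rect):
    """Placeholder tile coder (assumption): deflate the tile's rows with zlib."""
    x, y, w, h = rect
    rows = b"".join(
        frame[(y + r) * stride + x:(y + r) * stride + x + w] for r in range(h)
    )
    return zlib.compress(rows)


def encode_frame(frame, width, height, workers=4):
    """Encode every tile independently on a thread pool, keeping raster order."""
    rects = tile_rects(width, height)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: encode_tile(frame, width, r), rects))
```

With these numbers a 1920×1080 frame splits into 10×9 = 90 tiles, the bottom row being only 56 pixels tall. Since no tile depends on another, the same split works for decoding as well.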

Each tile should be coded independently, preferably with its distinct features coded separately too. It makes sense to separate tile data into smooth features (like gradients and real-life pictures) and sharp transitions (like text and UI elements). Let’s call the former the natural layer and the latter the synthetic layer. We’ll also need a mask to tell which layer to use for each pixel. Using these main blocks and employing different coding methods, we can make a whole family of codecs.
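The relation between the two layers and the mask can be sketched in a few lines of Python. This only shows the bookkeeping: how the mask itself gets decided is the part that needs research, so here it is simply given as input. The names `split_layers` and `compose` and the `fill` value for unused pixels are my assumptions for illustration.

```python
def split_layers(pixels, mask, fill=0):
    """Split tile pixels into (natural, synthetic) layers.

    mask[i] is truthy when pixel i belongs to the synthetic layer.
    Pixels not used by a layer are replaced with `fill` (an assumption;
    a real encoder would pick whatever value compresses best).
    """
    natural = [fill if m else p for p, m in zip(pixels, mask)]
    synthetic = [p if m else fill for p, m in zip(pixels, mask)]
    return natural, synthetic


def compose(natural, synthetic, mask):
    """Decoder side: take each pixel from the layer the mask selects."""
    return [s if m else n for n, s, m in zip(natural, synthetic, mask)]


# Round trip on a tiny 1-D "tile": two text-like pixels inside a gradient.
pixels = [10, 12, 255, 0, 14]
mask = [0, 0, 1, 1, 0]
natural, synthetic = split_layers(pixels, mask)
restored = compose(natural, synthetic, mask)
```

As long as both layers are coded losslessly at the pixels the mask selects, `restored` equals `pixels`; in practice the natural layer would be lossy and only the synthetic layer and the mask would need to be exact.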

Here’s a list of example codecs (with random FOURCCs assigned):

  • J-B0: employ JPEG for natural layer coding and GIF/PNG for mask and synthetic layer coding;
  • J-B1: employ Snow for natural layer coding and FFV1 for synthetic layer coding;
  • J-B2: employ JPEG-2000 for natural layer coding, JBIG for mask coding and something like a PPM modeller for synthetic layer coding;
  • J-BG: employ WebP for natural layer coding and WebP LL for synthetic layer coding.
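One way to look at the list above is as a table that maps each FOURCC to a choice of coder per block. A rough Python sketch follows; the `Profile` type and its field names are hypothetical, the strings are just labels rather than real library bindings, and a mask coder of `None` means the list does not name one for that variant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """Coder choice for each building block of one family member."""
    natural: str
    synthetic: str
    mask: Optional[str] = None  # None: not specified in the list above


FAMILY = {
    "J-B0": Profile(natural="JPEG", synthetic="GIF/PNG", mask="GIF/PNG"),
    "J-B1": Profile(natural="Snow", synthetic="FFV1"),
    "J-B2": Profile(natural="JPEG-2000", synthetic="PPM modeller", mask="JBIG"),
    "J-BG": Profile(natural="WebP", synthetic="WebP LL"),
}


def describe(fourcc):
    """Return a one-line summary of the coders a given FOURCC selects."""
    p = FAMILY[fourcc]
    mask = p.mask if p.mask is not None else "unspecified"
    return f"{fourcc}: natural={p.natural}, synthetic={p.synthetic}, mask={mask}"
```

A decoder built this way would read the FOURCC, look up the profile and dispatch each tile's layers to the matching coders, so adding a new family member means adding one row.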

As one can see, it’s rather easy to build such a codec since all the coding blocks are there and only the natural/synthetic layer separation might need a bit of research. I see no reason why, say, VLC couldn’t use it for recording and streaming a desktop for, e.g., a virtual meeting.

2 Responses to “A Codec Family Proposal”

  1. kurosu says:

    An HEVC extension for screen content is planned. How practical it will end up being is another matter.

  2. ethernode says:

    I completely agree. Right now the only CPU-friendly intermediate codec is good old MJPEG. I also have relatively interesting results with MPEG-4 Part 2, but it does not scale well (especially against noise or at bitrates > 5 Mbit/s). x264 is robust against noise and high framerates, but it eats so much CPU that using it as an intermediate codec is a bit stupid (even all-I). Not to mention that codecs like H.264 are basically designed to fail if the picture is really complex (either the encoding complexity makes it impossible to encode or it looks like crap).

    But enter the 1080p60 world, choose a proper torture video, and even MJPEG begins to fail on a modern 3 GHz i7 processor… So yeah, multi-threaded JPEG encoding can fix that, but what then? Is hardware acceleration the only suitable way to encode an intermediate codec (e.g. high-bitrate H.264 or VP8)?