On fruity MVS codec

I could be writing about the RedHat video encoder I just finished, or about my work on REing DiVID1x at Paul’s request, but this was earlier in my queue.

Apparently the iVNC protocol has an option to use a custom iCodec. Since I was asked to look at it, here are my preliminary findings (a more detailed bitstream description will follow eventually).

So a packet starts with a byte giving the payload type (0 – intra frame, 1 – inter frame, 2 – custom quantisation matrices for luma and chroma, 64 bytes each). The rest of the data follows after that.
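A minimal sketch of splitting a packet by its type byte. The names `PayloadType`, `split_packet` and `parse_quant_matrices` are made up for illustration, and storing the luma matrix before the chroma one is my assumption rather than something verified against the bitstream.

```python
from enum import IntEnum


class PayloadType(IntEnum):
    INTRA = 0
    INTER = 1
    QUANT_MATRICES = 2


def split_packet(packet: bytes):
    """Return (payload type, remaining payload bytes)."""
    if not packet:
        raise ValueError("empty packet")
    # PayloadType() raises ValueError for an unknown type byte.
    return PayloadType(packet[0]), packet[1:]


def parse_quant_matrices(payload: bytes):
    """Split a type-2 payload into two 64-byte matrices.

    Assumption: luma comes first, chroma second.
    """
    if len(payload) < 128:
        raise ValueError("quantisation matrix payload too short")
    return payload[:64], payload[64:128]
```

Frame payloads (types 0 and 1) would be handed on to the frame decoder, while a type-2 payload just updates the matrices used by later DCT tiles.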

Intra frames code a series of tiles, with tile metadata and actual tile content stored in separate parts. Frame data starts with two DCT quantisers followed by the 24-bit big-endian size of the metadata part, then comes the metadata itself, and finally the tile data.
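The intra frame layout can be sketched roughly like this. I have not stated the quantiser size above, so one byte per quantiser is an assumption here, and `split_intra_frame` is a hypothetical helper name.

```python
def split_intra_frame(data: bytes):
    """Split intra frame data into (quantisers, metadata, tile data).

    Assumed layout: two 1-byte DCT quantisers, a 24-bit big-endian
    metadata size, the metadata, then the tile data.
    """
    if len(data) < 5:
        raise ValueError("intra frame header too short")
    quantisers = (data[0], data[1])
    meta_size = int.from_bytes(data[2:5], "big")
    meta_end = 5 + meta_size
    if meta_end > len(data):
        raise ValueError("metadata size exceeds frame data")
    return quantisers, data[5:meta_end], data[meta_end:]
```

Keeping the two parts as separate byte strings means the metadata and the tile content can each get their own reader.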

Tile metadata codes a 3-bit tile type and the number of tiles having that type (0000–1110 mean 1–15 tiles, 11110 means the next 8-bit value plus sixteen, 111110 means the next 15-bit value plus sixteen, 111111 means the next 22-bit value plus sixteen). Tiles can be of the following types:

  • white tile – tile is completely filled with white;
  • last match – previous(?) tile is copied;
  • upper match – tile above(?) is copied;
  • black and white – one bit per pixel (0 – black, 1 – white);
  • two-colour tile – almost the same but with two colours transmitted first (8-bit luma and 6-bit chroma values);
  • DCT – tile data is coded with ProRes-like DCT;
  • match tile – re-paint the most recently used DCT tile;
  • cached tile – re-paint DCT tile with 16-bit index from LRU cache.
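Here is a rough sketch of reading one metadata entry (tile type plus run length). It rests on several assumptions: bits are read MSB-first, the type comes before the count, the type values 0-7 follow the order of the list above, and the longest count prefix is 111111 so that the four cases form a proper prefix code. `BitReader`, `read_tile_count` and `read_tile_run` are names invented for this example.

```python
class BitReader:
    """Minimal MSB-first bit reader (bit order is an assumption)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# Assumption: numeric type values follow the order of the list above.
TILE_TYPES = (
    "white", "last match", "upper match", "black and white",
    "two-colour", "DCT", "match tile", "cached tile",
)


def read_tile_count(br: BitReader) -> int:
    prefix = br.read(4)
    if prefix < 15:              # 0000..1110 -> 1..15 tiles
        return prefix + 1
    if br.read(1) == 0:          # 11110 -> 8-bit value + 16
        return br.read(8) + 16
    if br.read(1) == 0:          # 111110 -> 15-bit value + 16
        return br.read(15) + 16
    return br.read(22) + 16      # 111111 -> 22-bit value + 16


def read_tile_run(br: BitReader):
    tile_type = br.read(3)
    return TILE_TYPES[tile_type], read_tile_count(br)
```

A decoder would call `read_tile_run` repeatedly on the metadata part until all tiles of the frame are accounted for, pulling the actual content for each tile from the tile data part.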

Inter frames start with two bytes telling the number of coded chroma coefficients, and the rest is a single bitstream with a 2-bit tile type followed by whatever tile content is stored. Tile types are: skip, DCT, match tile, and cached tile. The first type should be obvious, the rest are probably the same as in intra frames.
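For inter frames, a sketch of peeling off the header and reading a tile type could look as follows. The two header bytes are kept raw since their exact interpretation is not pinned down yet; MSB-first bit order and type values following the order in the text are assumptions, and the function names are hypothetical.

```python
# Assumption: numeric values follow the order given in the text.
INTER_TILE_TYPES = ("skip", "DCT", "match tile", "cached tile")


def split_inter_frame(data: bytes):
    """Return (two raw header bytes, remaining bitstream)."""
    if len(data) < 2:
        raise ValueError("inter frame header too short")
    return data[:2], data[2:]


def inter_tile_type(bitstream: bytes, bit_pos: int = 0) -> str:
    """Read the 2-bit tile type at bit_pos (MSB-first, assumed)."""
    value = 0
    for i in range(bit_pos, bit_pos + 2):
        value = (value << 1) | ((bitstream[i >> 3] >> (7 - (i & 7))) & 1)
    return INTER_TILE_TYPES[value]
```

Since the tile content follows each type in the same bitstream, a real decoder has to decode (or at least skip) that content before it can find the next tile type.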

Also, frame data is supposed to end with "mvs\0", but I guess this matters only for people trying to write a compatible encoder (or checking that the data was decoded correctly).
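Checking the end marker is trivial; a sketch, assuming the four bytes sit at the very end of the frame data (`strip_trailer` is a made-up name):

```python
TRAILER = b"mvs\x00"


def strip_trailer(frame_data: bytes) -> bytes:
    """Verify the trailing "mvs\\0" marker and return the data without it."""
    if not frame_data.endswith(TRAILER):
        raise ValueError("frame data does not end with the mvs marker")
    return frame_data[:-len(TRAILER)]
```

A decoder could also simply ignore the marker, so treating its absence as an error is a design choice for validation purposes.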

See, it’s a rather simple codec, so hopefully I’ll clarify some things (like cache behaviour, YUV coefficients and the actual DCT bitstream format), document it at The Wiki and move on to something else.
