Final info about WMV3/VC-1 variants.

November 30th, 2006

Known fourccs:

  • WMV3
  • WMVP
  • WVP2
  • WMVA
  • WVC1

Known meaning of flags:

  • RES_SM: really two flags; RES_SM&2 indicates 411 interlaced mode and RES_SM&1 indicates sprite mode (or so it seems). These flags are mostly used by WMVP and WVP2.
  • RES_X8: I-frames may be packed with the X8 algorithm. This also means that each I-frame has an additional bit in the frame header signalling whether the frame is coded with X8 or with the standard coding scheme. Used in WMV3 Complex Profile.
  • RES_FASTTX: still unknown; it somehow interferes with motion compensation (at least it does not affect the bitstream).
  • RES_TRANSTAB: each macroblock has its own DC/AC table index instead of one global index stored in the frame header.
  • RES_RTM: old version of WMV3 with a different P-frame coding mode.

X8 frames use their own Huffman codes (code lengths are stored as nibbles somewhere in the bitstream), and blocks appear to be decoded in this order:

    for (y = 0; y < mb_height * 2; y++) {   /* upper bound presumably mb_height*2 */
        for (x = 0; x < mb_width * 2; x++) {
            decode luma block
            if (x & y & 1)
                decode both chroma blocks
        }
    }
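To make the traversal above concrete, here is a small C sketch that only counts how many luma and chroma blocks such a walk would visit. It decodes nothing; the function name `x8_walk_blocks`, the `BlockCount` struct and the `mb_height * 2` bound on the outer loop are my assumptions for illustration, not something taken from a real decoder.

```c
/* Hypothetical helper: counts blocks visited by the X8 traversal.
 * Assumes the outer loop runs to mb_height * 2, mirroring the x loop. */
typedef struct {
    int luma;    /* number of luma blocks visited */
    int chroma;  /* number of chroma blocks visited (U and V counted separately) */
} BlockCount;

static BlockCount x8_walk_blocks(int mb_width, int mb_height)
{
    BlockCount n = { 0, 0 };
    int x, y;

    for (y = 0; y < mb_height * 2; y++) {
        for (x = 0; x < mb_width * 2; x++) {
            n.luma++;              /* every position holds one luma block */
            if (x & y & 1)         /* both x and y odd: last luma block of a macroblock */
                n.chroma += 2;     /* both chroma blocks follow it */
        }
    }
    return n;
}
```

With this ordering the chroma blocks of a macroblock come only after its bottom-right luma block, so each macroblock still contributes four luma and two chroma blocks in total, but the luma blocks are visited row by row across the whole frame rather than macroblock by macroblock.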

Some news

November 29th, 2006

Ever heard of a codec with fourcc WMVP? That is another Windows Media variant designed for slideshows, video albums, etc. Why did I get interested in it? WMV3 Complex Profile (aka Ye Olde Undocumented and Incompatible Advanced Profile) uses exactly the same coding method to store I-frames when the flag RES_X8 is set (if it is not set, then a VC-1 Main Profile decoder can decode those movies).

While the standard decoder outputs mostly garbage, some blocks are decoded correctly, so I hope it’ll be possible to add support for both WMVP and WMV3 CP.

Fraps: path of trial and error

November 5th, 2006

I’ve just committed fully working (I hope) support for newer Fraps videos. While RE’ing that stuff I made two big mistakes: I misunderstood the bit reader as reading only 31 bits from each 32-bit word (which is not correct), and I did not realize that Fraps version 4 is not identical to Fraps version 2. Both are corrected now.

And now here is a new category that Mike forgot to add to MultimediaWiki (hint, hint).

Supported screen capture codecs:

  • TechSmith codec (aka Ensharpen)
  • VMware VMnc
  • DOSBox ZMBV
  • Fraps
  • CamStudio

Unsupported screen capture codecs:

  • M$ Screen codecs MSS1 and MSS2
  • MSU Screen Capture Lossless Codec (SCLS)

Interesting – all those unsupported codecs use arithmetic coding (and, as I suspect, don’t work with palettized images).

Some notes on Fraps

October 29th, 2006

Here is what I currently know about Fraps v2 (v1 is supported by FFmpeg):

  • all data is stored in 32-bit little-endian words
  • the first word holds the codec version and the flag 0x8000000, which marks a delta frame
  • the second word is zero
  • the next word is the ‘FPSx’ magic
  • the next three words are offsets in the frame (from position 8) to the packed YUV planes
  • each plane starts with 256 words: frequencies for each possible value
  • the rest of the data is bits packed in words
  • each bit looks like an input to a state machine which outputs a pair (value, skip) and changes its internal state
  • the rest still needs to be figured out
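The header layout listed above could be parsed along these lines. This is only a sketch under stated assumptions: the names `FrapsHeader`, `rl32` and `fraps2_parse_header` are made up, the version is assumed to sit in the low byte of the first word, the magic is assumed to be stored as the bytes 'F','P','S','x' in that order, and the planes are assumed to come in Y, U, V order. The flag value is copied as written in the list.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define FRAPS_DELTA_FLAG 0x8000000u   /* value as given in the notes above */

typedef struct {
    uint32_t version;       /* assumption: low byte of the first word */
    int      is_delta;      /* non-zero if the delta-frame flag is set */
    uint32_t plane_off[3];  /* absolute offsets of the three planes (Y, U, V assumed) */
} FrapsHeader;

/* Read one 32-bit little-endian word. */
static uint32_t rl32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 0 on success, -1 if the buffer does not match the described layout. */
static int fraps2_parse_header(const uint8_t *buf, size_t size, FrapsHeader *h)
{
    uint32_t w0;
    int i;

    if (size < 24)                        /* 6 header words */
        return -1;
    w0          = rl32(buf);
    h->version  = w0 & 0xff;
    h->is_delta = (w0 & FRAPS_DELTA_FLAG) != 0;
    if (rl32(buf + 4) != 0)               /* second word must be zero */
        return -1;
    if (memcmp(buf + 8, "FPSx", 4) != 0)  /* magic */
        return -1;
    for (i = 0; i < 3; i++) {
        uint32_t off = rl32(buf + 12 + 4 * i);
        /* offsets count from position 8; each plane needs its 256-word table */
        if (off > size - 8 || size - 8 - off < 256 * 4)
            return -1;
        h->plane_off[i] = off + 8;
    }
    return 0;
}
```

The bit-level state machine that follows each frequency table is left out on purpose, since that part is still unknown.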

Recent developments

October 29th, 2006

Here is a quick list of what has been done recently:

  • LZW decoding in TIFF
  • 16-bit grayscale limited support (decoding PNG, read/write PGM and JPEG-LS but the latter is a bit buggy)
  • Some progress on Intel Music Codec decoder

Some progress

October 15th, 2006

Ah, I still remember those days when FFmpeg got one new decoder every week or two.

Mike once mentioned here that one coder was searching for ideas to implement. My situation is rather the opposite: I can hardly manage to fulfill all the requests from other people. Recently I’ve added two image decoders to FFmpeg, Targa (.tga) and TIFF. Both of them were done by request. The same was true of Worms Video.

The WavPack decoder was written on my own initiative, but there are already some people asking for MPEG-4 ALS and Monkey’s Audio decoders. While the latter is not possible to implement right now, I’ll implement both an MPEG-4 ALS decoder and an encoder.

Variety of lossless audio codecs

September 23rd, 2006

There are currently 14 lossless audio codecs mentioned on the MultimediaWiki page (look here for further links):

  • Proprietary (Apple Lossless, Meridian Lossless Packing, Real Lossless, WMA Lossless)
  • Closed source (LA, LPAC, LTAC, OptimFROG, RK Audio)
  • Open source (Bonk, FLAC, MPEG-4 ALS, Monkey’s Audio, Shorten, TrueAudio, WavPack)

FFmpeg currently has decoders for Bonk, FLAC, Shorten, TrueAudio and Apple Lossless. So at least MPEG-4 ALS, Monkey’s Audio and WavPack decoders can still be added.

I will work on the WavPack decoder and on ALS (I hope the standard will appear soon). What about Monkey’s Audio? Yes, it’s popular, but it poses the following implementation difficulties:

  1. It has incredibly large frame sizes (a frame may hold more than one million samples) while competitors stick to around 64k or less (hence the compression gain for MA). The current FFmpeg design cannot handle such frames.
  2. The source code is a mess: for almost every action there are at least several if(ver >= …) or if(ver< ...). The format is too unstable for me.

Well, I still hope it will be implemented some day.

VC-1 Simple/Main Profile: 93,8%

September 21st, 2006

OK, now B-frames for Simple and Main Profile work. Only some exotic features are left to implement, such as spatial downsampling and sync markers. And a lot of RE’ing is still needed to implement decoding of old WMV3 P-frames and Complex Profile.

As for advanced profile, B-frames are still not implemented.

Codec Completeness: Report

September 7th, 2006

Today I’ve committed (I hope) the last patches to the VMware Video decoder (videos looked rather funny without the mouse pointer).

So, here are the stats for decoders:

  • VMware Video: 98% (some blocks are unknown but don’t interfere with decoding process)
  • VC-1 Simple/Main Profile: 75% (downscaled frames won’t be supported and B/BI frames are in progress)
  • VC-1 Advanced Profile: 40% (interlaced mode is not supported and BDU coding is not parsed – samples, anyone?)

A Work for the Weekend

September 5th, 2006

I had a free Monday, so I decided to spend some time on a small codec and picked the VMware codec, used by the VMware emulator to capture screen output. As Alex discovered, this is merely a recorded session of the RFB protocol (used in VNC applications) but slightly mangled.
My further investigation shows that bytes 1-2 in big-endian order contain the number of rectangles updated, and if that value is equal to 8 then the first chunk is a ServerInitialization struct with the marker ‘WMVi’ inserted before the pixel format data. Otherwise the frame may contain one or several FramebufferUpdate messages with standard HexTile encoding.
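The check described above can be sketched in C as follows. The names `vmnc_classify` and the `VMNC_*` constants are hypothetical, and "bytes 1-2" is read here as zero-based offsets 1 and 2, which is an assumption on my part.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical result codes for the frame classification. */
enum {
    VMNC_TOO_SHORT = -1,  /* not enough data to read the counter */
    VMNC_INIT      = 0,   /* first chunk is the ServerInitialization struct */
    VMNC_UPDATES   = 1    /* one or more FramebufferUpdate messages */
};

/* Reads the 16-bit big-endian value at offsets 1..2 (zero-based, assumed)
 * and classifies the frame; the raw value is returned through *count. */
static int vmnc_classify(const uint8_t *buf, size_t size, unsigned *count)
{
    unsigned n;

    if (size < 3)
        return VMNC_TOO_SHORT;
    n = (unsigned)buf[1] << 8 | buf[2];   /* big-endian: high byte first */
    if (count)
        *count = n;
    return n == 8 ? VMNC_INIT : VMNC_UPDATES;
}
```

A real decoder would go on to parse the rectangles themselves; this sketch stops at the decision point.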
I’ve committed the decoder to FFmpeg and I hope to find more samples and clarify some details about it (there are seven markers from ‘WMVd’ to ‘WMVj’, but what they mark is known only for ‘WMVi’; palette mode is not tested either).
OOPS, I’ve just discovered three more samples at MPHQ and they are not decoded correctly, so I have some more work to do.