With the addition of the VBLE decoder I thought once again about codecs and how they are written.
Lossless Video Codecs
There are two approaches:
- Take a frame and apply one or two general compression schemes to it. That can be zlib, RLE+zlib, or motion compensation from the previous frame plus zlib.
- Perform spatial prediction (usually from the left neighbour or a median predictor) and add some coding for the residues. HuffYUV, Lagarith, UtVideo, VBLE, LOCO, FFV1, whatever.
Lots of people try it, find that their codec is faster or compresses better than HuffYUV, and release the results. Usually those codecs don't live long, and the only bad thing about them is that they were released to the public in the first place.
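The prediction-plus-residues scheme can be sketched in a few lines. This is a toy Python illustration, not any particular codec's code; the MED formula itself is the well-known one from LOCO-I/JPEG-LS, also used by FFV1, and the function names are made up here:

```python
def med_predict(left, top, topleft):
    # MED ("median") predictor from LOCO-I/JPEG-LS, also used by FFV1.
    return min(max(left, top), max(min(left, top), left + top - topleft))

def encode_row(row, prev_row):
    # Turn 8-bit samples into residues modulo 256; out-of-frame
    # neighbours are treated as zero.
    residues = []
    for x, v in enumerate(row):
        left = row[x - 1] if x > 0 else 0
        top = prev_row[x] if prev_row else 0
        topleft = prev_row[x - 1] if prev_row and x > 0 else 0
        residues.append((v - med_predict(left, top, topleft)) & 0xFF)
    return residues

def decode_row(residues, prev_row):
    # Mirror of encode_row: rebuild each sample from its predictor.
    row = []
    for x, r in enumerate(residues):
        left = row[x - 1] if x > 0 else 0
        top = prev_row[x] if prev_row else 0
        topleft = prev_row[x - 1] if prev_row and x > 0 else 0
        row.append((med_predict(left, top, topleft) + r) & 0xFF)
    return row
```

The residues then go through Huffman, range coding, or variable-length codes, which is where the codecs listed above mostly differ from each other.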
Lossy Video Codecs
These codecs are usually more complex, so there are fewer of them. But there are more ways to create one:
- Lossily quantise raw data or DCT output. Every self-respecting company producing frame-grabbing cards has written such a codec.
- Take a draft of some standard codec and base your work on it. That's how we got Window$ Media, R3al and Off2 video codecs.
- Use another approach to compression, like vector quantisation, binary or quad-tree decomposition, object-oriented representation (though this one is mostly used in screen-capturing codecs), etc.
The main problem with these codecs is achieving good compression without much hassle. For example, the libavcodec MPEG-4 encoder may be the best around, but (like Soviet machinery) one has to work really hard to find out which parameters need to be set to which values to get good compression. That's the reason people often choose Xvid instead.
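The first approach above, quantising DCT output, is easy to sketch. Here is a minimal Python illustration with a naive 1-D DCT; real codecs use fast 8x8 2-D transforms, per-coefficient quantisation matrices, and entropy coding on top, none of which is shown here:

```python
import math

def dct_1d(block):
    # Naive orthonormal DCT-II, O(n^2); fine for illustration only.
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, x in enumerate(block))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_1d(coeffs):
    # Matching inverse (DCT-III with the same orthonormal scaling).
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
        out.append(s)
    return out

def quantise(coeffs, q):
    # The lossy step: larger q discards more precision.
    return [round(c / q) for c in coeffs]

def dequantise(qcoeffs, q):
    return [c * q for c in qcoeffs]
```

A roundtrip through `dequantise(quantise(dct_1d(block), q), q)` followed by `idct_1d` gives back the block with an error that grows with `q`; picking how `q` varies per coefficient and per frame is exactly the hard tuning problem mentioned above.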
Lossless Audio Codecs
There is one approach to those: add lots of crazy filtering (usually several chained filters) and equally crazy coding of the residues. There you have it. Simple filters mean faster compression; complex filters mean slightly better compression with significantly longer compression times.
The last paragraph of the lossless video codecs section applies to audio as well.
Lossy Audio Codecs
Those appear rarely because it's very hard to satisfy everybody's ears. Thus (IMO) development is mostly limited to speech codecs. And there's Xiph, of course.