A New Box

March 22nd, 2008

I’ve finally got an x86 box, thanks to the immeasurable efforts of Henning Norén, who sent it to me.

You can see it in this poor-quality photo. The box contains a Pico-ITX-based computer with more memory than all my other boxes together.

My New Box

The only bad part is that although I’ve assembled it (and, I believe, assembled it properly), I can’t make it start up. The DC converter (a small board to the right) produces the right voltages, so it’s not a power failure. Hopefully I will resolve this issue within a week, install Linux and resume my work on codec development.

RV30/40 – status

March 4th, 2008

Just for curious people who really want to know what’s happening with the rv30/40 decoder implementation for libavcodec.

I have implemented all the main parts of the decoder, including loop filters, but some of the finer details are missing, like the parameters that should be passed to the loop filters or motion vector prediction. This results in a jerky picture when B-frames are present (and they often are) and dirty trails behind moving objects. See for yourself.

Screenshot of decoder performance

Some example of my rv40 decoder work

Currently the work on this decoder is stalled. In order to fix bugs I have to verify decoded data against the reference decoder, and that’s not easy. It takes a whole night to get the needed debug data for 70 frames of a 320×240 video on my ThinkPad 390. And it takes a lot of space too, considering I have only about a hundred megabytes of free disk space there.

I want to obtain a small (I don’t have enough space to fit a standard desktop), low-power (less than 20-30 W of power consumption; power blackouts are quite common here) x86-based computer. I know they exist in many variations, but it’s next to impossible to buy one here.

Well, I will finish both encoders. Eventually. Especially if I have enough content to test them with – most files I’ve met (including those on samples.mplayerhq.hu) are either Japanese TV recordings (anime often bears Chinese subtitles) or The Simpsons with a crappy translation into Russian (for example, “you rock” was translated as if it was “you are a rock”). Oh, there are also some movie trailers, but I fear the need to watch them won’t motivate even Mike.

If you are curious why I chose that shot: I believe it features the character the main MPlayer server is named after.

Checkpoint

January 6th, 2008

Let’s see what was done with my plans for 2007:

  • Make some lossless audio decoders – no luck (well, I’ve helped a bit to get the Monkey’s Audio decoder included in ffmpeg)
  • Implement missing VC-1 Simple/Main profile features – no luck (at least WMV3 Complex Profile is almost completely supported)
  • Implement VC-1 Advanced profile interlaced mode – no luck
  • Help with some other projects – DCA implementation and maybe even finish RE for Xan v4 – partially done
  • Write JPEG-2000 decoder – no luck, some people convinced me to write RV3/4 decoder instead and I’m still waiting for the specs 🙁

So it was an extremely unlucky year for fulfilling goals.

What has been done instead:

  • Musepack SV8 support
  • Helped with some projects (DCA decoder, Monkey’s Audio decoder, few game formats)
  • Some WMV3/VC-1 fixes
  • A bit of work on RV3/4 decoder

I hope this year will be better.

Some Thoughts on the Future FFmpeg Audio API

November 22nd, 2007

After some discussions on IRC that I participated in, I’d like to present some ideas here for further discussion.

  1. The audio API should mirror the video API as much as possible. Currently a decoder outputs 16-bit native-endian audio into a raw buffer.
  2. Introduce audio formats. I’d like to be able to decode an old 8-bit codec into bytes, newer 24-bit audio into 32-bit ints, floats for other codecs if they need it, etc.
  3. Planar format for multichannel codecs. It will simplify downmixing and channel reordering. (This is not my idea but it is worth mentioning.)
  4. Swscaler-like structure for format handling and negotiation between audio filters.
  5. Block-based audio processing. Audio should be processed as a whole number of blocks with a fixed number of samples (like video is processed by frames and rarely by slices). Why not always a single block? Because some formats produce chunks with multiple blocks to decode (Monkey’s Audio, Musepack SV8) and some have blocks so small that processing them one at a time causes too much overhead (most speech codecs and (AD)PCM). This is just a bit stricter than the current scheme.
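To make points 3 and 5 a bit more concrete, here is a minimal sketch. It is written in Python for brevity rather than the C a real FFmpeg API would use, and every function name in it is hypothetical: none of this exists in FFmpeg. It only shows why planar data makes downmixing and channel reordering simple, and what "a whole number of fixed-size blocks" means.

```python
def interleaved_to_planar(samples, channels):
    """Split interleaved samples [L0, R0, L1, R1, ...] into one list per channel."""
    if len(samples) % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return [samples[ch::channels] for ch in range(channels)]


def downmix_to_mono(planes):
    """Average all channel planes sample by sample (integer arithmetic)."""
    return [sum(frame) // len(planes) for frame in zip(*planes)]


def split_into_blocks(plane, block_size):
    """Cut one channel plane into fixed-size blocks; a partial block is an error."""
    if len(plane) % block_size != 0:
        raise ValueError("plane does not hold a whole number of blocks")
    return [plane[i:i + block_size] for i in range(0, len(plane), block_size)]


# Hypothetical input: four stereo samples in interleaved order.
stereo = [10, 20, 30, 40, 50, 60, 70, 80]

planes = interleaved_to_planar(stereo, 2)   # one list per channel
swapped = planes[::-1]                      # channel reordering is just reordering planes
mono = downmix_to_mono(planes)              # downmix works plane against plane
blocks = split_into_blocks(mono, 2)         # one decoded chunk carrying two blocks
```

With planar data, reordering channels is a matter of reordering whole planes, while interleaved data would have to be touched sample by sample.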

Now, who wants to implement this?

X8intra is there!

November 10th, 2007

Now we have X8intra frames support in ffmpeg! Mike has already expressed his joy on his blog. I think anime fans who tried to play WMV3 in AVI will also be glad.

Why was this not done earlier? Well, sheer disgust. The person who invented this scheme should be either fired or promoted to M$ CTO. Here is the list of reasons:

  • It is used for coding some keyframes, so it cannot be skipped
  • Design is totally unlike anything standard
  • You should perform bitexact decoding. If your DCT produces slightly different results or you forget about loop filtering then you won’t be able to decode the picture properly. Hey, that’s utter crap by _any_ standard
  • It has made it to WMV3 too.

To put it mildly, X8Intra is an illegitimate offspring of JPEG and some early H.264 draft. It is mainly Huffman-coded 8×8 DCT-transformed blocks with spatial prediction and loop filtering. It does not have macroblocks like decent codecs. Spatial prediction has several directions and relies on previously decoded blocks, and the bits read depend on that. I have not seen anything with such a bad bitstream-transform dependency (in other codecs you can decode coefficients first and then perform image reconstruction, but not here). X8Intra excrementum est (pardon my Latin).

Still, I am very grateful to the person named “someone” who did this. Otherwise I would have had to clean this cesspool myself.

Multimedia-unrelated news

November 9th, 2007

I just had to post this – our Philharmonic presented its own harpsichord. Several years ago, when I first visited it, I listened to a concert with a harpsichord, but it was a borrowed one and there has been nothing comparable since.

The presentation went well and we enjoyed different music – from sonatas by Handel and Telemann and Johann Sebastian Bach concertos to Mozart to jazz improvisations and modern Ukrainian music (well, when the composer plays one of the instruments himself I consider it modern).

Looking forward to further listening (with hope that it will take less than a couple of years of waiting).

A Book on Multimedia

November 8th, 2007

I presume those interested in multimedia coding have heard of “Data Compression: The Complete Reference” by David Salomon. Personally I consider this book very good, but maybe we should write our own book concentrating on multimedia only. Why? I have not seen a book that covers video (and audio) compression in depth: it is either merely outlined (as in most books on general data compression) or the book is solely dedicated to some standard (MPEG usually).

I fondly remember this book: it’s quite outdated, but at least it covers many codec, container and even implementation issues (unfortunately, sound only)!

My proposal for book outline:

  • General multimedia concepts (pixels, samples, PCM, DCT)
  • Audio compression
    • Simple time-domain codecs (DPCM, ADPCM)
    • Complex time-domain codecs (lossless mostly)
    • Speech codecs
    • MDCT-based codecs and friends
    • How to write a fast decoder and good encoder (or otherwise)
  • Image Compression
  • Video compression
    • Lossless coding
    • Game video codecs (who will write this?)
    • Modern standard and non-standard codecs
    • Implementation tips and tricks
    • Known codecs (implementation-wise) overview
  • Containers
    • Why making codecs dependent on custom container is idiotic 🙂
    • File-based containers
    • Streaming containers

And I know where to get information ;-). Well, let’s see if this catches on.

Audiovisual debugger

November 4th, 2007

I had never thought about FFplay that way, but it struck me today that its waveform display is one of the best ways to debug audio.
Why?


FFplay

One of C.P.E. Bach’s Wurttemberg sonatas (a small excerpt, really)

Because it gives you these advantages:

  1. Noise hurts your eyes less than your ears
  2. Some inaudible artifacts (like DC bias) are easy to spot
  3. Clipping and volume changes are easy to spot too
  4. Stereo differences are easy to find
  5. It may give you some aesthetic pleasure 😉
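For comparison, the same things the eye catches on a waveform can be checked numerically. This is only a rough sketch with hypothetical helper names and made-up sample values, assuming signed 16-bit samples; it has nothing to do with how FFplay itself works.

```python
def dc_bias(samples):
    """Mean sample value; a clean signal should hover around zero."""
    return sum(samples) / len(samples)


def count_clipped(samples, bits=16):
    """Count samples sitting at the limits of the signed integer range."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return sum(1 for s in samples if s <= lowest or s >= highest)


def max_stereo_difference(left, right):
    """Largest absolute difference between the two channels."""
    return max(abs(l - r) for l, r in zip(left, right))


# Made-up data: a signal shifted up by 1000, and one with two clipped samples.
biased = [1000, 1200, 800, 1000]
clipped = [32767, -32768, 0, 100]

bias = dc_bias(biased)
clip_count = count_clipped(clipped)
stereo_gap = max_stereo_difference([0, 10, 20], [0, 12, 15])
```

Numbers like these are handy in a regression test, but they will not show you where in the stream things went wrong the way a picture does.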

I must also add that most audio players have visualizers, but they lack the simplicity and usability of this clean 640×480 waveform rendering.

Another game format

October 17th, 2007

I remember playing the game “Lost Vikings” by Silicon & Synapse (which was later renamed Blizzard).


Lost Vikings 2
Lost vikings (again) and additional creatures. Endgame scene.

Who knew that it had a sequel? It was done by Beam Software and looks less appealing to me than the original. But it has cutscenes and audio in its own format, which makes it a bit more interesting. A contributor, who wishes to remain anonymous, sent me his description of this format, so I was able to implement a demuxer and decoder for its video, and it will be committed soon (I hope). Details of this format are already here.

Flowers

October 9th, 2007

As a follow-up to the theme set by Michael, here are a few random shots of flowers (taken in Western Ukraine, several thousand kilometers from my home).
