Archive for the ‘RealVideo’ Category

RV: a small update

Sunday, November 23rd, 2008

Hereby I declare that my RV40 decoder changed its status from “Well, it’s better than nothing” to “Good enough”. While there are still problems with chroma and jitter in B-frames due to wrong motion vector prediction, luma decoding is bit-exact on I- and P-frames.
I hope to weed those problems out and have decoding enabled in FFmpeg before next year. Maybe RV30 too.

For those who ask for specs on RealVideo:

???????RealVideo

I hope the message is clear enough.

RV3/4 decoders present state: stalled again

Tuesday, November 4th, 2008

I’ve been very busy with things outside FFmpeg, yet I’ve managed to do something on the RV3/4 decoders too:

  • Found and fixed an old bug with quantisation for DC coefficients.
  • Cleaned up the RV4 loop filter a bit.
  • Fixed a chroma MC bug in the RV3 decoder.
  • B-frame motion vectors are now closer to reality in RV3.

What is missing:

  • RV3 loop filter
  • correct RV3 motion vector calculation
  • RV4 motion compensation incompatibilities

The main problem is that I don’t quite understand why it works the way it does and (in some parts) how it works. Hopefully it will be clearer the next time I look at it.

A bit more

Friday, August 8th, 2008

With the low-pass filter, my AAC encoder is more or less feature-complete. Of course there’s still room for improvement, but it’s pretty fine now. I’d like to submit it for review, but it depends on some parts of the AAC decoder, which is still under review :-(. So I don’t have much to do until then.

So I switched to the last GSoC task and hacked again at the RV40 loop filter. Well, the filter invoking pattern is almost there and I’ve fixed several bugs in the actual filtering code. But it’s not there yet. Maybe in a month it will be, if the AAC encoder doesn’t take all my time again.

News + Extra

Sunday, August 3rd, 2008

AAC front: to compete with other encoders I have to implement a low-pass filter. Benjamin suggested a Butterworth filter, so I will try it next week. Hopefully that will be the last big feature to do.

RV front: it looks like the deblocking pattern is generated by comparing motion vectors: if the difference for subblocks is greater than 3, the edge between them is scheduled for loop filtering. Don’t expect a working loop filter implementation too soon though; I still have to deal with the AAC encoder and it’s more important.
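A minimal sketch of that comparison, assuming one motion vector per 4×4 subblock and a 16-bit mask with one bit per subblock. The type and function names, the bit layout and the choice to compare each vector component separately are all assumptions made for illustration, not the decoder’s actual code. Edges on the macroblock border would need the neighbouring macroblock’s vectors and are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical motion vector type: one vector per subblock. */
typedef struct { int16_t x, y; } MotionVector;

/* Nonzero if either component differs by more than 3. */
static int mv_diff_gt_3(MotionVector a, MotionVector b)
{
    return abs(a.x - b.x) > 3 || abs(a.y - b.y) > 3;
}

/*
 * Build two masks for one macroblock of 4x4 subblocks stored row by row.
 * Bit (row * 4 + col) of *vertical_edges marks the edge between that
 * subblock and its left neighbour; the same bit of *horizontal_edges
 * marks the edge between it and its top neighbour.
 */
static void mv_deblock_pattern(const MotionVector mv[16],
                               uint16_t *vertical_edges,
                               uint16_t *horizontal_edges)
{
    uint16_t v = 0, h = 0;

    assert(mv && vertical_edges && horizontal_edges);
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++) {
            int i = row * 4 + col;
            if (col > 0 && mv_diff_gt_3(mv[i], mv[i - 1]))
                v |= (uint16_t)(1u << i);
            if (row > 0 && mv_diff_gt_3(mv[i], mv[i - 4]))
                h |= (uint16_t)(1u << i);
        }
    }
    *vertical_edges   = v;
    *horizontal_edges = h;
}
```

With all vectors equal nothing is marked; moving a single inner subblock’s vector by 4 marks the four edges around that subblock, while moving it by 3 marks none.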

Extra: I’ve finally decided to buy an ASUS Eee. It was an easy thing to do – there’s only one model (Eee 701 4G with Win XP installed) for about the same price of four hundred bucks (maybe $450 in greedy shops). So the first thing I did with it was installing Linux and tearing off that stupid “Designed for Windows” label (which was a surprisingly easy thing to do and left no marks on the laptop surface).

Now here are my complaints about Ubuntu Eee (I don’t have a USB DVD drive and Xandros hasn’t worked from a USB flash drive for me): it requires some hacking of the system configuration to make it work (like shutting down properly), which I can live with, but the braindead thing is that gcc is installed (why?) without any development headers or libraries, so you can’t compile even a “Hello, world!” program. Both of those issues are resolved, so I just need to make this toy more useful to me 🙂

Turtles All the Way

Sunday, July 20th, 2008

Just when I thought I had fully understood the RV4 loop filter. It uses both the coded block pattern and some other pattern. I thought it was the CBP from the previous frame, but it turned out to be a special deblocking pattern calculated for each block in interframes after decoding that block. That calculation is easy – it just selects a set of subblocks to check, compares some values and, if the difference is less than 3, sets a bit in the deblock pattern. Now the only thing left to find out is where those values come from.

Again and again on RV40 loop filter

Tuesday, July 15th, 2008

I’ve mostly understood how RV40 loop filter works.

So as not to forget the main principles, I’m documenting them here (this blog was created for such things after all).

  • CBPs from the left, top and bottom neighbours are used in the filter, and if the frame is an interframe then the CBPs for those blocks in the reference frame are used as well.
  • There are two actual filter types – weak and strong, both are described in H.264 drafts.
  • Edges in a subblock are filtered in the following order: bottom, left, top. The top edge is filtered only for the subblocks in the first row.
  • There are many filter parameters passed: a dither argument (for the strong filter, depends on subblock position), two thresholds taken from ClipTable, a threshold taken from alpha_tab, a threshold taken from beta_tab and the same value multiplied by 3 or 4 (four is for Y plane filters in not extremely big pictures).
  • The problem was to determine which ClipTable parameters should be used, as it has an additional dimension, more on it below.
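The edge order from the list above can be written down as a tiny helper. This is only a sketch: the enum, the function name and the reading of “first row” as row 0 of the subblocks inside a macroblock are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical names for the three edges of a subblock that get filtered. */
typedef enum { EDGE_BOTTOM, EDGE_LEFT, EDGE_TOP } Edge;

/*
 * Write the edges to filter for a subblock in the given row (0..3) into
 * out[], in filtering order, and return how many were written.
 * Bottom and left are always visited; top only for the first row.
 */
static int subblock_edge_order(int row, Edge out[3])
{
    int n = 0;

    assert(row >= 0 && row < 4);
    out[n++] = EDGE_BOTTOM;
    out[n++] = EDGE_LEFT;
    if (row == 0)
        out[n++] = EDGE_TOP;
    return n;
}
```

A caller would loop over the returned edges and pick the weak or strong filter for each one; that choice is not modelled here.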

In total, seven values are taken from ClipTable:

  1. ClipTable[0][current block quantiser]
  2. ClipTable[2][current block quantiser]
  3. ClipTable[2][global quantiser set in header]
  4. ClipTable[x][current block quantiser]
  5. ClipTable[x][top neighbour quantiser]
  6. ClipTable[x][left neighbour quantiser]
  7. ClipTable[x][bottom neighbour quantiser]

That x value is 2 for the intra block types and P-frame interblock with DCs coded separately, 1 otherwise.
As I understand it, ClipTable[x][current block quantiser] is used by default and the other values are used for corner cases (subblock on the side of the edge is uncoded, belongs to another macroblock or does not exist at all).
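The seven lookups and the choice of x can be sketched as follows. Everything named here is hypothetical: the table is passed in as a flat array, its shape (3 rows of 32 quantiser values) is an assumption, and the real table contents are not reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed table shape: 3 rows, one column per quantiser value. */
enum { CLIP_ROWS = 3, CLIP_COLS = 32 };

/* Hypothetical macroblock classes, only as fine as the rule for x needs. */
typedef enum {
    MB_INTRA,
    MB_P_INTER_SEPARATE_DC,  /* P-frame inter block with DCs coded separately */
    MB_OTHER
} MbClass;

/* Quantisers of the current block, its neighbours and the frame header. */
typedef struct { int current, top, left, bottom, global; } Quantisers;

/* x is 2 for intra types and P inter blocks with separate DCs, 1 otherwise. */
static int clip_table_row(MbClass c)
{
    return (c == MB_INTRA || c == MB_P_INTER_SEPARATE_DC) ? 2 : 1;
}

static int clip_at(const uint8_t *table, int row, int q)
{
    assert(row >= 0 && row < CLIP_ROWS && q >= 0 && q < CLIP_COLS);
    return table[row * CLIP_COLS + q];
}

/* Fill out[] with the seven values in the order they are listed above. */
static void clip_lookups(const uint8_t *table, MbClass c,
                         const Quantisers *q, int out[7])
{
    int x = clip_table_row(c);

    out[0] = clip_at(table, 0, q->current);
    out[1] = clip_at(table, 2, q->current);
    out[2] = clip_at(table, 2, q->global);
    out[3] = clip_at(table, x, q->current);
    out[4] = clip_at(table, x, q->top);
    out[5] = clip_at(table, x, q->left);
    out[6] = clip_at(table, x, q->bottom);
}
```

Which of the last four values is actually applied to a given edge (the corner cases mentioned above) is not modelled; the sketch only gathers the candidates.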

I should look at H.264 loop filter description (thanks to all who sent me the pointers to the book by Iain Richardson), it seems suspiciously similar.

Again on RV40 loop filter

Thursday, July 10th, 2008

While work on the AAC encoder is slowly progressing (now it’s mostly psychoacoustics left to do and maybe HE-AAC if somebody convinces me), I’m looking at side tasks to make my life a bit more colourful.
For now those tasks are writing SSE2 optimization for Monkey’s Audio decoder (and that is the first piece of SIMD assembly I’ve ever written) and working on RV40 loop filter.
To give people false hope, it’s more understandable by now. Only one function argument is not obvious. And Dark Shikari, you were wrong – RV40 is 99.5% like the H.264 draft (not the 99% you said), as the loop filter is suspiciously similar to the H.264 one.

RV: present state

Sunday, May 18th, 2008

If you are interested in what’s going on with my RV decoder from GSoC 2007 then here are your answers.

What works:

  • RV30 decoding mostly works
  • RV40 decoding mostly works
  • Pictures are quite recognizable

What needs to be resolved:

  • RV40 loop filter
  • RV30 loop filter (a bit easier)
  • RV30 motion vectors in B-frames (sometimes they are a bit jumpy)
  • RV30 chroma problems (colours are always moving to the upper left corner of the frame – incorrect rounding?)
  • RV30 slice uniting problem (some split slices should be united by the decoder – at least I know how and when to do this)

If you want to help with the loop filter then look at the loop filter work scheme (SVG, ~128Kb) and give your proposals on how it works.
Legend (macroblock is 4×4 subblocks, no borders as they will ruin this scheme):

  • numbers at the top and left edge – macroblock numbers
  • black lines – subblock edges where loop filtering took place
  • hex number at the top left corner of macroblock – coded block pattern, it’s red for intra-type macroblocks and for P macroblocks with DC coeffs coded separately
  • blue square – coded subblock

Any suggestions (and pointers to the information about H.264 loop filtering explained clearer than in standard) are welcome.

I’d like to finish it before starting my work on AAC encoder…

BTW, you can use ffmpeg-rv.patch from soc/rv40 repository to enable RV30/40 decoding in ffmpeg.

RV30/40 – status

Tuesday, March 4th, 2008

Just for curious people who really want to know what’s happening with the RV30/40 decoder implementation for libavcodec.

I have implemented all the main parts of the decoder including loop filters, but some of the finer details are missing, like the parameters that should be passed to the loop filters or motion vector prediction. This results in a jerky picture when B-frames are present (and they are often present) and dirty tails after moving objects. See for yourself.

[Screenshot: an example of my RV40 decoder’s output]

Currently the work on this decoder is stalled. In order to fix bugs I have to verify decoded data against reference decoder and that’s not easy. It takes a whole night to get the needed debug data for 70 frames from 320×240 video on my ThinkPad 390. And it takes a lot of space too considering I have about a hundred megabytes of free disk space there.

I want to obtain a small (I don’t have enough space to fit a standard desktop), low-power (less than 20-30 W power consumption, power blackouts are quite common here) x86-based computer. I know they exist in many variations, but it’s next to impossible to buy one here.

Well, I will finish both decoders. Eventually. Especially if I have enough content to test them with – most files I’ve met (including samples.mplayerhq.hu) are either Japanese TV recordings (anime often bears Chinese subtitles) or Simpsons with crappy translation into Russian (for example, “you rock” was translated as if it was “you are a rock”). Oh, there are also some movie trailers but I fear the need to watch them won’t motivate even Mike.

If you are curious why I chose that shot: I believe it features the character the main MPlayer server is named after.

Some details on RV30/40

Wednesday, September 12th, 2007

RV8 (or RV30) is close to earlier H.264 drafts and Sorenson Video 3, because both use a single variable-length code (Golomb codes in the case of SVQ3, something special in RV30/RV40) to represent macroblock information (type, intra prediction modes). RV40 uses this code only to represent motion vectors and the number of macroblocks to skip; macroblock type and intra prediction modes are coded with variable-length codes chosen from a set of them depending on context.
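For readers who have not met Golomb codes, here is a minimal sketch of decoding the plain unsigned Exp-Golomb code in the form H.264 uses. It is not the special code used by RV30/RV40, which is not reproduced here, and the BitReader type and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t size_bits;  /* number of valid bits in data */
    size_t pos;        /* index of the next bit to read */
} BitReader;

static int read_bit(BitReader *br)
{
    assert(br->pos < br->size_bits);
    int bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/*
 * Unsigned Exp-Golomb: count n leading zero bits, consume the '1' that
 * ends them, then read an n-bit suffix. The value is 2^n - 1 + suffix.
 */
static uint32_t read_ue_golomb(BitReader *br)
{
    int zeros = 0;
    uint32_t suffix = 0;

    while (read_bit(br) == 0) {
        zeros++;
        assert(zeros < 32);  /* longer prefixes do not fit in 32 bits */
    }
    for (int i = 0; i < zeros; i++)
        suffix = (suffix << 1) | (uint32_t)read_bit(br);
    return ((uint32_t)1 << zeros) - 1 + suffix;
}
```

The bit strings 1, 010, 011 and 00100 decode to 0, 1, 2 and 3, so short codes go to small values, which is why a single code of this kind can serve for many syntax elements.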

Here are the main differences from H.264:

  1. Different bitstream. Special codes for elements and its own substitute for Golomb codes.
  2. Intra prediction types are specified only for luma subblocks, chroma subblocks use some of them for prediction too.
  3. Intra prediction modes are slightly different and some of them require additional context (down left neighbour).
  4. Bidirectionally predicted blocks in B-frames do not use motion vectors from the previous/next frame in motion vector prediction while using them in motion compensation.
  5. Intra prediction for intra blocks in interframes is performed even when neighbouring blocks are not intra blocks.

Main differences between RV30 and RV40:

  1. Bitstream syntax and different codes (RV30 is easier in this matter).
  2. RV30 does not have 8×16 and 16×8 motion compensation modes (while 8×8 mode exists).
  3. Different motion vector prediction algorithm.
  4. B-frames in RV30 do not contain some variant of bidirectionally predicted blocks which RV40 has.

And now here is the list of things I didn’t like in RV30/40:

  1. The slice header does not contain the number of macroblocks coded in this slice. While it saves a whopping 6-13 bits per slice (and frames usually have less than a dozen slices, usually two or three), it gives unnecessary pain to the implementor.
  2. Vertical left intra prediction method uses down left neighbour pixels in calculation for one insignificant pixel (it’s insignificant because it does not affect further intra prediction).
  3. Motion vector prediction is a bit complicated too.
  4. And, of course, the lack of good documentation.