Archive for the ‘RealVideo’ Category

RealVideo6: Technical Description

Saturday, November 10th, 2018

This week I’ve started reading the binary specification for RealVideo 6 again (yes, I had a post on RV6 technical description a year ago) since now I have a goal of implementing a decoder for it. So here I’ll try to document the format with as many details as possible.

Small update: obviously I’m working on a decoder so when it’s more or less ready I’ll document it on The Wiki.

RMHD: REing Has Started!

Saturday, November 10th, 2018

So, after a long series of delays and procrastination (the common story in NihAV development), I’ve finally stared to work on adding support for RMHD to my project. There are two things to be done actually: add support for RMHD container (so that I can demux actual files) and obviously support for RealVideo 6 itself.

As for RMHD container support, this was quite easy. Some person has given me a file about a year ago so I could look at it and improve my existing RealMedia demuxer. As expected the format is about the same but a bit different: magic word in the header now is “.RMP” instead of “.RMF”, some chunks were upgraded to version 2 (previously they all were either version 0 or version 1) with some additional fields shoved in the middle (or maybe they’ve upgraded them to 64-bit). The WTFiest change is how audio header was written: binary fields tell it’s header version 5 (it is) but text identifier says “.rm4” which obviously comes from version 4 and the provided header size is too short compared to the conventional RMVB and actual header data. At last after some hacks added to my RealMedia demuxer I was able to decode AAC track from the file so only the video part remains.

For RV6 I have the binary specification with a lot of debug symbols left. Even though I’ve made a good progress on it, I expect CEmpeg to add the opensource decoder for it first. After all, if you search for strings in the from the official site you’ll see the function names that will make almost anybody recognizing them cringe:

C_hScale8To15(short*, int, unsigned char const*, short const*, short const*, int)
C_yuv2planeX(short const*, int, short const**, unsigned char*, int, unsigned char const*, int)
SSE2_yuv2planeX(short const*, int, short const**, unsigned char*, int, unsigned char const*, int)
SSE2_hScale8To15(short*, int, unsigned char const*, short const*, short const*, int)
yuv2planeX_8_c_opt(short const*, int, short const**, unsigned char*, int, unsigned char const*, int, int)

In case you belong to the blessed majority who do not know what those functions are, these are part of libswscale, a library for frame colourspace conversion and scaling licensed under LGPL. Which means pieces of it were borrowed and adapted for (YUV-only) frame scaling in RV6 decoder and thus it source code should be available under LGPL. And imperfect English messages in the same specification (“not support” consistently occurs throughout it) hint that it was quite probably Chinese developers responsible for this. Since CEmpeg has both the majority of libswscale developers and people who speak Chinese, it should be no problem for them to obtain the codes. Meanwhile I’ll continue my work because NIH in NihAV is there for a reason (that also reminds me I should port NAScale to Rust eventually).

Anyway, the actual description of RealVideo 6 deserves a separate post so I’ll dedicate the following post to it (but who would read that anyway?).

NihAV: Some Progress to Report!

Friday, August 24th, 2018

Finally the large chunk is finished: NihAV has finally got support for RealVideo 3 and 4!

Since I’ve learned a great deal more about codecs since the last time I wrote RealVideo 3/4 decoder (and specifications for both were leaked—they have mistakes but still clarify some things), I was able to write a new decoder that also seems to reconstruct frames better.

Some words on the design: I’ve split it into several parts as usual—common RV3/4 code, RV3/4 DSP, RV3 bitstream parser, RV3 DSP and RV4 bitstream parser and DSP. That’s the approach I’ve been using before and I’ll probably use it in future decoders as well. The only more or less interesting thing is how I did weighted motion compensation: instead of temporary buffer I allocate 16×16 frame that I use for storing temporary results and which is used later to average results (since motion compensation routines in RealVideo 3 and 4 differ while weighted averaging is the same it makes sense to split it into separate operation).

And now for the juicy part: benchmarks and performance. I’ve tested one of the RealVideo 4 trailers (namely swordfish.rmvb) and avconv -threads 1 -cpuflags 0 decodes it in 15 seconds, nihav-tool needs almost 25.

RMHD: A More Detailed Look

Saturday, October 14th, 2017

Two years ago I’ve made some predictions about upcoming RealMedia HD and little I knew that it was finished in 2015! So finally I’ve had a look at it and here are some details.

First of all, it seems to be China-oriented since the only version with RMHD support I was able to find (even from US site) was Chinese version of RealMedia player (stream in peace…). And since it’s China it bundles CEmpeg libraries with RMHD support built-in. Good luck obtaining the modified source though.

The actual decoder was in a separate library though with the usual interface for RealVideo decoders and obviously I could not resist looking inside.

It turns out that RMHD corresponds to RealVideo 11 or RV60. Either they thought it’s too advanced to be merely RV50 or NGV was intended to be RV50 but they’ve buried it in the same grave with MPEG-3 and such.

Anyway, it’s time for juicy technical details.

RV60 is based on ITU H.EVC or its draft. It is oriented on multi-threading decoding and they have a lot of crap cut out and thrown away and I fully approve that. It’s the problem with many standardised codecs: you have so much flexibility in configuring coding parameters that you have to invent special objects to signal coding parameters for the following group of frames unless you want to waste 10% of bitrate on them in every slice header; and then you invent profiles because not all of the features can be supported by existing hardware (for example, because they’ve not been added to the standard yet). RV60 has rather simple frame header and coding units are always size 64 and they seem to comprise all three planes instead of coding planes separately.

The biggest disappointment is motion compensation of course. RV2 had 1/2-pel MC, RV3 had 1/3-pel MC, RV4 had 1/4-pel MC. I obviously expected 1/5-pel MC for RV5 but instead they’ve stopped on 1/4-pel MC (with the bog standard 1 -5 20 20 -5 1 filter for luma and bilinear interpolation for chroma too).

Spatial (aka intra) prediction is very close to H.EVC as well.

Transforms are present in 4×4, 8×8 and 16×16 sizes that are some poor integer approximations of DCT.

And now for the juiciest part: coefficients coding! Coefficients are coded with with lots of context-adaptive codebooks, for intra/inter, luma/chroma and various quantisers. And since it’s RealVideo and not Thor (and its Norwegian developer does not seem to work any more on it) all codebooks are static (total over 32k entries) and stored in compact and obfuscated form. Compact means that the description has only code lengths packed into nibbles and obfuscated means they were XORed with a string containing name and mail of the guy who probably generated those codebooks (this reminds me a bit of Cineform HD and Sierra AGI resources) and it makes me shout “RealLy!? What were you trying to achieve with that?“. I’ll look closer at actual coefficient coding a bit later (or significantly later—when I have a desire to do so) but so far it looks like the coefficients are coded in 4×4 subblocks but in general following H.EVC coding scheme.

Deblocking seems to be present and depends on block size. SAO is not present it seems (and I don’t miss it either).

There seems to be only three frame types (no RADL-, WTFL- or AltGr-frames) but the frames may have multiple references.

Overall, my predictions turned out to be mostly true. I should be surprised, I guess.

I’m yet to see any real samples too but this makes it one of the best H.EVC-based codecs (better than actual H.EVC or VPix—because nobody cares about it) so there’s nothing much to complain about.

P.S. I’ve working RealVideo 1 decoder in NihAV already so maybe the first opensource decoder for RV60 will be in NihAV too.

Predicting NGV^W RMHD

Sunday, July 19th, 2015

Here’s an occasional prediction how RMHD should look like knowing nothing about it beside press release claims.

  • Base standard — for RV 1 and 2 it was H.263, for RV 3 and 4 it was H.264. Obviously, RMHD and RMUHD should be based on H.265;
  • MV precision — RV 2 had ½-pel MV, RV 3 had ?-pel MV, RV 4 had ¼-pel MV. Obviously, RMHD will have ?-pel MV. Or still ¼-pel because H.265 has not improved MV precision compared to H.264;
  • Bitstream coding — usually that one is kept from previous generation of ripoff codec. Thus, H.265 keeps decoding VLCs further compressed with CABAC, AVS2 (aka HEVS) keeps doing the same with its own coder, VPx using range coder from VP<x-1> and static probabilities Huffman codes. RMHD is supposed to have context-dependent Huffman tables with some bitcoder following it. I.e. determine bitcode from element neighbours and then code each bit of it using some context-adaptive coder (and add some context-dependency somewhere too).
  • Special features — probably none, it will just follow the standard in codec design and the main difference will be in coefficients coding. There’s a chance they’ll build in some scalability feature though.

Let’s live and see what RMHD will really be. It would be nice if none of these predictions is correct.

A Bit More about UnclearVideo

Saturday, April 12th, 2014

So, in the course of cleaning-up program I’ve looked at ClearVideo again.

Intraframes decoding has been REd long time ago—it’s merely DCT-coded blocks. They happen quite often and it’s possible to watch it somehow.

Interframes coding is what makes this codec unique. And if you expected real fractal coding then you’re wrong (and that’s expected—processing power needed to restore even 320×240 frame from seed by applying iterations might be too much even for today). Instead you have quadtree with square blocks. Do you know what information it stores there? Subtree flags (i.e. if you need to divide it further), motion vector and brightness control. So the block is copied and its brightness might be adjusted too, that’s all. There’s no residue coding. And tree information can be coded in four different ways (I’ve seen four out of about six different tree decoding functions being employed), properties are coded with variable-length codes that I fear may be autogenerated.

So what prevents me from finishing it? A very horrible design. It’s written in Very Objective C++ dialect. Every piece of code is wrapped into its own class with possible overloading, so instead of calling a function you retrieve class pointer from some subclass of current context and invoke function from vtable there. Which will not be the real function but rather a thunk that will jump to the real function (even better, some functions calling themselves do it through that thunk too). And yet they use global variables! This is impossible to analyse with static analysis (like I mostly do), it’s nightmare for debugger as well (but a hack to MPlayer loader that displayed calls and indirect jumps was very useful here) and I suspect it won’t get much better even if I get original source code (which I’ll never get of course).

A small example of design: tree decoding uses several GetBitContexts stored in more generic TreeDecodingContext, which is a subclass to FrameType2 context, which is a subclass of DecoderContext. And I’m pretty sure I forgot some levels of indirections in-between.

Time to give up? Time to give up.

Some details on ClearVideo

Saturday, August 11th, 2012

ClearVideo was the most widespread codec in the old days. One cannot name any other codec that was present in AVI, MOV and RealMedia simultaneously. Oh, and it presumably uses fractals.

Recently I’ve discovered one rather funny thing: ClearVideo intraframes are very simply coded. You have standard 16×16 macroblocks, some DCT, one static codeset for DC, one static codeset for AC. It’s simpler than baseline JPEG (well, maybe except for the fact there’s a set of flags signalling if ACs are present in the block at all).

My main problem with it was that I could not find out where are those codes are stored or generated. Well, it turns out that it’s stored in binary form in wrapper DLLs in resources section (so if you use some resource explorer on it you can find the codes in resource TABLE/DCT or modify RAW/*BRAND to remove that annoying watermark but who cares?).

Maybe one day I’ll deal with interframes and RealMedia demuxer support…

Some notes on codecs I’d like to RE but don’t have time to do so

Saturday, August 27th, 2011

There are some codecs that I’d like to RE (mostly for completeness sake) but I don’t have time for that.

Intel Audio Codec

This one seems to be a lot like its predecessor IMC (Intel Music Codec), it even codes coefficients the same way but with different codebooks. I’ve tried to hack IMC decoder to make it use proper tables but it still decodes garbage.

Along with Indeo4 decoder it would make our Intel codec family complete, but unfortunately we have decoder for neither.


The codec that was present in AVI, QT and RealMedia. My investigations showed that it was not-so-fractal codec, it still codes blocks with DCT and even does that in simpler fashion than H.263. Though a patent assigned to Iterated System describes what can be the base of this codec: DCT-based codec that uses fractal search to determine the best code for the current block or something like this. Maybe that’s the reason why there are no Huffman tables in decoder while it obviously uses some.


That’s rather special lossless codec that stands aside from other RealMedia codecs: the file format was altered for that codec (so far I’ve seen only standalone RALF files, not, say, RV40+RALF).

It looks like the codec is rather simple and employs context-dependent codes instead of generic ones. I remember finding about eight hundred static Huffman tables in decoder for that purpose.

Also codec developers were very grateful to their source of inspiration, that’s why codec IS is “LSD:”.

WMA Lossless

Nothing much to say about it. As I remember, it uses infinite impulse response filters for compression and least squares method for finding (and maybe updating) filter coefficients. Should be not so hard to RE but nobody bothered so far.

M$ Screen 1 and 2 (aka WM Screen)

I’ve dabbled in REing MSS1, not MSS2 (which was later relabeled as WM Screen) but they should be related.

MSS1 was rather simple screen codec based on classic arithmetic coding (with adaptive models too IIRC) and binary partitioning. So decoding process was simple: get point for subframe division (horizontal and vertical) and modes for decoding those partitions (fill, skip, subdivide).

VoxWare MetaSound

This codec is obviously based on TwinVQ, it even has similar huge tables for different samplerates and bitrates and I found almost the same header reading code.

In conclusion I want to say that if somebody wants to RE those codecs he’ll be more than welcome (especially for Apple ProRes but I don’t care about it much).

That is not dead which can eternal buffer, And with strange aeons RALF may be implemented.

Sunday, August 7th, 2011

And now for something completely different, a post about our favourite eldritch abomination (the word “buffering” should tell it all).

I’ve decided to spend some time on RealVideo codecs.

  • RM demuxer — the one in Libav is based on some scarce guesswork and does many things incorrectly (reporting incorrect FPS, reporting PTS while container stores only DTS, ignoring interleavers, selecting video codec by version reported in its extradata, etc.). I hope improve it a bit or kill trying.
  • RV1/2 — our decoder is based mostly on guesswork, I’ve looked at it and tried to correct header parsing at least. For these codecs internal version number actually matters for bitstream reading. Another funny fact: Real didn’t develop RV2 fully by themselves, they based it on Intel ILVC sources (even some header from Helix distribution says it and I wondered why some functions in RV2 have Hive* names like in Indeo 4 and 5). Also decoder sports some artefacts related to motion vectors outside picture boundaries, maybe it will be fixed too.
  • RV3 — there was a problem with chroma drift, it’s finally fixed.
  • RV4 — there’s well-known problem caused by lack of weighted MC. I’ve finally implemented it, after some cleaning and related work on RV3/4 decoder it should appear in Libav soon.

P.S. So far there are two codecs not supported by Libav, RALF (yet another pointless lossless audio format in its own special container too, or in slightly twisted RM at least) and ClearVideo (yes, it was possible to have non-RV there, anybody remembers that fractal codec?). While RALF is unlikely to be implemented ever (I think I wrote about it once), ClearVideo might be supported eventually but don’t hold your breath on it.

RV40 is in

Tuesday, December 2nd, 2008

As you may know from other place, FFmpeg got RV40 decoder. There’s still some hope that FFmpeg will get RV30 decoder as well (it needs loop filter and squishing some bugs).

Some notes about performance:

  • PPC G4 1.42GHz — on par
  • Celeron 600MHz (inside ASUS Eee) — significantly slower (2 minutes of the same source decoded in 64 and 82 seconds by binary and native decoders respectively)

When I switch motion compensation functions from C implementations to optimised H.264 counterparts (they are slightly different so the picture quality gets worse) native decoder becomes faster than binary one by several percents on x86 and even faster on PPC. Conclusion: if you want fast decoding then submit SIMD versions of motion compensation functions.