Archive for the ‘Useless Rants’ Category

An Impression on Rhaetian Railways

Wednesday, April 25th, 2018

Since I don’t have enough time to visit a proper country I went to a bad substitute for Sweden that’s much more accessible—Switzerland (it should be obvious why I cannot call it a poor or cheap substitute). Since it happened on Easter (April 1-2), the scenery resembled Sweden: snow, mountains, deer and log sheds. And of course I could ride trains in new locations!

Rhaetian Railways is a narrow-gauge railway system in the canton of Graubünden (whose symbol uncannily resembles the one from Gävle), a fractal part of Switzerland occupying its south-east corner (fractal in the sense that the canton’s shape looks almost exactly like the shape of the whole of Switzerland). Trains run through picturesque scenery past places with dreadful names like Fhtagn (or Ftan in Swiss-Cthüelsch) or SaaS (they really have a station with such a name!), climbing into the mountains (in the 1-2 km above sea level range), and I spent a couple of days travelling around.

But while the scenery is okay, the railways are an unholy mix of the Berlin S-Bahn and Czech and German railways:

  • There are German ICEs running there all the way to Chur (so I could travel home without any transfers);
  • The tracks are curvy and trains are as slow as in Czechia (i.e. no matter where you go it will take you at least an hour or two to get there);
  • Prices are like in Czechia too except they use Swiss francs instead of Czech koruna—but the numbers are about the same (so it seems I can ride an ICE here cheaper, faster and over a much longer distance than with RhB);
  • Another thing like in Czechia: buying a ticket with a card involves a 1.5€ surcharge. No such thing in Sweden;
  • Narrow-gauge trains are a weird mix themselves: they can put the locomotive at the front of the train, at the end (maybe), in the middle (very common) or just couple a typical EMU with a number of conventional rail carriages (I’m not sure I’ve seen that anywhere else);
  • Weird station names: I can understand when you name a station after two places at once like Reichenau-Tamins (that’s common in Germany too) or even if you name it after the same place twice like Disentis/Mustér (it’s Confoederatio Helvetica, natives can’t agree on a single name for anything) but Tavanasa-Breil/Brigels is definitely too much (it’s a station between those two mentioned earlier BTW);
  • They’re afraid of snow: after even an insignificant amount of snow they stop running on some routes: during my stay the trains on the Pontresina-Tirano and Disentis/Mustér-Andermatt routes were cancelled for an indefinite number of days. In Germany trains are more punctual—if they’re late, they’re late by dozens of minutes, not days. And if something bad happens and trains can’t run on some route for days, you can see information everywhere, including how to get around it and such. No such thing in Switzerland;
  • And another thing that’s taken from the German S-Bahn is the timetables and tickets. This requires a separate rant.

Overall, FFS or RhB is not very friendly to a traveller: you should have a definite idea of where you are going, when (at which time and such) and how (i.e. where to transfer) if you want to buy a ticket. For example, I was at the station Chur-West and wanted to go to Scuol-Tarasp. The ticket vending machine offered me a choice of three options: via Samedan, via Chur-Samedan (i.e. go first to Chur main station and from there to Samedan and then to Scuol) or via Vereina. The last option is actually a tunnel and not a station name!

In Germany when you travel with long-distance trains you actually choose one of the provided connection possibilities (e.g. InterCity from A to B, RegioBahn from B to C and ICE from C to D, or InterCity from A to E and then from E to D), or you can use the provided route-planning functionality even if you don’t buy a ticket. SBB ticket machines simply allow you to buy a ticket from A to B, maybe with a cryptic route midpoint, and that’s all! That is exactly how German ticket vending machines for regional transport work. And there’s yet another point of annoyance: Swiss rail timetables fail to include the arrival time for the final destination, so if you care about it (like I sometimes do) you have to find it out by other means. It’s plain stupid.

Oh, and the snow-related problem: when you buy a ticket you can’t be sure the train will actually run, because the only cryptic warning I got was the ticket machine saying my ticket would be valid in the April 1st-April 9th period (and much later in the train too). In Germany the machine actually shows warnings when there’s some problem with a train or it’s cancelled entirely (since you can use the ticket later). I actually had a situation when one segment of my travel was served by a train that broke down and I had to take another train later instead. So it feels like you should rather use a smartphone and buy the ticket online, where you can see the actual route and warnings (and probably use bahn.de instead of cff.ch too where possible).

Overall, travelling with Rhaetian Railways was a pleasant and exciting experience in some aspects (i.e. when I was inside the train) and a confusing and frustrating one in others (i.e. when I actually tried to buy a ticket). They also boast that parts of the system form the third railway on the UNESCO World Heritage list (the second after India, I guess) and how picturesque some parts are (they are almost as interesting as the Sauschwänzlebahn indeed), but as I’ve seen it all there’s no reason to return there (and a reliable source says there are better places in Switzerland to wait out heat waves too).

#chemicalexperiments Dough and Pancakes

Saturday, April 14th, 2018

Since I don’t have any urges to work on NihAV at this moment (big surprise, I know) I’ll talk about cooking instead.

Since I don’t know how to cook and never had any kind of culinary education, I divide dough into three main categories: puffy (the one that expands while baking), non-puffy (the one that keeps about the same volume) and runny (usually used for pancakes but we’ll talk about them later).

Non-puffy dough is the easiest to make: just mix flour and water (use either boiling water or very cold water for good results). It’s ideal for simple filled dishes like varenyky or Karelian rice pasties (I’ve made both and shall probably make them again). The next level is so-called shortcrust pastry, which is used for pies, quiches and such. Here you usually mix flour with some fat (called shortening).

And there we have a variety of what to use for shortening:

  • classical recipes use butter—I’ve cooked stuff using it and it works fine except that it takes too much butter for my liking;
  • French people obviously prefer margarine (since it’s their invention)—I see no reason to try it;
  • Brits prefer some weird animal fat called suet—I feel queasy just thinking about it so it gets a definitive no from me;
  • USians use chemically processed vegetable shortening; I’ve tried it once: I ordered a can of Crisco shortening, followed the recipe for pie crust and the result was bad. I’d stick to the other two options listed here. Fun fact: while searching for it on Amazon most offers were from sex shops where it’s apparently offered as a lubricant. I can see why—that stuff is sticky and slick and not fit for baking. Also, since one of the sellers offered it along with various sweets (and what passes for them in the USA), I ordered some of those and tried them—I was not impressed by that stuff either;
  • and finally there’s the German variant that I find very good, called Öl-Quark Teig (dough made from oil and quark—in this case lean homogeneous cottage cheese). You mix flour with several spoonfuls of oil (you can choose a different oil for different flavouring of course, which is a nice feature) and Magerquark (the lean quark mentioned above) and that’s all! You can add an egg and/or baking powder too but it’s fine as is.

Puffy dough is the trickiest one—the puffiness comes from bubbles in the dough and it takes extra effort to get them there. The easiest way is simply to add baking powder (or baking soda reacting with vinegar) to the dough; the other conventional ways are to prepare yeast (cultured or uncultured, either way it takes time and some effort) or to whip bubbles out of eggs, which requires some skill that I lack (so I stick to baking powder). There are two recipes that work for me: mixing flour, eggs, butter and sugar (aka the usual cake mix) or öl-quark dough with sugar, an egg and baking powder.

Runny dough (is it called batter?) can be made by mixing flour with a lot of liquid and some eggs, and is then used to make pancakes. Since pancakes are the only thing I’ve made with it so far, let’s talk about them.

There are several kinds of pancakes that I know of and have tried so far:

  • French-style thin pancakes (aka crêpes) that are better eaten fresh with something rolled in;
  • Dutch laughably small pancakes (that have a name almost like an Australian word for gay—probably the words have the same origin);
  • common pancakes—thicker than a crêpe, plain, good to eat with something on top or with some filling rolled in;
  • slightly thicker pancakes with something embedded in them (like bits of ham).

And of course Sweden has a nice variety of pancakes in a wide range: ordinary pancakes, pancakes with bits of ham, pancakes with potatoes (I tried those and approve) and pancakes for people like me who can’t do anything right with their hands (including flipping pancakes)—ugnspannkaka, i.e. pancake baked in the oven. Obviously that one is much thicker than the rest but it’s easy to make (even I baked some) and it can embed various stuff too, which makes it interesting (bits of ham, fish or even fruit). Also this way you’re more likely to end up with rectangular pancakes, which I find to be a nicer and more versatile shape than the usual round ones.

I forgot to mention one local thing—in Baden-Württemberg they have plain pancakes shredded into thin strips, dried, and then added to soups when serving. It’s called Flädle and you can buy it in every local supermarket (even Aldi). It’s a nice addition to a soup IMO.

Okay, now back to doing anything but coding.

Rust in multimedia: unwieldy features

Sunday, March 18th, 2018

Today I wanted to talk about two features that are quite important for multimedia decoding but are quite inconvenient in their current state.

First, macros. I know that macros in Rust are both very powerful and quite flexible but they are hard to use for data definition and I’ve ranted about it before. The problem is that quite often you have tables with some internal structure that would benefit from macro substitutions: if you have a codebook constructed from entries following patterns like a, b, -a, -b and a, b, a, -b, -a, b, -a, -b it would be easier and less error-prone to represent them as e.g. FLIP2#(a, b) and FLIP4#(a, b) inside the data definition. The problem is that macro_rules! does not allow you to do that easily since it’s supposed to expand into valid statements (i.e. code or full data definitions). Of course you can work around it by making a set of macros to define the whole array and some bits inside it, but that’s exactly what makes it unwieldy. And that’s why I believe there should be another macro substitution mechanism, maybe named macro#, that would work just on data; it would be much easier to use in that particular case.
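
Here is a rough sketch of the workaround (macro and table names are made up): instead of expanding FLIP2# in place you end up writing a macro that generates the whole table at once, and it handles exactly one pattern; add FLIP4 and it already needs recursion or a second macro.

    macro_rules! codebook_table {
        ($name:ident: $( flip2($a:expr, $b:expr) ),* ) => {
            // each flip2(a, b) entry expands into the a, b, -a, -b pattern
            const $name: &[i16] = &[ $( $a, $b, -$a, -$b ),* ];
        };
    }

    codebook_table!(TABLE0: flip2(1, 2), flip2(3, 5));
    // expands to: const TABLE0: &[i16] = &[1, 2, -1, -2, 3, 5, -3, -5];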

The second issue is assembly integration. Despite Rust being fast and such, it’s still better to write small critical functions in assembly. And obviously it would be better if Cargo supported including assembler files in a crate. You can point out there’s stdsimd for using the power of SIMD without much hassle. I can point out that compiler-generated code is still far from perfect even with intrinsics and assembly is still better; supporting querying SIMD capabilities via a standard package is good though. And you can point out that there’s a special crate for supporting various files with various compilers/assemblers already. I’d say that it’s a bit too generic but at least it can serve as a base for what I need. Again, there’s a more or less standard way to deal with assembly files, so making a common standard is not hard.
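
For illustration, a build.rs sketch using the cc crate (one of the crates that can handle this today); the file and symbol names are invented, so treat it as an assumption rather than a recipe:

    // build.rs — a minimal sketch, assuming `cc` is listed under [build-dependencies];
    // the assembly file and the exported symbol are made up for illustration.
    fn main() {
        cc::Build::new()
            .file("src/x86/idct.S")   // hand-written assembly, assembled by the system toolchain
            .compile("nihav_asm");    // produces libnihav_asm.a and links it into the crate
    }

    // and somewhere in the crate itself:
    // extern "C" {
    //     fn idct_put_sse2(dst: *mut u8, stride: usize, block: *mut i16);
    // }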

And in the unlikely case somebody reads this and asks why I don’t submit an RFC—from what I’ve heard it involves proposing code as well, and I don’t want to study the compiler nor waste days compiling it.

Dingo Pictures Works: Early Years

Friday, December 8th, 2017

Well, I intended to end my review but I was reminded that there are even more Dingo Pictures works that I’ve missed. So let’s look at those.
(more…)

Rust: Annoyance-Driven Design

Sunday, December 3rd, 2017

I’ve finally made NihAV decode RealVideo 2 content, including B-frames (there are still 4 video codecs to support (and I don’t have any samples for RMHD) plus all the audio codecs, so it’s a long way to go), and so I have some more words to say about Rust and my experience with it.

To me it looks like most decisions on decomposition in Rust are a consequence of the annoyance of doing it any other way. Too-large structures mean you have to either pass too many arguments into new() or fill the structure with some defaults (and I’m pretty sure that #[derive(Default)] won’t save you with more complex types) and initialise it to sane values later. As a result it’s easier to split everything into smaller structures which are (at least subjectively) much easier to handle, especially if you reference them as Option<YourStruct>. Modules and imports, on the other hoof, are more annoying to manage since you have to take care of proper dependencies, visibility and imports—as a result I find it easier to import all stuff from all modules and just comment out currently unused imports (because I still can’t bring myself to make it all a single mega-module). And now for the even higher level: crates. Yes, I’m going to beat that undead horse again.
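
A toy illustration of that decomposition (all names invented): the optional parts live in their own small structs behind Option, so new() stays short and nothing needs fake default values.

    // Invented names, just to illustrate the Option<SmallStruct> pattern.
    struct MVData { /* motion vector buffers and such */ }
    struct Slices { /* per-slice bookkeeping */ }

    struct Decoder {
        width:  usize,
        height: usize,
        mv:     Option<MVData>,   // created only once the headers say it's needed
        slices: Option<Slices>,
    }

    impl Decoder {
        fn new(width: usize, height: usize) -> Self {
            Self { width, height, mv: None, slices: None }
        }
    }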

First of all, I’m aware of incremental building enabled in nocturnal Rust but I’m not going to use nightly for rather obvious reasons (mostly because I’m not here to experiment with all the potential bells and whistles of the language but rather with what it can offer right out of the box and how it suits my needs). So, the compilation times are horrible: when I change a single non-public function it rebuilds the whole crate (which is the supposed behaviour, I know) and it takes 15 seconds to do that. Obviously it’s laughable for people doing “serious” projects but it’s a basic fact that humans expect a response (any response) within about five seconds of an action or they get impatient. As a result, instead of one crate with optional features (in my case decoders and demuxers) I’d rather have several smaller crates, and that creates new issues too. There’s the obvious npm.js kind of issue of making packages for every small thing so your program ends up with more package dependencies than a modern Linux distribution. But there’s also the issue of package splitting: I’d like to split my code into packages that encompass a certain family of features—e.g. nihav-core for common stuff, nihav-avi for the AVI demuxer, nihav-indeo for all Indeo codecs (audio and video) and nihav-realmedia for the RealMedia demuxer and related codecs—then some of them may depend on some common package (like an H.263 common core for the Intel I.263 and RealVideo 1 and 2 decoders) but probably with different features requested (one of them does not need B-frame support, another one does not need PB-frame support). Since I don’t know quantum cargodynamics I don’t know how it will all be resolved. So it will either end in dead code or code duplication (in an additional crate too, I suppose).
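
To make the feature problem concrete, here is roughly the layout I have in mind, in Cargo.toml form (hypothetical crate and feature names):

    # nihav-indeo/Cargo.toml (hypothetical)
    [dependencies]
    nihav-core = { path = "../nihav-core" }
    nihav-h263 = { path = "../nihav-h263", features = ["pb-frames"] }

    # nihav-realmedia/Cargo.toml (hypothetical)
    [dependencies]
    nihav-core = { path = "../nihav-core" }
    nihav-h263 = { path = "../nihav-h263", features = ["b-frames"] }

As far as I understand, Cargo unifies the features requested for a shared dependency, so if both crates end up in one program the common H.263 crate gets built with both PB- and B-frame support whether each user needs it or not, hence the dead code.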

My theory is that the people behind Rust are biased by their development environment. In other words, you don’t care much about compilation times when you have to build browsers (or compilers) on a daily basis. Meanwhile my main development machine is a laptop I bought in 2010 with 8GB of RAM (which I believed to be future-proof). So the Rust language designers might either have beefy machines that make compilation fast or be conditioned to long development cycles. I know that back in the day “start compiling the Linux kernel and go make some coffee to pass the 45 minutes of compilation time” was quite common but I guess it’s Jevons’ paradox all over again: the more computing power there is, the more of it is wasted on compilation. Like modern C++ or single-header libraries: you actually have to compile a very large corpus of code as a single file. Back in the day my laptop with 64MB of RAM was spending most of the time compiling libavcodec/dsputil.c (a monstrous file full of templates that old FFmpeg developers might remember even today) so I had to install more RAM in order to make the compilation time reasonable. The solution was to split the file instead of upgrading every developer’s machine, but nowadays that would be seen as ridiculous.

And now documentation. I find it rather poor (but that’s common with programming languages). If I know more or less what feature I want I can find it in the standard documentation (if I don’t, I end up complaining about non-overlapping multiple &mut [range] borrows not working instead of using slice.split_at_mut()—and I did) but it does not really tell me what I should be looking for in the first place. I call it Excel complexity. In Excel there’s probably a function that does anything you want but it’s much easier to reimplement it yourself than to look up in the documentation what it’s called and what its less obvious parameters are. And even if you combine both The Rust Programming Language Second Edition and Rust By Example you still won’t get it right. Now that Rust aspires to be a JavaScript replacement it should take an example from it too: provide an extensive overview of how to do things in it instead of just showcasing features. IMO in TRPLv2 there are two chapters—11 and 12—that are close to that ideal: they talk about testing and how to make a console program. These are good practical tasks that one would like to achieve with Rust (in other words, not so many people care about features per se, they want something done with a language: build a multi-threaded application, parse a Web server reply, make an efficient number cruncher etc. etc.). I can rant more about how it should be organised but nobody reads documentation, including me.
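
For the record, here is the idiom in question: a minimal sketch (the function and buffer layout are invented) of getting two non-overlapping mutable views of the same buffer.

    // Copy the previous line of a frame over the current one (y > 0 assumed):
    // frame[..y * stride] and frame[y * stride..] are borrowed at the same time,
    // which two plain &mut range borrows of `frame` would not allow.
    fn copy_prev_line(frame: &mut [u8], stride: usize, y: usize) {
        let (prev, cur) = frame.split_at_mut(y * stride);
        cur[..stride].copy_from_slice(&prev[(y - 1) * stride..][..stride]);
    }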

There’s still the annoyance with tuples as such too: why can’t I declare let foo, bar; if baz { foo = 4; bar = 2; } else { foo = bar = 0; } and instead have to use two separate lets? Why can’t I have let (foo, bar); if baz { (foo, bar) = (4, 2); } else { (foo, bar) = (0, 0); } either? As a result, while named tuples are there, I end up using only unnamed tuples.
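
What the compiler does accept is binding the tuple to the whole if-expression (a trivial self-contained example):

    fn init(baz: bool) -> (i32, i32) {
        // Accepted: the if-expression produces the tuple to destructure.
        let (foo, bar) = if baz { (4, 2) } else { (0, 0) };
        (foo, bar)
    }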

So while Rust offers some nice things, the way it shapes development is not very nice. And this also explains why C was so popular and still is: it does not enforce any particular behaviour on you (except in recent editions, when the standard and compilers suddenly started to care about arithmetic and bit operations being non-portable—you might make your own CPU that does not use two’s complement arithmetic after all), there’s no enforced coding style, you can compile code in any order you like and interface almost anything without special tools or wrappers. And the freedom it offered, along with its effectiveness, is what is often lacking in more modern languages (the saddest thing is that it’s traded not for memory safety but rather for sacks of syntactic sugar).

Anyway, I’ll keep experimenting and we’ll see how things turn out. In any case I should start thinking about splitting NihAV into several crates, registering codecs and such. Too much work, too many opportunities to procrastinate!

koda

Thursday, November 9th, 2017

Dedicated to all young werehedgehogs.

xkcd.com/1882/ — one URL is worth a thousand words

So, let’s talk about colour in multimedia. To summarise it so you can skip the rest: proper colour representation hardly matters at all.

What is colour from a physical point of view? It’s a property of light in the visible range (i.e. between infrared and ultraviolet, though some people are born without proper UV filters). Even better, you can define it precisely via spectroscopy because it’s a mix of certain wavelengths with certain energies. Another approach is to have reference colours printed on some surface (aka Pantone sets)—and that is the very thing you use to make sure you get what you want when taking a photo (especially on another celestial body) or when ensuring consistency of production in printing.

The problem is that either approach is too bulky for use outside certain specific areas; for example it’s too expensive to store the whole spectrum for each pixel even in palette form (and image or video compression would be extremely inconvenient too). The good thing is that our eye has its own variant of psychoacoustic masking and you can use several basic colours to achieve the mix. And from this most colour models (or spaces) were born, where the range of real (i.e. present in the spectrum) and perceivable colours (like purple or white, which are a mix of several colours) is represented as a composition of some primary colours like red+green+blue or cyan-magenta-yellow. And of course there is the famous CIE 1931 model whose basis is theoretical components corresponding to the sensitivity of the cone cells in the human eye.

And then came the other problem: most colourspaces (XYZ, HSV and such) are about as good as a π-based computing system—incredibly convenient for certain kinds of calculations but next to impossible to convert from and into decimal with good precision. Even RGB, with its primary colours widely available, has a problem: for instance, the colour of the sky outside Britain (in case you didn’t know the etymology of the word ‘sky’: it comes from a Scandinavian word for cloud) can IIRC be represented only with a negative red component.

So how to deal with it? By mostly not caring, as humans usually do. In places where higher colour reproduction fidelity is required (mostly printing) they simply use more primary colours. But overall humans don’t care much if the colours are slightly wrong. On one hand, the human brain has an internal auto-correction scheme for colour tint and automatic white balance (you might remember that optical illusion with seemingly red strawberries covered by a green or blue tint with no pixels being actually red); on the other hand, each pair of human eyes is unique and perceives colours and shades differently. So if most people won’t agree on the actual shade and would recognise the picture anyway, why bother at all (again, some specific areas excluded)?

So all those TV-related standards that define the fine details of colour models are good only for mastering stuff (i.e. to keep consistency for the final product, because you might not care about a colour being slightly wrong but you’ll spot a slight shade mismatch for sure). And speaking of TV-related standards, the so-called TV range (i.e. having component values fit into the 16-235 range, 16-240 for chroma, instead of 0-255 as you’d expect) is an archaism that should’ve been buried a long time ago along with analogue TV broadcasting. But it still exists in digital-world standards, along with interlacing and KROPPING! that are not fully purged yet.
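
For reference, here is what that squeezing means for 8-bit luma (a sketch with simple rounding; chroma uses 16-240 analogously):

    // Full-range to TV-range ("limited range") conversion for 8-bit luma:
    // 0 maps to 16, 255 maps to 235, everything in between scales by 219/255.
    fn full_to_tv_luma(v: u8) -> u8 {
        (16 + (v as u32 * 219 + 127) / 255) as u8
    }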

And speaking of shade differences, some of you might remember the era of VGA, where each component actually had only 64 possible values and yet it was enough to create very convincing moving pictures. You may argue that the underlying issue was masked by palette mode, but I should point out that for a rather long time after that people had to live with laptops and displays that had cheap LCDs with an actual 18-bit colour depth (i.e. the same 6 bits per component as on VGA) as well (and let’s not talk about black colour representation there). So people didn’t care much about that, and all this high-bitdepth stuff seems to be more of a marketing creation than an actual technical necessity (again, I understand that it’s needed somewhere like medical imaging, but common people don’t care about quality).

In conclusion I want to say that the main reasons for introducing higher bitdepth wherever possible are: because we can (I understand and respect that), because it keeps many engineers and marketers employed (I understand that but don’t agree much) and because it helps fix some other problems introduced elsewhere (like TV range helped to deal with filtering artefacts—that I understand as well and try to respect but fail). Now be a good hedgehog and set a proper colour profile in the IMF metadata.

Dingo Pictures Works: Classics pt. 2 and Final Thoughts

Sunday, November 5th, 2017

Sadly, all good things come to an end and this review series is no different. Let’s look at the last three entries before I give my opinion on all of them.
(more…)

Some Impressions on Czech Railways

Sunday, November 5th, 2017

I’ve finally travelled enough on Czech railways (mostly in the south-western part of the country) to form some impressions about them.

First, they have somewhat funny train terminology there: R means “rychlík”, i.e. express train, while R-egional trains are marked as Os, “osobní”, but in reality they all move at a speed of around 50 km/h.

Second, the rolling stock.
Typical locomotive
The trains are usually two to four carriages pulled by a locomotive, most often like the one in the picture above. It brings nostalgia to me because it looks like the Škoda locomotive from the 1960s that was one of the best locomotives in the USSR; it was also nicknamed Cheburashka because it both looked a bit like the titular hero of that anime (formerly a Soviet cartoon) and featured in it as well. You can also see rail buses, double-decker regional trains (the same as the InterCity trains in Ukraine) and some other types but they are very rare.

Speaking of locomotives, I had a brief visit to Austria and saw their main locomotive, the ÖBB 1044. And what do you know, it looks like a replica of the Rc locomotive from Sweden. And then you read that Austrian Railways actually bought ten Rc2s from Sweden and designated them as ÖBB 1043 locomotives. Since the Rc2 was the best locomotive in Austria it’s no wonder they designed their next model after it.

Third, tickets. Outside Prague you can usually buy tickets just at the ticket office at the station or maybe from the conductor (but I’ve never tried that); ticket offices accept euros and sometimes you can pay with a card too (mind the signs there). Another funny thing is that tickets usually list the stations you should pass on your route and they’re a lot like German tickets for regional trains—you just buy a ticket for a route, and which train you take is up to you. Even better, in most cases you can buy tickets outside the country; for example, I bought a Praha-Tábor ticket in Dresden.

Fourth, infrastructure in general. And that’s where it sucks.
A station somewhere between Jihlava and České Budějovice

Station houses look like they were built either in the 19th century under Austrian rule or in the 1970s under Soviet rule (those look like featureless boxes essentially) and many of them are unfortunately not very well maintained. Another thing is the platforms. You can see a typical Czech platform in the first picture. They are often just about twice as high as the rails and not particularly wide either; you meet high platforms only at big stations and in very random places (IIRC I’ve seen one at Velesín Město and there’s just a single track there).

And now for the tracks themselves. Rail connectivity is very good there, so you can get from one place to another without going through Prague; the downside is that it usually takes two hours to get from one node to another since, as I’ve mentioned above, all trains travel at around 50 km/h. I’ve travelled on the routes Dresden-Praha, Linz-Prag, Praha-Schwandorf, Tábor-Jihlava and Jihlava-Plzeň and it looks like only the routes from Prague to important places like České Budějovice, Plzeň and such are double-track (and to Dresden for some reason); the rest are single-track and often as curvy as if they were drawn with the tail of a stubborn mule, as we say here. Also the Tábor-Horní Cerekev track is quite bumpy and reminds me more of a typical Ukrainian road than of a railway.

In general, Czech railways leave the impression of railways in a rural area and thus have their own inimitable charm. Throw in the nostalgic feeling from the locomotives and you can say I liked it despite all the downsides.

Dingo Pictures Works: Classics pt. 1

Saturday, November 4th, 2017

Let’s look at the last category of Dingo Pictures cartoons. Since it’s rather large, I’ve split it into two parts.
(more…)

H.263 And MPEG-4 ASP—The Root of Some Evil

Saturday, November 4th, 2017

As you might know (but still not care), I’m working on adding full RealMedia support to NihAV, starting with video. I’ve made it to decoding RealVideo 2 and I have some not-so-nice words to say about H.263 and MPEG-4 ASP.

First, the creeping featuritis in the standards: MPEG-4 part 2 from 2001 has annexes A-O (the version from 2004 has only annexes A-M for some reason) while ITU H.263 (the version from 2005) has annexes A-X plus two appendices. For comparison, ITU H.264 from 2017 has annexes A-J, same for MPEG-4 part 10 😉 Mind you, some annexes are informative (e.g. how an encoder should work or a list of patent claims) but others add new coding features. So, for MPEG-4 part 2 (2001) we have 15 annexes, 7 of them informative, and only a couple of the normative annexes add new features. For ITU H.263, out of 24 annexes about 15 introduce new coding modes and other enhancements (different treatment of motion vectors, a loop filter, an alternative macroblock coding mode, the PB-frame type and a lot more). The features are actually grouped into baseline(-ish) H.263 and H.263+.

Second, neither of them is really suitable for video coding. I know it might sound strange, but each of these standards is an unholy mix of various codecs. H.263 mixes several codecs from different generations together (the initial H.263 did not have B-frames, later they added PB-frames and finally B-frames too, there are at least two different ways to code macroblocks etc. etc.), while MPEG-4 part 2 is nominally about coding 3D scenes and merely also specifies a method to code the video texture on those 3D shapes (there are no actual frames there, just VOPs—Video Object Planes). And yet, because the compression methods there provided an improvement over H.262 (aka MPEG-2 Video), they were used in various forms with various hacks in many multimedia formats. There we have a very wide gamut: from RealVideo 1 and Sorenson Spark (aka FLV1) with just I- and P-frames, to Intel I.263 that had PB-frames, to RealVideo 2 with many features of H.263+ (including B-frames), to the M$ MPEG-4 decoders and WMV2.

And here we have the problem: both formats grew from the joint effort known as H.262 or MPEG-2 Video, so obviously it seemed like a good idea to abuse the same decoder structure to handle all possible variations of H.263 and the video texture coding from MPEG-4 part 2 and then add all the decoder-specific hacks. As a result you get a mess that’s hard to comprehend because it usually depends on many context variables set in a specific manner for a specific codec. Hence the post title.

To demonstrate this I’ll show how the same feature is handled in different H.263/MP4p2-based codecs.

Sequence and frame headers

Obviously it differs for every codec. Some rely on container-provided width and height, some have dimensions coded per GOP or per frame; some codecs have only the meaningful bits in the frame header, others store all feature bits and error out on unsupported configurations.

Frame types

  • Intel I.263: I, P, PB
  • RealVideo 1: I, P
  • RealVideo 2: I, P, B
  • Sorenson Spark: I, P, droppable P
  • WMV1: I, P
  • WMV2: I, P, X8 (alternative I-frame coding)
  • H.263 in general: I, P, PB, B, EI, EP (last two are enhancement layer picture types for scalable coding)
  • MPEG-4: I, P, B and S (last one is sprite-coded picture)

Block coding

  • Intel I.263: H.263 codes
  • RealVideo 1: H.263 codes with special codes for I-frame DCs
  • RealVideo 2: H.263+ AIC mode (advanced I-frame coding) plus H.263 P- or B-frames
  • Sorenson Spark: H.263 codes with a custom handling of AC escapes
  • WMV1/2: M$MPEG-4 codes

Motion vectors reconstruction

  • H.263: simply add predictor vector
  • H.263 UMV: depending on predictor value and difference range wrap it or not (see ITU H.263 D.2 for proper explanation)
  • MPEG-4: if (mv < low) mv += range; if (mv > high) mv -= range;
  • M$MPEG-4: if (mv <= -64) mv += 64; if (mv >= 64) mv -= 64;

(And there are different ways to predict motion vectors too!)

There are even more quirks than I listed here, but it should give you an idea of what a fine mess these formats are and why the code that supports them all tends to turn into a huge mess as well. I tried to solve it in NihAV by having a template decoder for H.263 that calls a bitstream parser for the actual codec-specific parsing and keeps some quirks inside specific structures (like an MV type that adds vectors differently depending on the current mode); see the sketch below. I still have more features to take into account (like slices, AC prediction and B-frames), so I’ll have to redesign it before I can support RealVideo 2 properly.
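
Not the actual NihAV code, just a sketch of that idea: an MV type plus a small wrapping-mode enum (names invented, only two of the rules shown) so the shared decoder code can add predictors without knowing which codec it serves.

    #[derive(Clone, Copy)]
    enum MVWrap { None, Msmpeg4 }     // more modes (UMV, MPEG-4 ranges) would go here

    #[derive(Clone, Copy)]
    struct MV { x: i16, y: i16 }

    impl MV {
        fn add_pred(self, pred: MV, wrap: MVWrap) -> MV {
            let fix = |v: i16| match wrap {
                // plain H.263 addition of the predictor
                MVWrap::None    => v,
                // the "if (mv <= -64) ..." rule from the list above
                MVWrap::Msmpeg4 => if v <= -64 { v + 64 }
                                   else if v >= 64 { v - 64 }
                                   else { v },
            };
            MV { x: fix(self.x + pred.x), y: fix(self.y + pred.y) }
        }
    }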

But then maybe I’ll add Vivo Media format support for old times’ sake (it’s the funniest one, with codebooks stored as strings of ones and zeroes like “0000 0011 110” inside the binary, with “End” signalling the codebook end).