Archive for the ‘Useless Rants’ Category

Dingo Pictures Works: Early Years

Friday, December 8th, 2017

Well, I intended to end my review but I was reminded that there are even more Dingo Pictures works that I’ve missed. So let’s look at those.

Rust: Annoyance-Driven Design

Sunday, December 3rd, 2017

I’ve finally made NihAV decode RealVideo 2 content, including B-frames (there are still 4 video codecs to support, I don’t have any samples for RMHD, and there are all the audio codecs too, so it’s a long way to go), and so I have some more words to say about Rust and my experience with it.

To me it looks like most decisions on decomposition in Rust are the consequence of how annoying it is to do things any other way. Too large structures mean you have to either pass too many arguments into new() or fill them with some defaults (and I’m pretty sure that #[derive(Default)] won’t save you with more complex types) and initialise them to sane values later. As a result it’s easier to split everything into smaller structures which are (at least subjectively) much easier to handle, especially if you reference them as Option<YourStruct>. Modules and imports, on the other hoof, are more annoying to manage since you have to take care of proper dependencies, visibility and imports; as a result I find it easier to import all stuff from all modules and just comment out the currently unused imports (because I still can’t bring myself to make it all a single mega-module). And now for the even higher level: crates. Yes, I’m going to beat that undead horse again.
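To illustrate the splitting described above, here is a minimal sketch. All type and field names are made up for the example and are not taken from NihAV. The idea is that the big structure holds its parts as Option<...>, so Default can be derived and the parts get filled in once the data is actually known.

```rust
// Hypothetical names throughout; none of these types come from NihAV.

/// A small piece of state that only exists once a header has been parsed.
#[derive(Debug, Default, Clone, PartialEq)]
struct PictureInfo {
    width: usize,
    height: usize,
}

/// Another small piece that is meaningless before the first frame.
#[derive(Debug, Default, Clone, PartialEq)]
struct MotionState {
    last_mv: (i16, i16),
}

/// The "large" structure stays trivial to construct: every optional part
/// defaults to None, so Default can be derived instead of hand-written.
#[derive(Debug, Default)]
struct Decoder {
    frame_no: u32,
    pic: Option<PictureInfo>,
    motion: Option<MotionState>,
}

impl Decoder {
    /// No long argument list: start from the derived defaults.
    fn new() -> Self {
        Self::default()
    }

    /// Fill in the picture info later, when it is actually known.
    fn set_dimensions(&mut self, width: usize, height: usize) {
        self.pic = Some(PictureInfo { width, height });
    }
}
```

With this layout Decoder::new() takes no arguments, and a call like set_dimensions(352, 288) supplies the missing part later.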

First of all, I’m aware of incremental building being enabled in nocturnal Rust but I’m not going to use nightly for rather obvious reasons (mostly because I’m not here to experiment with all the potential bells and whistles of the language but rather to see what it can offer right out of the box and how it suits my needs). So, the compilation times are horrible: when I change a single non-public function it rebuilds the whole crate (which is the expected behaviour, I know) and it takes 15 seconds to do that. Obviously that’s laughable for people doing “serious” projects but it’s a basic fact that humans expect a response (any response) within about five seconds of the action or they get impatient. As a result, instead of one crate with optional features (in my case decoders and demuxers) I’d rather have several smaller crates, and that creates new issues too. There’s the obvious npm.js kind of issue of making packages for every small thing so your program ends up with more package dependencies than a modern Linux distribution. But there’s also the issue with package splitting: I’d like to split my code into packages that each encompass a certain family of features (e.g. nihav-core for common stuff, nihav-avi for the AVI demuxer, nihav-indeo for all Indeo codecs, audio and video, and nihav-realmedia for the RealMedia demuxer and related codecs). Then some of them may depend on some common package (like an H.263 common core for the Intel I.263 and RealVideo 1 and 2 decoders) but probably with different features requested (one of them does not need B-frame support, another one does not need PB-frame support). Since I don’t know quantum cargodynamics I don’t know how it will all be resolved. So it will end either in dead code or in code duplication (in an additional crate too, I suppose).

My theory is that the people behind Rust are biased by their development environment. In other words, you don’t care much about compilation times when you have to build browsers (or compilers) on a daily basis. My main development machine, meanwhile, is a laptop I bought in 2010 with 8GB of RAM (which I believed to be future-proof). So the Rust language designers might either have beefy machines that compile fast or be conditioned to long development cycles. I know that back in the day “start compiling the Linux kernel and go make some coffee to pass the 45 minutes of compilation time” was quite common but I guess it’s Jevons’ paradox all over again: the more computing power there is, the more of it is wasted on compilation. Take modern C++ or single-header libraries: you actually have to compile a very large corpus of code as a single file. Back in the day my laptop with 64MB of RAM was spending most of the time compiling libavcodec/dsputil.c (a monstrous file full of templates that old FFmpeg developers might remember even today) so I had to install more RAM in order to make the compilation time reasonable. The solution was to split the file instead of upgrading the machines of every developer but nowadays that would be seen as a ridiculous solution.

And now documentation. I find it rather poor (but that’s common with programming languages). If I know more or less what feature I want I can find it in the standard documentation (if I don’t, I end up complaining about non-overlapping multiple &mut [range] borrows not working instead of using slice.split_at_mut(), and I did) but it does not really tell me what I should be looking for in the first place. I call it Excel complexity. In Excel there’s probably a function that does anything you want but it’s much easier to reimplement it yourself than to look up in the documentation what it’s called and what its less obvious parameters are. And even if you combine both The Rust Programming Language Second Edition and Rust By Example you still won’t get it right. Now that Rust aspires to be a JavaScript replacement it should take an example from it too: provide an extensive overview of how to do things in it instead of showcasing features. IMO in TRPLv2 there are two chapters, 11 and 12, that are close to that ideal: they talk about testing and how to make a console program. Those are good practical tasks that one would like to achieve with Rust (not so many people care about features per se, they want something done with a language: build a multi-threaded application, parse a Web server reply, make an efficient number cruncher etc etc). I could rant more about how it should be organised but nobody reads documentation, including me.
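For the record, the split_at_mut() case mentioned above looks roughly like this. The function name and the task (copying one half of a buffer over the other) are invented for illustration; only split_at_mut() itself is the standard library call.

```rust
/// Copy the first half of a buffer over the second half.
/// Taking `&mut buf[..mid]` and `&mut buf[mid..]` at the same time is
/// rejected by the borrow checker even though the ranges do not overlap;
/// split_at_mut() hands out the two disjoint mutable slices in one call.
fn mirror_first_half(buf: &mut [u8]) {
    let mid = buf.len() / 2;
    let (head, tail) = buf.split_at_mut(mid);
    // zip() stops at the shorter slice, so an odd trailing byte is left alone
    for (dst, src) in tail.iter_mut().zip(head.iter()) {
        *dst = *src;
    }
}
```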

There’s still this annoyance with tuples as such too: why can’t I declare let foo, bar; if baz { foo = 4; bar = 2; } else { foo = bar = 0; } and have to use two separate lets instead? Why can’t I have let (foo, bar); if baz { (foo, bar) = (4, 2); } else { (foo, bar) = (0,0); } either? As a result, while named tuples are there, I end up using only unnamed tuples.
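One form that does work is to turn the whole if into the initialiser, since if is an expression in Rust. A small sketch, with the function name pick() being just a wrapper made up for the example:

```rust
/// Bind both names with a single `let` by making the `if` produce a tuple.
fn pick(baz: bool) -> (i32, i32) {
    let (foo, bar) = if baz { (4, 2) } else { (0, 0) };
    (foo, bar)
}
```

It is not quite the declare-first, assign-later shape from the complaint above, but it avoids the two separate lets.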

So while Rust offers some nice things, the way it shapes development is not very nice. And this also explains why C was so popular and still is: it does not enforce any particular behaviour on you (except in recent editions, where the standard and compilers suddenly started to care about arithmetic and bit operations being non-portable; you might make your own CPU that does not use two’s complement arithmetic after all), there is no enforced coding style, you can compile code in any order you like and you can interface with almost anything without special tools or wrappers. And the freedom it offers along with its effectiveness is what is often lacking in more modern languages (the saddest thing is that it’s traded not for memory safety but rather for sacks of syntactic sugar).

Anyway, I’ll keep experimenting and we’ll see how things will turn out. In either case I should start thinking about splitting NihAV into several crates, registering codecs and such. Too much work, too many opportunities to procrastinate!

koda

Thursday, November 9th, 2017

Dedicated to all young werehedgehogs.

xkcd.com/1882/ — one URL worth thousand words

So, let’s talk about colour in multimedia. To summarise it so you can skip the rest: proper colour representation hardly matters at all.

What is colour from a physical point of view? It’s a property of light in the visible range (i.e. between infrared and ultraviolet, though some people are born without proper UV filters). Even better, you can clearly define it via spectroscopy because it’s a mix of certain wavelengths with certain energies. Another approach is to have reference colours printed on some surface (aka Pantone sets), and that is the very thing you use to make sure you get what you want when taking a photo (especially on another celestial body) or to ensure consistency of production at a printing house.

The problem is that either approach is too bulky for use outside certain specific areas: for example, it’s too expensive to store the whole spectrum for each pixel even in palette form (and image or video compression would be extremely inconvenient too). The good thing is that our eye has its own variant of psychoacoustic masking and you can use several basic colours to achieve the mix. And from this most colour models (or spaces) were born, where the ranges of real (aka present in the spectrum) and perceivable colours (like purple or white, which are a mix of several colours) are represented as a composition of some primary colours like red+green+blue or cyan-magenta-yellow. And of course there is the famous CIE 1931 model whose basis is theoretical components corresponding to the sensitivity of cone cells in the human eye.

And then came the other problem: most colourspaces (XYZ, HSV and such) are as good as a π-based number system, which is incredibly convenient for certain kinds of calculations but makes it next to impossible to convert results from and into decimal with good precision. Even RGB with its widely available primary colours has a problem: for instance, the colour of the sky outside Britain (in case you didn’t know the etymology of the word ‘sky’, it comes from a Scandinavian word for cloud) can be represented only with, IIRC, the red component being negative.

So how to deal with it? Mostly by not caring, as humans usually do. In places where higher colour reproduction fidelity is required (mostly printing) they simply use more primary colours. But overall humans don’t care much if the colours are slightly wrong. On one side, the human brain has an internal auto-correction scheme for colour tint and white auto-balance (you might remember that optical illusion with seemingly red strawberries covered by a green or blue tint with no pixels being actually red); on the other side, each pair of human eyes is unique and perceives colours and shades differently. So if most people won’t agree about the actual shade and would recognise the picture anyway, why bother at all (again, some specific areas excluded)?

So all those TV-related standards that define the fine details of colour models are good only for mastering stuff (i.e. to keep consistency in the final product, because you might not care about a colour being slightly wrong but you’ll spot a slight shade mismatch for sure). And speaking of TV-related standards, the so-called TV range (i.e. having 8-bit component values fit into 16-235 for luma and 16-240 for chroma instead of 0-255 as you’d expect) is an archaism that should’ve been buried a long time ago along with analogue TV broadcasting. But it still exists in digital-world standards along with interlacing and KROPPING!, not fully purged yet.
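To show what the TV range costs in practice, here is a small sketch of expanding an 8-bit TV-range luma sample to the full 0-255 range. It assumes the usual 8-bit luma limits of 16 and 235 (219 steps); the function name is made up.

```rust
/// Expand an 8-bit TV-range luma sample (nominally 16..=235) to 0..=255.
fn luma_tv_to_full(y: u8) -> u8 {
    // Scale the 219 nominal steps up to 255 using integer maths.
    let v = (i32::from(y) - 16) * 255;
    // Adding 109 (half of 219) rounds to nearest for in-range input;
    // the clamp handles samples below 16 or above 235.
    ((v + 109) / 219).clamp(0, 255) as u8
}
```

Every picture stored this way needs such a conversion (or its inverse) somewhere along the path to the screen.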

And speaking of shade differences, some of you might remember the era of VGA, where each component actually had only 64 possible values and yet it was enough to create very convincing moving pictures. You may argue that the underlying issue was masked by the palette mode, but I should point out that for a rather long time after that people had to live with laptops and displays that had cheap LCDs with an actual 18-bit colour depth (i.e. the same 6 bits per component as on VGA) as well (and let’s not talk about black colour representation there). So people didn’t care much about that, and all this high-bitdepth stuff seems to be more of a marketing creation than an actual technical necessity (again, I understand that it’s needed somewhere like medical imaging, but common people don’t care about quality).

In conclusion I want to say that the main reasons for introducing higher bitdepth wherever possible are: because we can (I understand and respect that), because it keeps many engineers and marketers employed (I understand that but don’t agree much) and because it helps to fix some other problems introduced elsewhere (like TV-range helped to deal with filtering artefacts; that I understand as well and try to respect but fail). Now be a good hedgehog and set a proper colour profile in the IMF metadata.

Dingo Pictures Works: Classics pt. 2 and Final Thoughts

Sunday, November 5th, 2017

Sadly, all good things come to an end and this series review is no different. Let’s look at the last three entries before I give my opinion on all of them.

Some Impressions on Czech Railways

Sunday, November 5th, 2017

I’ve finally travelled enough Czech railways (mostly in the South-western part of the country) to form some impressions about them.

First, they have somewhat funny train terminology there: R means “rychlik” or express train while R-egional trains are marked as Os or “osobni”, but in reality they all move at a speed of around 50 km/h.

Second, the rolling stock.
Typical locomotive
The trains are usually two to four carriages pulled by a locomotive, most often one like in the picture above. It brings nostalgia to me because it looks like a Škoda locomotive from the 1960s that was one of the best locomotives in the USSR, and it was also nicknamed Cheburashka because it both looked a bit like the titular hero of that anime (formerly a Soviet cartoon) and featured there as well. You can also see rail buses, double-decker regional trains (the same as InterCity trains in Ukraine) and some other types but they are very rare.

Speaking of locomotives, I had a brief visit to Austria and saw their main locomotive ÖBB 1044. And what do you know, it looks like a replica of Rc-locomotive from Sweden. And then you read that Austrian Railways actually bought ten Rc2 from Sweden and designated them as ÖBB 1043 locomotives. Since Rc2 was the best locomotive in Austria it’s no wonder they’ve designed the next model after it.

Third, tickets. Outside Prague you can usually buy tickets only at the ticket office at the station or maybe from the conductor (but I’ve never tried that); ticket offices accept Euros and sometimes you can pay with a card too (mind the signs there). Another funny thing is that tickets usually list the stations you should pass on your route and they’re a lot like German tickets for regional trains: you just buy a ticket for a route, and which train you choose is up to you. Even better, in most cases you can buy tickets outside the country, like when I bought a Praha-Tábor ticket in Dresden.

Fourth, infrastructure in general. And that’s where it sucks.
A station somewhere between Jihlava and České Budějovice

Station houses look like they were built either in the XIXth century under Austrian rule or in the 1970s under Soviet rule (those essentially look like featureless boxes) and many of them are not very well maintained, unfortunately. Another thing is platforms. You can see a typical Czech platform in the first picture. They are often just about twice as high as the rails and not particularly wide either; you can meet high platforms only at big stations and in very random places (IIRC I’ve seen one at Velesín Město and there’s just a single track there).

And now for the tracks themselves. Rail connectivity is very good there so you can get from one place to another without going through Prague; the downside is that it usually takes two hours to get from one node to another since, as I’ve mentioned above, all trains travel at around 50 km/h. I’ve travelled on the routes Dresden-Praha, Linz-Prag, Praha-Schwandorf, Tábor-Jihlava and Jihlava-Plzeň and it looks like only the routes from Prague to important places like České Budějovice, Plzeň and such are double-track (and the one to Dresden for some reason); the rest are single-track and often as curvy as if they were drawn with the tail of a stubborn mule, as we say here. Also the Tábor-Horní Cerekev track is quite bumpy and reminds me more of a typical Ukrainian road than of a railway.

In general, Czech railways leave the impression of railways in a rural area and thus they have their inimitable charm. Throw in the nostalgic feeling from the locomotives and you can say I liked it despite all the downsides.

H.263 And MPEG-4 ASP—The Root of Some Evil

Saturday, November 4th, 2017

As you might know (but still not care), I’m working on adding full RealMedia support for NihAV starting with video. So I’ve made it to decoding RealVideo 2 and I have some not so nice words to say about H.263 and MPEG-4 ASP.

First, the creeping featuritis in the standards: MPEG-4 part 2 from 2001 has annexes A-O (the version from 2004 has only annexes A-M for some reason) while ITU H.263 (version from 2005) has annexes A-X plus two appendices. For comparison, ITU H.264 from 2017 has annexes A-J, same for MPEG-4 part 10 😉 Mind you, some annexes are for informative stuff (e.g. how an encoder should work or a list of patent claims) but others add new coding features. So, for MPEG-4 part 2 (2001) we have 15 annexes, 7 of them informative, and only a couple of the normative annexes add new features. For ITU H.263, out of 24 annexes about 15 introduce new coding modes and other enhancements (different treatment of motion vectors, a loop filter, an alternative macroblock coding mode, the PB-frame type and a lot more). The features are actually grouped into baseline(-ish) H.263 and H.263+.

Second, neither of them is really suitable for video coding. I know, it might sound strange, but each of these standards is an unholy mix of various codecs. H.263 mixes several codecs from different generations together (the initial H.263 did not have B-frames, later they added PB-frames and finally B-frames too, there are at least two different ways to code macroblocks etc etc), while MPEG-4 part 2 is a standard for coding 3D video that also specifies a method to code the video texture on those 3D shapes (there are no actual frames there, just VOPs, Video Object Planes). And yet, because the compression methods there provided an improvement over H.262 (aka MPEG-2 Video), they were used in various forms with various hacks in many multimedia formats. There we have a very wide gamut, from RealVideo 1 and Sorenson Spark (aka FLV1) with just I- and P-frames, to Intel I.263 that had PB-frames, to RealVideo 2 with many features of H.263+ (including B-frames), to M$ MPEG-4 decoders, to WMV2.

And here we have the problem: both formats grew from the joint effort known as H.262 or MPEG-2 Video so obviously it was a good idea to abuse the same decoder structure to handle all possible variations of H.263 and video texture coding from MPEG-4 part 2 and then add all the decoder-specific hacks. And as a result you get a mess that’s hard to comprehend because it usually depends on many various context variables set in a specific manner for a specific codec. Hence the post title.

To demonstrate this I’ll show how the same feature is handled in different H.263/MP4p2-based codecs.

Sequence and frame headers

Obviously it differs for every codec. Some rely on container-provided width and height, some have dimensions coded per GOP or per individual frame, some codecs store only the meaningful bits in the frame header, others store all feature bits and error out on unsupported configurations.

Frame types

  • Intel I.263: I, P, PB
  • RealVideo 1: I, P
  • RealVideo 2: I, P, B
  • Sorenson Spark: I, P, droppable P
  • WMV1: I, P
  • WMV2: I, P, X8 (alternative I-frame coding)
  • H.263 in general: I, P, PB, B, EI, EP (last two are enhancement layer picture types for scalable coding)
  • MPEG-4: I, P, B and S (last one is sprite-coded picture)

Block coding

  • Intel I.263: H.263 codes
  • RealVideo 1: H.263 codes with special codes for I-frame DCs
  • RealVideo 2: H.263+ AIC mode (advanced I-frame coding) plus H.263 P- or B-frames
  • Sorenson Spark: H.263 codes with a custom handling of AC escapes
  • WMV1/2: M$MPEG-4 codes

Motion vectors reconstruction

  • H.263: simply add predictor vector
  • H.263 UMV: depending on predictor value and difference range wrap it or not (see ITU H.263 D.2 for proper explanation)
  • MPEG-4: if (mv < low) mv += range; if (mv > high) mv -= range;
  • M$MPEG-4: if (mv <= -64) mv += 64; if (mv >= 64) mv -= 64;

(And there are different ways to predict motion vectors too!)
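The reconstruction rules from the list can be sketched as three small functions. The function names are invented; low, high and range are simply passed in as parameters here, and the UMV variant is left out since its rule is only referenced (ITU H.263 D.2), not spelled out, in the list.

```rust
/// Plain H.263: the decoded difference is simply added to the predictor.
fn mv_h263(pred: i16, diff: i16) -> i16 {
    pred + diff
}

/// MPEG-4 style: add, then wrap back by one `range` step if the sum
/// falls outside `low..=high` (all three supplied by the caller).
fn mv_mpeg4(pred: i16, diff: i16, low: i16, high: i16, range: i16) -> i16 {
    let mut mv = pred + diff;
    if mv < low {
        mv += range;
    }
    if mv > high {
        mv -= range;
    }
    mv
}

/// M$MPEG-4 style with the fixed constants from the list.
fn mv_msmpeg4(pred: i16, diff: i16) -> i16 {
    let mut mv = pred + diff;
    if mv <= -64 {
        mv += 64;
    }
    if mv >= 64 {
        mv -= 64;
    }
    mv
}
```

Note that the last two look alike but differ both in the comparison (strict versus non-strict) and in the wrap step, which is exactly the kind of detail that ends up as a per-codec flag in a shared decoder.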

There are even more quirks than I’ve listed here but it should give you an idea of what a fine mess these formats are and why the code that supports them all tends to turn into a huge mess. I tried to solve it in NihAV by having a template decoder for H.263 that calls a bitstream parser for the actual codec-specific parsing and by keeping some quirks inside specific structures (like MV that adds vectors differently depending on the current mode), but I still have more features to take into account (like slices, AC prediction and B-frames) so I’ll have to redesign it before I can support RealVideo 2 properly.
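A rough sketch of that kind of split, not the actual NihAV code: every name below (BlockParser, H263Decoder, ToyParser and the one-byte "header") is hypothetical and only shows the shape of a shared decoder that is generic over a codec-specific parser.

```rust
/// Picture types a codec-specific parser may report.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PicType {
    I,
    P,
}

/// Hypothetical header information the shared decoder needs.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PicHeader {
    ptype: PicType,
    width: usize,
    height: usize,
}

/// Codec-specific part: only the bitstream syntax lives here.
trait BlockParser {
    fn parse_header(&mut self, data: &[u8]) -> Option<PicHeader>;
}

/// Shared part: generic over the parser so per-codec quirks stay out of it.
struct H263Decoder<P: BlockParser> {
    parser: P,
    frames_seen: usize,
}

impl<P: BlockParser> H263Decoder<P> {
    fn new(parser: P) -> Self {
        Self { parser, frames_seen: 0 }
    }

    fn decode(&mut self, data: &[u8]) -> Option<PicHeader> {
        let hdr = self.parser.parse_header(data)?;
        self.frames_seen += 1;
        // Macroblock decoding and reconstruction would follow here.
        Some(hdr)
    }
}

/// Toy parser with a made-up one-byte header: bit 0 selects I or P.
struct ToyParser {
    width: usize,
    height: usize,
}

impl BlockParser for ToyParser {
    fn parse_header(&mut self, data: &[u8]) -> Option<PicHeader> {
        let first = *data.first()?;
        let ptype = if first & 1 == 0 { PicType::I } else { PicType::P };
        Some(PicHeader { ptype, width: self.width, height: self.height })
    }
}
```

A real codec would implement BlockParser with its own header and block syntax, while H263Decoder keeps the common prediction and reconstruction logic.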

But then maybe I’ll add Vivo Media format support for old times’ sake (it’s the funniest one, with codebooks stored as strings of ones and zeroes like “0000 0011 110” inside the binary, with “End” signalling the codebook end).
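Turning such strings into usable codes is simple enough. Below is a sketch that assumes the strings have already been extracted from the binary into a list; that layout and the function names are assumptions, and only the “ones and zeroes with spaces, terminated by End” format comes from the description above.

```rust
/// Parse a code written as text, e.g. "0000 0011 110", into (bits, length).
/// Spaces are ignored; any character other than '0', '1' or ' ' yields None.
fn parse_code(s: &str) -> Option<(u32, u8)> {
    let mut code = 0u32;
    let mut len = 0u8;
    for ch in s.chars() {
        match ch {
            '0' | '1' => {
                if len == 32 {
                    return None; // would not fit into a u32
                }
                code = (code << 1) | (ch == '1') as u32;
                len += 1;
            }
            ' ' => {}
            _ => return None,
        }
    }
    if len == 0 {
        None
    } else {
        Some((code, len))
    }
}

/// Collect codes up to (not including) the "End" marker.
/// Entries that fail to parse are skipped.
fn parse_codebook(lines: &[&str]) -> Vec<(u32, u8)> {
    lines
        .iter()
        .take_while(|l| **l != "End")
        .filter_map(|l| parse_code(l))
        .collect()
}
```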

Dingo Pictures Works: For The Youngest Ones

Tuesday, October 24th, 2017

So, it’s time for spotlighting even more Dingo Pictures cartoons! And today we’re talking about the cartoons oriented at the youngest audience (even though all Dingo Pictures cartoons are rated as FSK 0, the German version of the Hays code saying “appropriate for ages 0 or older”, some of them are for a more grown-up audience and little children won’t be able to appreciate them).

Dingo Pictures Works: Adventures

Tuesday, October 3rd, 2017

This category can be alternatively titled wild animal adventures and it contains probably the most famous Dingo Pictures cartoons.

Dingo Pictures Works: Fairy Tales

Friday, September 22nd, 2017

There are only three stories in this category and six or seven in the remaining ones so I don’t have to split this post into two parts.

Dingo Pictures Works: Thrillers

Monday, September 11th, 2017

Today I’m covering the great works from Dingo Pictures. I intend to split the review into roughly the same categories as they are put on the official website and today we start with the first section. Its name is “Krimis” in German which I think is more appropriately translated into “thriller” than “mystery story” or “detective story”.