Archive for November, 2015

NihAV — NAScale

Saturday, November 21st, 2015

First, some history. If you don’t like reading about it just skip to the ruler below.

So, NAScale was born after yet another attempt to design a good colourspace conversion and scaling library. A long time ago FFmpeg didn’t have any proper solution for that and used the rather rudimentary imgconvert; later it was replaced with libswscale lifted from MPlayer. Unfortunately it was designed for rather specific player tasks (mostly converting YUV to RGB for displaying with the X11 DGA driver) rather than as a generic utility library, and some of its original design still shows to this day. Actually, libswscale should have a warm place in every true FFmpeg developer’s heart next to MpegEncContext. Still, while being far from ideal, it has SIMD optimisations and it works, so it’s still being used.

And yet some people unsatisfied with it decided to write a replacement from scratch. Originally AVScale (a Libav™ project) was supposed to be designed during a coding sprint in Summer 2014. And of course nothing substantial came out of it.

Then I proposed my vision of how it should work and even wrote proof-of-concept (i.e. throwaway) code to demonstrate it back in Autumn 2014. I made an update to it in March 2015 to show how to work with high bitdepth formats but nobody has touched it since then (and hardly before that either). Thus I’m reusing that failed effort as NAScale for NihAV.


And now about the NAScale design.

The main guiding rule was: “see libswscale? Don’t do it like this.”

First, I really hate long enums and dislike API/ABI breaks. So NAScale should have a stable interface and no enumeration of known pixel formats. What should it have instead? A pixel format description that should be good enough to make NAScale convert even formats it had no idea about (like BARG5156 to YUV412).

So what should such a description have? Colourspace information (RGB/YUV/XYZ/whatever, gamma, transfer function etc.), the size of the whole packed pixel where applicable (i.e. not for planar formats) and individual component information. That component information includes information on how to find and/or extract such a component (i.e. on which plane it is located, what shift and mask are needed to extract it from a packed bitfield, how many bytes to skip to find the first and next component etc.) and subsampling information. The names chosen for those descriptors were Formaton and Chromaton (for rather obvious reasons).
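To make the idea more concrete, here is a minimal sketch in C of what such descriptors might look like. Every field name, the `extract_component()` and `component_width()` helpers and the two example formats are assumptions made for illustration, not the actual NAScale code. It only shows that a shift, a mask and subsampling factors are enough to pull a component out of a packed pixel and to size a plane without any enum of known formats.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptors: names and fields are illustrative only. */
enum ColourModel { CM_RGB, CM_YUV, CM_XYZ };

typedef struct Chromaton {
    uint8_t  plane;   /* plane on which the component is stored */
    uint8_t  h_sub;   /* horizontal subsampling, as log2 */
    uint8_t  v_sub;   /* vertical subsampling, as log2 */
    uint8_t  shift;   /* right shift for packed bitfields */
    uint32_t mask;    /* mask applied after the shift */
    uint8_t  offset;  /* bytes to skip to reach the first sample */
    uint8_t  step;    /* bytes between consecutive samples */
} Chromaton;

typedef struct Formaton {
    enum ColourModel model;
    uint8_t   packed_size; /* bytes per whole packed pixel, 0 if planar */
    uint8_t   ncomp;       /* number of components in use */
    Chromaton comp[4];
} Formaton;

/* Packed little-endian RGB565: R in bits 11-15, G in 5-10, B in 0-4. */
const Formaton FMT_RGB565 = {
    .model = CM_RGB, .packed_size = 2, .ncomp = 3,
    .comp = {
        { .plane = 0, .shift = 11, .mask = 0x1F, .step = 2 },
        { .plane = 0, .shift = 5,  .mask = 0x3F, .step = 2 },
        { .plane = 0, .shift = 0,  .mask = 0x1F, .step = 2 },
    },
};

/* Planar 8-bit YUV with chroma halved in both directions. */
const Formaton FMT_YUV420P = {
    .model = CM_YUV, .packed_size = 0, .ncomp = 3,
    .comp = {
        { .plane = 0, .step = 1 },
        { .plane = 1, .h_sub = 1, .v_sub = 1, .step = 1 },
        { .plane = 2, .h_sub = 1, .v_sub = 1, .step = 1 },
    },
};

/* Extract one component from a single packed little-endian pixel. */
uint32_t extract_component(const Formaton *fmt, int idx, const uint8_t *pixel)
{
    const Chromaton *c = &fmt->comp[idx];
    uint32_t word = 0;

    assert(fmt->packed_size > 0 && fmt->packed_size <= 4);
    assert(idx >= 0 && idx < fmt->ncomp);
    for (int i = 0; i < fmt->packed_size; i++)
        word |= (uint32_t)pixel[i] << (8 * i);
    return (word >> c->shift) & c->mask;
}

/* Width of a component plane for a given picture width, rounded up. */
int component_width(const Chromaton *c, int width)
{
    return (width + (1 << c->h_sub) - 1) >> c->h_sub;
}
```

With descriptors like these, a converter that has never heard of a particular bit layout can still read it as long as somebody fills in the shifts and masks.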

Second, the way NAScale processes data. As I remember it, libswscale converted input into YUV with fixed precision before scaling and then back into the destination format, unless it was a common-case format conversion without scaling (and then some bypass hacks were employed, like a plane repacking function and such).

NAScale prefers to build a filter chain in stages. Each stage has either one function processing all components or a function processing only one component applied to each component, which allows you to execute e.g. scaling in parallel. It also allows building a proper conversion+scaling process without horrible hacks. Each stage might have its own temporary buffers that will be used for output (and fed to the next stage).
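The staged design described above can be sketched roughly as follows. This is an illustration under assumptions, not the real NAScale code: the `Frame` and `Stage` types, `run_chain()`, and the two toy stage functions (a nearest-neighbour scaler working on one component and a plane swap working on all of them) are invented names, and all planes are 8-bit and full-size to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_COMP 3

/* Hypothetical frame: three full-size 8-bit planes, stride equal to width. */
typedef struct Frame {
    int      w, h;
    uint8_t *plane[NUM_COMP];
} Frame;

/* Hypothetical stage: exactly one of do_all / do_one is set. */
typedef struct Stage {
    void (*do_all)(const Frame *in, Frame *out);           /* all components */
    void (*do_one)(const Frame *in, Frame *out, int comp); /* one component */
    Frame         out;   /* this stage's own output buffers */
    struct Stage *next;
} Stage;

/* Per-component stage: nearest-neighbour scaling to the output size. */
void scale_nearest(const Frame *in, Frame *out, int comp)
{
    for (int y = 0; y < out->h; y++) {
        int sy = y * in->h / out->h;
        for (int x = 0; x < out->w; x++) {
            int sx = x * in->w / out->w;
            out->plane[comp][y * out->w + x] = in->plane[comp][sy * in->w + sx];
        }
    }
}

/* All-component stage: swap the first and third planes (e.g. RGB to BGR). */
void swap_outer_planes(const Frame *in, Frame *out)
{
    size_t n = (size_t)in->w * (size_t)in->h;

    memcpy(out->plane[0], in->plane[2], n);
    memcpy(out->plane[1], in->plane[1], n);
    memcpy(out->plane[2], in->plane[0], n);
}

/* Run every stage in order; each stage's output feeds the next one. */
const Frame *run_chain(Stage *first, const Frame *in)
{
    const Frame *cur = in;

    for (Stage *s = first; s != NULL; s = s->next) {
        assert((s->do_all != NULL) != (s->do_one != NULL));
        if (s->do_all) {
            s->do_all(cur, &s->out);
        } else {
            /* The calls are independent, so they could run in parallel. */
            for (int c = 0; c < NUM_COMP; c++)
                s->do_one(cur, &s->out, c);
        }
        cur = &s->out;
    }
    return cur;
}
```

The caller sets up each stage's `out` frame with the right dimensions and buffers, links the stages through `next` and calls `run_chain()` with the input frame; the returned pointer is the last stage's output.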

You need to convert XYZ to YUV? First you unpack XYZ into planar RGB (stage 1), then scale it (stage 2) and then convert it to YUV (stage 3). NAScale constructs the chain by searching for kernels that can do the work (e.g. convert input into some intermediate format or pack planes into the output format), provides each kernel with a Formaton and dimensions, and that kernel sets the stage processing functions. For example, the first stage of RGB to YUV is unpacking RGB data, thus NAScale searches for the kernel called rgbunp, which sets the stage processing function and allocates RGB plane buffers; then the kernel called rgb2yuv will convert and pack RGB data from the planes into YUV.
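A rough sketch of that kernel lookup, again under stated assumptions: the kernel names `rgbunp` and `rgb2yuv` come from the text above, while the `scale` kernel name, the `Kernel` and `Stage` types, `find_kernel()` and `build_rgb_to_yuv_chain()` are hypothetical. The stage here is a simplified one that records only dimensions and a processing function, and the processing functions themselves are empty stubs, since the point is the search-and-configure step rather than the pixel work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified hypothetical stage: the kernel fills in everything. */
typedef struct Stage {
    const char *kernel_name;
    void (*process)(const struct Stage *st);
    int in_w, in_h, out_w, out_h;
} Stage;

/* A kernel is a name plus an init function that configures a stage. */
typedef struct Kernel {
    const char *name;
    int (*init)(Stage *st, int in_w, int in_h, int dst_w, int dst_h);
} Kernel;

/* Stub processing functions; real ones would touch pixel data. */
static void rgbunp_process(const Stage *st)  { (void)st; }
static void scale_process(const Stage *st)   { (void)st; }
static void rgb2yuv_process(const Stage *st) { (void)st; }

static void set_dims(Stage *st, int iw, int ih, int ow, int oh)
{
    st->in_w = iw; st->in_h = ih; st->out_w = ow; st->out_h = oh;
}

/* Unpacking and colour conversion keep the size; scaling changes it. */
static int rgbunp_init(Stage *st, int iw, int ih, int dw, int dh)
{
    (void)dw; (void)dh;
    set_dims(st, iw, ih, iw, ih);
    st->process = rgbunp_process;
    return 0;
}

static int scale_init(Stage *st, int iw, int ih, int dw, int dh)
{
    set_dims(st, iw, ih, dw, dh);
    st->process = scale_process;
    return 0;
}

static int rgb2yuv_init(Stage *st, int iw, int ih, int dw, int dh)
{
    (void)dw; (void)dh;
    set_dims(st, iw, ih, iw, ih);
    st->process = rgb2yuv_process;
    return 0;
}

static const Kernel kernels[] = {
    { "rgbunp",  rgbunp_init  },
    { "scale",   scale_init   },
    { "rgb2yuv", rgb2yuv_init },
};

/* Look a kernel up by name; returns NULL when nothing matches. */
const Kernel *find_kernel(const char *name)
{
    for (size_t i = 0; i < sizeof(kernels) / sizeof(kernels[0]); i++)
        if (strcmp(kernels[i].name, name) == 0)
            return &kernels[i];
    return NULL;
}

/* Build unpack -> (scale if sizes differ) -> convert; returns stage count. */
int build_rgb_to_yuv_chain(Stage *chain, int max_stages,
                           int src_w, int src_h, int dst_w, int dst_h)
{
    const char *wanted[3];
    int n = 0, w = src_w, h = src_h;

    wanted[n++] = "rgbunp";
    if (src_w != dst_w || src_h != dst_h)
        wanted[n++] = "scale";
    wanted[n++] = "rgb2yuv";
    if (n > max_stages)
        return -1;

    for (int i = 0; i < n; i++) {
        const Kernel *k = find_kernel(wanted[i]);
        if (k == NULL || k->init(&chain[i], w, h, dst_w, dst_h) < 0)
            return -1;
        chain[i].kernel_name = k->name;
        w = chain[i].out_w;   /* the next stage consumes this output */
        h = chain[i].out_h;
    }
    return n;
}
```

Looking kernels up by name means a new conversion can be added by registering another table entry instead of touching the chain builder.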

And last, implementation. I’ve written some sample code that is able to take RGB input (high bitdepth is supported too), scale it if needed and pack it back into RGB or convert it into YUV depending on what was requested. So the test program converted a raw r210 frame into r10k, or an input PPM into PPM or PGMYUV with scaling. I think it’s enough to demonstrate how the concept works. Sadly nobody has picked up this work (and you know whom I blame for that; oh, and koda, who wanted to be mentioned too).

Sprint Report

Friday, November 20th, 2015

Last week I attended the fifth Libav coding sprint in Pelhřimov. Here’s a report from the host of the current sprint (she was amazing, many thanks). It was fun indeed (some fun provided by Lufthansa cancelling my flight because of a strike).

It’s the third sprint I attended and there’s a pattern in them. Even if they are called coding sprints there’s not so much coding going on there; it’s more discussing various stuff and food than actual coding (though Anton usually spends a lot of time fixing some decoder). As I’m no longer a Libav developer, I simply hang around and try to provide enough trolling and proper drinks, sometimes even sharing a bit of knowledge if anybody wants to listen.

Another recurring theme is AVScale, a saner replacement for libswscale. It gets discussed during the sprints (since Summer 2014) but nothing substantial gets done. The only things we got so far are my proof-of-concept implementation (I’ll present it later in my post about NAScale, it’s the same thing) and something hacked by 1-2 Italians under the influence of alcohol in a couple of hours (with great comments like //luzero doesn't remember this) that has Libav integration but no functionality (my code is the complete opposite of that: standalone and doing some useful things). Well, just wait for new posts about this, they’ll appear eventually.

In general I attend sprints just to see people and new places and have some fun. Among the places where sprints took place (Pelhřimov, Stockholm and Turin) I’d say Stockholm was the best so far as it is the easiest to reach and has the best food to my taste (plus thanks to SouthPole AB hosting more people than at any other sprint). I should attend a sprint again some time.

P.S. I still blame lu_zero for not writing anything about it yet.

Blogposts missing

Tuesday, November 17th, 2015

I like a good read with analysis of random stuff. The problem is that the planned series of posts on parallels between social organisations and software development is not written yet.

The first post would be dedicated to the idea supplanted by bureaucracy. This happens when somebody has a good idea like “let’s create X foundation to promote good thing Y” and with time it turns into a monopoly that dictates the rules to everybody else in that area. Special attention would be paid to the ways such organisations maintain a “democratic” façade while making all decisions in private, for example by adding many almost useless contributors to the voters and convincing them to vote the way the organisers want. An addendum about how such non-profit organisations get their incredible amount of money would be good too. And if you’re still in need of an example, think about any sports organisation like the IOC or FIFA.

The second post would be about cultists: not people adhering to some religion but rather people blindly following something and not even thinking that other people might have other views and needs. For example, iUsers or those who program in Oberon dialects (that seems to include Go despite it not being a direct descendant). The main problem with them is that those cultists usually force their fetish as the only proper solution and the main answer to “but I want to do that” is “it’s not needed and I’ve never done it myself”.

The third post would be about a common tactic: switching from peculiar details when advertising stuff to blanket statements when defending from criticism (or vice versa). Among other things it’s very common for Oberon and systemd advocates in a fashion like “Feature X? We have it right in Y. Y sucks? Well, it’s just a single dialect/module, the whole system is wonderful, stop attacking it.” It would be painful but drawing parallels between this and terrorists/peaceful Islam would be proper too.

And of course I blame lu_zero for not writing this.

Freudian Slip?

Saturday, November 7th, 2015

Even if I’m no longer a Libav or FFmpeg developer I still look at both projects’ development mailing lists (on FFmpeg’s one mostly in faint hope that Peter Ross submits anything awesome again).

So one day I see this message. The “former” leader calls a large share of commits they get “enemy merges” (and it cannot be humorous, it’s not mean enough to be Austrian humor). Well, nice attitude you have there. And you know what? This might be a semi-official position there.

I was present at the FFmpeg-Libav discussion at VDD (since I was not noticed by Jean-Baptiste I remained while other outsiders were kicked out; here’s the recording of the public part). There I even managed to ask a single question: what has really changed since Michael’s resignation? FFmpeg people failed to answer that. Besides not making merges anymore, Michael still announces and makes releases and does whatever changes he likes without reviews; he’s still a de facto leader in my opinion. I’m yet to see FFmpeg having defined rules stating something different (even Libav has something). Another fun fact from that meeting was some FFmpeg people openly stating they hate Libav merely because it exists.

Again, I don’t have to care about FFmpeg community but working at Libav in such conditions is no fun either (and it’s no fun for many other reasons many of which sadly have something to do with FFmpeg).

So I’d rather follow the advice from the great philosopher Eric Theodore Cartman: “screw you guys, I’m going home”. Developing NihAV at a slow pace (i.e. when I feel like doing it) in a neutral one-developer atmosphere is much better.