Archive for May, 2023

A quick glance at the original Cinepak encoder

Friday, May 26th, 2023

Since I don’t have anything to do with NihAV at the moment (besides two major tasks that always make me think about doing anything else but them), I decided to look at what tricks the original Cinepak encoder has.

Apparently it has essentially three settings: the interval between key frames (with maximum and minimum values), temporal/spatial quality (for deciding which kinds of coding should be used) and neighbour radius (probably for merging close enough values before the actual codebook is calculated).

Skip blocks are decided by the sum of squared differences being smaller than a threshold (calculated from the temporal quality); V1/V4 coding is decided by calculating the sum of 2×2 sub-block variances and comparing it against another threshold (calculated from the spatial quality).
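A minimal sketch of those two decisions for one 4×4 luma block. The function names and threshold values are hypothetical, and it assumes that a high variance sum selects V4 (the post only says the sum is compared against a threshold):

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def deviation_sum(values):
    """Sum of squared deviations from the mean (variance times the count)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def classify_block(cur, prev, skip_threshold, v4_threshold):
    """Pick a coding mode for a 4x4 block given as 16 values in raster order.

    Both thresholds are hypothetical stand-ins for the values the encoder
    derives from the temporal and spatial quality settings.
    """
    if ssd(cur, prev) < skip_threshold:
        return "skip"
    total = 0.0
    # Visit the four 2x2 sub-blocks of the 4x4 block.
    for by in (0, 2):
        for bx in (0, 2):
            sub = [cur[(by + y) * 4 + bx + x] for y in (0, 1) for x in (0, 1)]
            total += deviation_sum(sub)
    # Assumption: detailed sub-blocks need the finer V4 coding.
    return "V4" if total > v4_threshold else "V1"
```

A flat block that changed since the previous frame ends up as V1 here, while a checkerboard ends up as V4.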

Codebook creation is done by grouping all blocks into five bins (by logarithm of the variance) and trying to calculate a smaller codebook for each bin independently (so together they’ll make up the full 256-entry codebook).
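The binning step could look roughly like the sketch below. The bin boundaries and the way the 256 entries are shared between bins are assumptions made for illustration (the post does not give either); each bin would then be fed to whatever vector quantiser is used.

```python
import math

NUM_BINS = 5
CODEBOOK_SIZE = 256


def variance(block):
    mean = sum(block) / len(block)
    return sum((v - mean) ** 2 for v in block) / len(block)


def bin_index(block, max_log=14.0):
    """Map a block to a bin by log2 of its variance.

    max_log is a hypothetical scale: log2 of the largest variance that
    8-bit samples can have is just under 14.
    """
    log_var = math.log2(1.0 + variance(block))
    return min(NUM_BINS - 1, int(log_var * NUM_BINS / max_log))


def split_codebook(blocks):
    """Group blocks into bins and give each bin a share of the codebook.

    Sharing entries proportionally to the block count is an assumption.
    """
    bins = [[] for _ in range(NUM_BINS)]
    for blk in blocks:
        bins[bin_index(blk)].append(blk)
    total = len(blocks)
    if total == 0:
        return bins, [0] * NUM_BINS
    sizes = [len(b) * CODEBOOK_SIZE // total for b in bins]
    # Hand rounding leftovers to the fullest bin so the sizes add up to 256.
    fullest = max(range(NUM_BINS), key=lambda i: len(bins[i]))
    sizes[fullest] += CODEBOOK_SIZE - sum(sizes)
    return bins, sizes
```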

Overall even if I’m not going to copy that approach it was still interesting to look at.

On the origins of ruscism

Wednesday, May 17th, 2023

A couple of weeks ago Ukrainian parliament has finally recognized this term on the official level and listed several telltale signs of it. But in my opinion they can be boiled down to two main actions: disregarding the laws, agreements and traditions (if some suckers believe in those—then it’s just easier to swindle them) and constantly lying, often in an unconvincing way and usually by attributing own deficiencies to somebody else. They’ve been behaving like that throughout their history (which is partly stolen and partly fictitious), the wars just make it more visible. So, why russians behave like that?

Fascism and Nazism grow to power using the support of the second-worst kind of people: people who feel offended or wronged and do not think for themselves. That sort of folks would never blame themselves for their own faults and will gladly follow a leader who has simple answers to questions like who’s guilty and what to do (those answers are usually “that certain group of people” and “unite around me and do what I tell”). In case of ruscism, I believe it’s not merely an ideology that unites the nation but rather the idea that defines this entity (you’ll see why I don’t consider them a nation soon).

One researcher described russians as a dynamic community where everybody can belong to it or fall from it depending on circumstances (or rather benefits it gives: if I need something from you then you’re my brother, if you need something from me then I don’t know you). From this a rather obvious conclusion follows: russians have failed to develop as a nation—even small tribes usually have clear definition of who belongs to them and who are outsiders—and it must be something immaterial uniting them (i.e. an idea). Nations have not merely clearly defined rules of belonging but also clearly defined territory (no matter if it’s the historical settlement are or pieces of land wrestled from somebody else)—russians claim that russia has no borders and that any territory where a russian has been is a part of russia (IIRC just last year some russian dropped a piece of dirt on Dubai beach and claimed that now it’s all russian soil; I’ve encountered many more examples where common russians believed that some place is russian because they’ve been there).

If you look at the real russian history, it starts with the principality of Suzdal, created on the territories inhabited mostly by Finnic and Ugrian people, conquered by the Golden Horde and after its fall proclaiming itself a legitimate successor and capturing other lands (usually not inhabited by Slavic people either) and yet they tried to turn this multi-ethnic mix into “russians”, partially succeeding at that. Last year the russian führer made a speech that he belongs to all nationalities living in russia—what has not been said is that all those nations are russian only as long as they’re going to war, if they try to move to moscow they’ll be greeted with the traditional “go back to your shithole you non-russian hick” (but if they die at war they’ll be called as “true russian heroes” anyway).

It is hard to define the idea that unites them though. It is not a religion since the original pagan beliefs were replaced by the state-controlled Christian church (unlike many countries where the Church was an independent powerful player, in russia it was created by the state—two or three times even—always to serve the state interests). It is not the idea of exclusivity: such ideas are usually created to support the nation while in russia it’s mostly used to sacrifice russians for that very idea. There’s a difference “you’re the best so everything belongs to you, you just need to go and take it” and “you’re the best so keep living in shit until you’re sent to die for defending that belief somewhere abroad”. Sure, a deep spirituality of russian people is usually mentioned in connection to that but no concrete examples are ever given.

You know, there exists such thing as russian nationalists whose ideas can be boiled down to “russians are being offended; and usually it’s Britain that offends them by acting as a puppeteer of russian government since long ago”. Even funnier that until very recently they were prosecuted by the government—I suppose not for the incompatibility of views but rather because they formed those views independently instead of following the official guidelines.

I propose a different explanation: because of the vague dynamic community russians lost incentive to work themselves (a lot like with socialistic system: why bother if everybody around belong to the same community and you can benefit from them working while not benefiting from working hard yourself? See kulak for an example of russian peasants who worked slightly better than the rest and what happened to them; russian national symbol should’ve been a crab bucket instead), in the same time they believed they can take anything because they all belong to the same community. And the refusal offends them. The same story with them believing that whatever they sell or give as a gift still belongs to them (so they can always take it back or tell what you can do or not with it). That may also be the reason behind russians ignoring all kinds of agreements—they’ve been trained only to recognize “might makes right” rule. Yet it does not prevent them from trying to take what belongs to somebody else again and again (like Ukraine). Why don’t they stop attempts? Because they essentially live off selling natural resources (back in the day it was wax, fat and furs, nowadays it’s oil, gas and metals) and they need somebody to actually mine those resources (usually foreigners) and when the old sources get depleted of course they want to capture a new source of income.

Now consider what happens when such creature feels that everything should belong to it and denied those things, feels that others are more developed in many aspects (not just, say, advanced electronics, but having a functioning society too), feels that others have no respect for them (the archetypical question of a drunk russian is “do you respect me?” hints on it)? You’ll get a gamut of emotions, from the desire to present themselves as much better than in reality to drag others down by attributing them all your own bad features. That is how we get claims that Europe will freeze without russian gas (even in summer—they really claimed that), the claims about famous russian culture (it was created by a small strata of elites, often not of russian origin; for the most of russian population their own culture remained alien and forced from above; russians love to present exceptional cases as the general rule), the claims about Western level of quality of life (in moscow—do not look at the rural area that lacks gas, sewer system and roads) and evil godless Westerners want to occupy and destroy them (they’ve looked in the mirror while creating this lie).

And that’s how we get ruscism: psychological complexes of something not deserving to be called a nation, which realizes and resents that. Throw in their sociopathic disregard for honouring agreements (nothing demonstrates it better than the Budapest Memorandum but they’ve been inventing pretexts or outright violated international treaties for centuries) and the lack of thinking (critical or otherwise—there are countless examples that the discussions with common russians fail because those accept ideas selectively and refuse to see connections between different facts) and you get the perfect mix for disaster.

The sad thing is that all russians are infected by it in one form or another. Some may demand nuclear holocaust for all countries that do not ally with them, others merely cheer at the news of russian war criminals killing civilians. Some want russia to conquer the whole world (or at least restore its borders to the times of USSR or russian empire), others simply want russia to end war and not get punished for all its war crimes. Some want to destroy USA, others believe that USA will collapse soon anyway (and they all secretly want to move there regardless). Some hate all other nations, others don’t (but still despise Jews, people from Asia and Caucasus).

I think now it’s more or less clear what the idea unites russians and creates ruscism: russians are those who cast away thinking for a feeling of inferiority. Now, what to do with all that? The realistic way is demonstrated by the Ukrainian Army: over two hundred thousand russians will no longer force their opinions onto others. In theory occupation and re-education might work—it worked for Japan which behaved rather similarly in 20th century—but considering the sheer area of russia and the lack of interest I doubt that even China will attempt it. Meanwhile the best you can do is not to listen to russians at all and check the information you get. Keep thinking, that’s what distinguishes a normal human from russian.

rv4enc: magic numbers

Tuesday, May 16th, 2023

While there’s not much to write about the encoder itself (it should be released and forgotten soon), it’s worth writing down how some magic numbers in the code (those not coming from the specification) were obtained. Spoiler: it’s mostly statistics and gut feeling.

rv4enc: probably done

Saturday, May 13th, 2023

In one of the previous posts I said that this encoder would likely keep me occupied for a long time. Considering how bad that estimate was, I must be a programmer.

Anyway, there were four main issues to be resolved: compatibility with the reference player, B-frame selection and performing motion estimation for interpolated macroblocks in them, and rate control.

I gave up on the compatibility. The reference player is unwieldy and I’d rather not run it at all let alone debug it. Nowadays the majority of players use my decoder anyway and the produced videos seem to play fine with it.

The question of motion vector search for interpolated macroblocks was discussed in the previous post. The solution is there but it makes encoding several times slower. As a side note, by omitting intra 4×4 mode in B-frames I’ve got a significant speed-up (ten to thirty percent depending on quantiser) so I decided to keep it this way by default.

The last two issues were resolved with the same trick: estimating frame complexity. This is done in a relatively simple way: calculate SATD (sum of absolute values of Hadamard-transformed block) of the differences between the current and some previous frame with motion compensation applied. For speed reasons you can downsample those frames and use a simpler motion search (e.g. with full-pixel precision only). And then you can use the calculated value to estimate some frame properties.
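For reference, a small sketch of SATD on 4×4 blocks and of summing it over a frame. It assumes the reference frame has already been motion-compensated and possibly downsampled; the block size and function names are choices made for this example, not taken from the encoder.

```python
def hadamard4(v):
    """Unnormalised 4-point Hadamard transform."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd4x4(cur, ref):
    """SATD of the difference between two 4x4 blocks (lists of four rows)."""
    diff = [[c - r for c, r in zip(crow, rrow)] for crow, rrow in zip(cur, ref)]
    rows = [hadamard4(row) for row in diff]
    cols = [hadamard4(col) for col in zip(*rows)]
    return sum(abs(v) for col in cols for v in col)


def frame_complexity(cur, ref):
    """Sum of 4x4 SATD values over two equally sized frames (lists of rows).

    ref is assumed to be the motion-compensated prediction of cur.
    """
    height, width = len(cur), len(cur[0])
    total = 0
    for y in range(0, height - 3, 4):
        for x in range(0, width - 3, 4):
            cblk = [row[x:x + 4] for row in cur[y:y + 4]]
            rblk = [row[x:x + 4] for row in ref[y:y + 4]]
            total += satd4x4(cblk, rblk)
    return total
```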

For example, if the difference between frames 0 and 1 is about the same as the difference between frames 1 and 2 then frame 1 should probably be coded as B-frame. I’ve implemented it as a simple dynamic frame selector that allows one B-frame between reference frames (it can be extended to allow several B-frames but I didn’t bother) and it improved coding compared to the fixed frame order.
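A sketch of such a selector with at most one B-frame between reference frames. The tolerance value and the function names are hypothetical, and for simplicity it reuses complexities measured between neighbouring frames instead of recalculating them after a frame becomes a B-frame.

```python
def should_be_b_frame(complexity_prev, complexity_next, tolerance=0.25):
    """True when the differences to both neighbours are about the same.

    tolerance (relative to the larger value) is a hypothetical setting.
    """
    larger = max(complexity_prev, complexity_next)
    if larger == 0:
        return True  # all three frames are identical
    return abs(complexity_prev - complexity_next) <= tolerance * larger


def choose_frame_types(complexities):
    """complexities[i] is the complexity between frames i and i + 1.

    Returns a list of 'I', 'P' and 'B' with at most one B-frame between
    two reference frames; the first frame is always 'I'.
    """
    count = len(complexities) + 1
    types = ["I"] + ["P"] * (count - 1)
    i = 1
    while i < count - 1:
        if should_be_b_frame(complexities[i - 1], complexities[i]):
            types[i] = "B"
            i += 2  # the following frame has to stay a reference
        else:
            i += 1
    return types
```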

Additionally there seems to be a correlation between frame complexity and output frame size (also depending on the quantiser of course). So I reworked the rate control system to rely on those factors to select the quantiser for I- and P-frames (adjusting them if the predicted and the actual sizes differ too much). B-frames simply use the P-frame quantiser plus a constant offset. The system seems to work rather well except that it tends to assign too high quantisers for some frames, resulting in a rather crisp I-frame followed by more and more blurry frames.
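One possible shape of such a rate control, as a sketch only. The size model (size proportional to complexity divided by quantiser plus one), the quantiser range, the B-frame offset and the correction limits are all assumptions for illustration; the actual encoder may use a different relation.

```python
class RateControl:
    """Complexity-based quantiser selection (hypothetical model)."""

    def __init__(self, min_q=0, max_q=31, b_offset=2, scale=1.0):
        self.min_q = min_q
        self.max_q = max_q
        self.b_offset = b_offset  # constant B-frame offset, value assumed
        self.scale = scale        # model factor corrected during encoding

    def predict_size(self, complexity, quant):
        # Assumed model: size grows with complexity, shrinks with quantiser.
        return self.scale * complexity / (quant + 1)

    def pick_quant(self, complexity, target_size):
        """Lowest quantiser whose predicted size fits into the target."""
        for quant in range(self.min_q, self.max_q + 1):
            if self.predict_size(complexity, quant) <= target_size:
                return quant
        return self.max_q

    def b_frame_quant(self, p_quant):
        return min(self.max_q, p_quant + self.b_offset)

    def update(self, complexity, quant, actual_size):
        """Correct the model only when the prediction was far off."""
        predicted = self.predict_size(complexity, quant)
        if predicted <= 0:
            return
        ratio = actual_size / predicted
        if ratio > 1.5 or ratio < 0.67:
            self.scale *= ratio
```

In use, `pick_quant()` would be called before coding an I- or P-frame and `update()` afterwards with the real frame size.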

I suppose I’ll play with it for a week or two, hopefully improving it a bit, and then I shall commit it and move to something else.

P.S. the main goal of NihAV is to provide me with a playground for learning and testing new ideas. If it becomes useful beside that, that’s a bonus (for example, I’m mostly using nihav-sndplay to play audio nowadays). So RealVideo 4 encoder has served its purpose by allowing me to play more with various concepts related to B-frames and rate control (plus there were some other tricks). Even if its output makes RealPlayer hang, even if it’s slow—that does not matter much as I’m not going to use it myself and nobody else is going to use it either (VP6 encoder had some initial burst of interest from some people but none afterwards, and nobody cares about RV4 from the start).

Now the challenge is to find myself an interesting task, because most of the tasks I can think about involve improving some encoder or decoder or (shudder) writing a MOV/MP4 muxer. Oh well, I hope I’ll come up with something regardless.

rv4enc: B-frame experiments

Saturday, May 6th, 2023

As I mentioned in the previous post, one of the problems is to find a good motion vector pair for an interpolated macroblock in a B-frame. Since I had nothing better to do I’ve decided to try motion vector search in the same style as the usual motion estimation: start from the candidate motion vector pair and try adjusting both vectors using a diamond pattern (since it’s the simplest one).
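A sketch of that kind of refinement. The cost function is supplied by the caller (in an encoder it would build the interpolated prediction and measure the distortion), and moving one vector at a time by one step is an assumption of this example rather than a description of the real code.

```python
DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]


def refine_mv_pair(fwd, bwd, cost, max_steps=16):
    """Refine a (forward, backward) motion vector pair with a diamond pattern.

    fwd and bwd are (x, y) tuples; cost(fwd, bwd) returns the distortion
    of the interpolated prediction. max_steps is a hypothetical limit.
    """
    best = (fwd, bwd)
    best_cost = cost(fwd, bwd)
    for _ in range(max_steps):
        cur_fwd, cur_bwd = best
        # Candidates: move either the forward or the backward vector.
        candidates = [((cur_fwd[0] + dx, cur_fwd[1] + dy), cur_bwd)
                      for dx, dy in DIAMOND]
        candidates += [(cur_fwd, (cur_bwd[0] + dx, cur_bwd[1] + dy))
                       for dx, dy in DIAMOND]
        improved = False
        for cand in candidates:
            cand_cost = cost(*cand)
            if cand_cost < best_cost:
                best, best_cost, improved = cand, cand_cost, True
        if not improved:
            break  # local minimum reached
    return best, best_cost
```

Every step evaluates eight interpolated predictions, which gives an idea of why the search is expensive.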

The results are not exciting: while it slightly improves PSNR and reduces file size (on lower quantisers), encoding time explodes. My 17-second 320×240 test clip encoded with quant=16 and two B-frames between I/P-frames takes 40 seconds without that option and 136 seconds with it (3.4 times longer). Meanwhile average PSNR improves merely from 38.0446 to 38.0521 and the size decreases from 1511843 bytes to 1507224 (about 0.3%).

That’s the law of diminishing returns in action. Of course it can be made significantly faster by e.g. using pre-interpolated set of reference frames but why bother in this case? I’ve put this under an option (i.e. be satisfied with the initial guess or try to search for a better pair of motion vectors) but I doubt anybody will ever use it (the same applies to the whole encoder as well).

rv4enc: somewhat working

Wednesday, May 3rd, 2023

I’ve finally managed to implement more or less working RealVideo 4 encoder with all the main features (yeah, I’m also surprised that I’ve got to this stage this fast). As usual, it’s small details that will take a lot of time to make them decent let alone good.

So, what can my encoder actually do now? It can encode video with I/P/B-frames using provided order, it can encode all possible macroblock types and has some kind of rate control.

What does it not have then? First of all, I don’t know yet how it would fare with the original RealPlayer (I also need to modify the RMVB muxer to output improper B-frame timestamps and maybe write the additional streaming-related information). Then there’s the question of having proper rate control. And finally there are a lot of questions related to B-frames.

Currently my rate control is implemented as a system that keeps statistics on how large an encoded frame is on average for a given frame type and quantiser and tries to find the best fitting quantiser. If there are still no statistics (or not enough of them) I resort to simpler quantiser guessing, adjusting the quantiser depending on how different the projected and actual frame sizes are. Of course it can be tuned to behave better (the question is how though). And I’m not going to touch two-pass encoding (theoretically it’s rather simple: you log various encoder information in the first pass and use it to select quantisers better in the second pass; in practice it means messing with text data and doing additional guesstimates, so pass).
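A sketch of such a statistics-driven scheme with the simpler fallback. The class and function names, the minimum sample count and the ten percent dead zone are hypothetical.

```python
from collections import defaultdict


class SizeStats:
    """Average coded frame size per (frame type, quantiser) pair."""

    def __init__(self, min_samples=3):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)
        self.min_samples = min_samples  # hypothetical "enough data" limit

    def add(self, ftype, quant, size):
        self.sums[(ftype, quant)] += size
        self.counts[(ftype, quant)] += 1

    def average(self, ftype, quant):
        count = self.counts.get((ftype, quant), 0)
        if count < self.min_samples:
            return None  # not enough statistics yet
        return self.sums[(ftype, quant)] / count

    def best_quant(self, ftype, target_size, quants=range(32)):
        """Quantiser whose average size is closest to the target, or None."""
        best = None
        for quant in quants:
            avg = self.average(ftype, quant)
            if avg is None:
                continue
            err = abs(avg - target_size)
            if best is None or err < best[0]:
                best = (err, quant)
        return None if best is None else best[1]


def guess_quant(last_quant, projected, actual, max_q=31):
    """Fallback: nudge the quantiser when the last frame missed its size."""
    if actual > projected * 1.1:
        return min(max_q, last_quant + 1)
    if actual < projected * 0.9:
        return max(0, last_quant - 1)
    return last_quant
```

When `best_quant()` returns `None` the caller would fall back to `guess_quant()`.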

With B-frames there are two main issues to deal with: which frames to select and how to perform motion estimation. I read that the first can be achieved by performing motion compensation against neighbouring frames and calculating SATD (often done on scaled-down frames to be faster). The second question is how to search for bidirectional block vectors. Currently I have a very simple approach: I search for forward and backward motion vectors independently and check which combination of them works best. I suspect there may be an approach specifically for weighted bi-directional search but I could not find anything (and I’m not desperate enough to dive into the codebase of MPEG-4 ASP/AVC encoders).
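One reading of that simple approach, sketched below: the two searches have already produced a forward and a backward prediction, and the encoder only compares forward-only, backward-only and averaged prediction. The plain rounded average and the SAD metric are simplifications chosen for this example; a real codec may weight the two references differently.

```python
def sad(a, b):
    """Sum of absolute differences between two pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_b_prediction(cur, fwd_pred, bwd_pred):
    """Pick the best prediction mode for one block of a B-frame.

    cur is the source block; fwd_pred and bwd_pred are the blocks fetched
    with the independently found forward and backward motion vectors.
    All three are flat lists of pixels of the same length.
    """
    bidir = [(f + b + 1) >> 1 for f, b in zip(fwd_pred, bwd_pred)]
    options = {
        "forward": sad(cur, fwd_pred),
        "backward": sad(cur, bwd_pred),
        "bidirectional": sad(cur, bidir),
    }
    mode = min(options, key=options.get)
    return mode, options[mode]
```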

And finally there’s the whole question of quality. I suspect that my encoder is far from being good because it should not merely transform-quantise-code blocks but also perform some masking (i.e. set some higher-frequency coefficients to zero instead of hoping that they’ll be quantised to zero).
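To illustrate what such masking might look like: the sketch below zeroes small coefficients in the high-frequency corner of a 4×4 transformed block before quantisation. The cutoff and threshold values are made up for the example; a real encoder would tune them (probably per quantiser).

```python
def mask_high_frequencies(coeffs, cutoff=4, threshold=8):
    """Zero small high-frequency coefficients of a 4x4 block.

    coeffs is a list of four rows; a coefficient at column x and row y
    counts as high-frequency when x + y >= cutoff. Both cutoff and
    threshold are hypothetical tuning values.
    """
    return [
        [0 if (x + y >= cutoff and abs(c) < threshold) else c
         for x, c in enumerate(row)]
        for y, row in enumerate(coeffs)
    ]
```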

So this will be long and boring work…