RealAudio Cook aka RealOpus

October 14th, 2018

Let’s start with a bit of history since knowing how things developed often helps to understand how they ended up like they are.

There is an organisation previously called CCITT (phone line modem owners might remember it) later renamed to ITU-T. It is known for standardisation and accepting various standards under the same confusing name e.g. PCM and A-law/mu-law quantisation are G.711 recommendation from 1972 while G.711.0 is lossless audio compression scheme from 2009 and G.711.1 is a weird extension from 2008 that splits audio into two bands, compresses low band with A- or mu-law and uses MDCT and vector quantisation on top band.

And there is also a “family” of G.722 speech codecs: basic G.722 that employs splitting audio into subbands and applying ADPCM on them; G.722.1 is a completely different parametric bit allocation, VQ and MDCT codec we discuss later; G.722.2 is a traditional speech codec better known as AMR-WB.

So, what’s the deal with G.722.1? It comes from PictureTel family of Siren codecs (which later served as a base for G.719 too). Also as I mentioned before this codec employs MDCT, vector quantisation and parametric bit allocation. So you decode envelope defined by quantisers, allocate bits to bands depending on those (no, it’s not 1:1 mapping), unpack bands that are coded using vector quantisation dependent on amount of bits and perform MDCT on them. You might not be familiar with it but this is exactly how a certain RealAudio codec works. And I don’t think you can guess its name even if I mention that it was written by Ken Cooke. But you cannot say nothing was changed: RealAudio codec works with different frame sizes (from 32 to 1024 IIRC), it has different codebooks, it has joint stereo mode and finally it has multichannel coding mode based on pairs. In other words, it has evolved from niche speech codec to general purpose audio codec rivalling AAC and it was indeed a codec of choice for RealMedia before they finally switched to AAC and HE-AAC years later (which was the first time for them using open standard verbatim instead of licensing a proprietary technology or adding their own touches on standards drafts as before—even DNET had special low-bitrate mode).

Now let’s jump to 2012 and VideoLAN Dev Days ’12. I gave a talk there about reverse engineering codecs (of course) and it was a complete failure so that was my first and last public talk but that’s not important. And before me Timothy Terriberry gave an overview of Opus. So I listen how it combines speech and general audio codec (like USAC which you might still not know under its commercial name xHE-AAC)—boring, how speech codec works (it’s Skype SILK codec they dumped to open source at some point and like with Duck TrueMotion VP3 before, Xiph has picked it up and adopted for own purposes)—looks like typical speech codec that I can barely understand how it functions, and then CELT part comes up. CELT is general audio codec developed by Xiph that is essentially what your Opus files will end as (SILK is used only at extremely low bitrates in files produced by the reference encoder—or so I heard from the person implementing a decoder for it). And a couple of months before VDD12 I actually bothered to enter technical details about Cook into MultimediaWiki (here’s edit history if you want to check that)—I should probably RE some codec and write more pages there for the old times’ sake. So Cook design details were still fresh in my mind when I heard about CELT details…

So CELT codes just single channels or stereo pairs—nothing unusual so far, many codecs do that. It also uses MDCT—even more codecs do that. It codes envelope, uses parametric bit allocation and vector quantisation—wait a bit, I definitely heard about this somewhere before (yes, it sounds suspiciously like ITU G.719). Actually I pointed out that to Xiph guys (there was Monty present as well) immediately but it was dismissed as being nothing similar at all (“we transmit band energies instead of relying on quantisers”—right, and quantisers in audio are rarely chosen depending on energy).

Let’s compare the coding stages of two codecs to see how they fail to match up:

  1. CELT transmits band energy—Cook transmits quantisers (that are still highly correlated with band energy) and variable amount of gains to shape output frame in time domain;
  2. CELT transmits innovation (essentially coefficients for MDCT minus some predicted stuff)—Cook transmits MDCT coefficients;
  3. CELT uses transmitted band energy and bits available for innovation after the rest of frame is coded to determine number of bits for each band and mode in which coefficients are coded (aka parametric bit allocation)—Cook uses transmitted quantisers and bits available after the rest of frame is coded to determine number of bits for each band and mode in which coefficients are coded;
  4. CELT uses Perceptual Vector Quantization (based on Pyramid Vector Quantizer—boy, that won’t cause any confusion at all)—Cook uses fixed vector quantisation based on amount of bits allocated to band and static codebook;
  5. CELT estimates pitch gains and pitch period—that is a speech codec stuff that Cook does not have;
  6. CELT uses MDCT to restore the data—Cook does the same.
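To give a feeling for why a pyramid quantiser costs more than a fixed table lookup, here’s a small sketch (mine, not taken from either codec) that counts the vectors in a pyramid codebook, i.e. integer vectors of dimension l whose absolute values sum to exactly k. It uses the usual recurrence N(l, k) = N(l-1, k) + N(l-1, k-1) + N(l, k-1); the function name and the u64 result are assumptions made purely for illustration.

```rust
/// Number of integer vectors of dimension `l` whose absolute values
/// sum to exactly `k` (the size of a pyramid codebook).
/// Hypothetical helper: the name and types are not from CELT or Cook.
fn pvq_codebook_size(l: usize, k: usize) -> u64 {
    // table[i][j] holds N(i, j)
    let mut table = vec![vec![0u64; k + 1]; l + 1];
    for row in table.iter_mut() {
        row[0] = 1; // with zero pulses only the all-zero vector exists
    }
    for i in 1..=l {
        for j in 1..=k {
            // N(i, j) = N(i-1, j) + N(i-1, j-1) + N(i, j-1)
            table[i][j] = table[i - 1][j] + table[i - 1][j - 1] + table[i][j - 1];
        }
    }
    table[l][k]
}
```

The count grows very quickly with both parameters (and this sketch will overflow u64 for large inputs), so as far as I know such a codebook is not stored as a table and indices are converted to vectors arithmetically instead. That is extra computation compared to indexing a static codebook.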

Some of you might say: “Hah! Even if it matches at some stages actual coefficient coding is completely different!! And you forgot that CELT uses range coder too.” Well, I didn’t say those two formats were exactly the same, just that their design is very similar. To quote the immortal words from Bell, Cleary and Witten paper on text compression, the progress in data compression is mostly defined by larger amounts of RAM available (and CPU cycles available). So back in the day hardly any audio codec could afford range coder (invented in 1979) except for some slow lossless audio coders. Similarly PVQ was proposed by Thomas Fischer in 1986 but wasn’t employed because it was significantly costlier than some fixed codebook vector quantisation. So while CELT is undeniably more advanced than Cook, the main gains are from using methods that do the same thing more effectively (at expense of RAM and/or CPU) instead of coming up with significantly different scheme. An obligatory car analogy: it is like claiming that a modern internal combustion engine car is a completely new invention compared to Ford Model T or FIAT 124 because it has more bells and whistles (I mean electronics) even while the principal scheme remains the same—while a radically new car would be an electric one with no transmission or gearbox and engines in each wheel (let’s forget such scheme is very old too—electric cars of such design roamed the Moon in the 1970s).

So overall, Opus is almost synonymous with CELT and CELT has a lot in common with Cook in design (but greatly improved) so this allows Cook to be called RealOpus or Opus of its era.

BTW when implementing the decoder for this format in Rust I’ve encountered a problem: the table for 6-bit stereo coupling was never tested because its definition is wrong (some code definitions repeating with the same bit lengths) and looks like the first half of it got corrupted. Just compare for yourselves.

libavcodec version (lengths array added for the reference):

static const uint16_t ccpl_huffcodes6[63] = {

static const uint8_t ccpl_huffbits6[63] = {

NihAV corrected version (extracted from the reference of course):

const COOK_CPL_6BITS_CODES: &[u16; 63] = &[
    0xFFFE, 0x7FFE, 0x3FFC, 0x1FFC, 0x0FFC, 0x07F6, 0x07F7, 0x07F8,
    0x07F9, 0x03F2, 0x03F3, 0x03F4, 0x03F5, 0x01F0, 0x01F1, 0x01F2,
    0x01F3, 0x01F4, 0x00F0, 0x00F1, 0x00F2, 0x00F3, 0x0070, 0x0071,
    0x0072, 0x0073, 0x0034, 0x0035, 0x0016, 0x0017, 0x0004, 0x0000,
    0x000A, 0x0018, 0x0019, 0x0036, 0x0037, 0x0074, 0x0075, 0x0076,
    0x0077, 0x00F4, 0x00F5, 0x00F6, 0x00F7, 0x01F5, 0x01F6, 0x01F7,
    0x01F8, 0x03F6, 0x03F7, 0x03F8, 0x03F9, 0x03FA, 0x07FA, 0x07FB,
    0x07FC, 0x07FD, 0x0FFD, 0x1FFD, 0x3FFD, 0x3FFE, 0xFFFF
];
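A table with repeated codes of the same length can be caught mechanically. Below is a minimal sketch of such a check, assuming codes are stored in the low bits of each entry and every length is between 1 and 16; the function name is made up and the tables in the test are toy ones, not the real Cook tables (the bit lengths are not reproduced in this post).

```rust
/// Returns true if no code equals or is a prefix of another code.
/// `codes[i]` holds the code in its low `bits[i]` bits.
/// Assumes every length is in 1..=16 (hypothetical helper, not NihAV API).
fn is_prefix_free(codes: &[u16], bits: &[u8]) -> bool {
    assert_eq!(codes.len(), bits.len());
    for i in 0..codes.len() {
        for j in 0..codes.len() {
            // Only compare code i against codes that are at least as long.
            if i == j || bits[i] > bits[j] {
                continue;
            }
            // Take the leading bits[i] bits of code j and compare with code i.
            // Equal lengths give shift 0, so plain duplicates are caught too.
            let shift = bits[j] - bits[i];
            if (codes[j] >> shift) == codes[i] {
                return false;
            }
        }
    }
    true
}
```

The check is quadratic, which hardly matters for a 63-entry table, and it is cheap enough to run once in a unit test for every static codebook.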

NihAV, RealMedia, Rust and Everything Else

October 13th, 2018

Looks like it’s been about two months since I last wrote anything about NihAV but that does not mean I did not have anything to write about. On the contrary, I’m glad to report about significant progress in RealAudio support.

Previously I’ve reported about RealVideo 3 and 4 support (as for RealVideo 1/2 and ClearVideo before), so video part was covered quite well but audio part was missing and I went on to rectify the situation.

Now NihAV supports RealAudio 1.0 (speech codec), RealAudio 2.0 (speech codec), RealAudio DNET (a bit about it later), RealAudio 4.0 (speech codec from Sipro), RealAudio Cook (this one deserves a separate post so the next one should be about this codec) and RealAudio Lossless. So there are only three codecs missing now: RealAudio 8 (ATRAC3), RealAudio 9/10 (AAC) and RealVideo 6(HD). Of course I’m going to add support for those as well.

This is actually a good time to implement those. As you might know, there is a Holy Trinity of Licensors: D.vX, D*lby and DT$. They are famous for ‘nice’ licensing terms. While I’ve never had to deal with them, I’ve heard from people who did that they like licensing a single product they’re most famous for at outrageous prices (i.e. it’ll cost you an order of magnitude more per unit using their technology than e.g. H.264 decoder) and it’s a viral license too because if you sell stuff not oriented for consumers then you have to force your customers into the same deal (it’s GPL—Greedy Private License) and you have to report your sales to them for obvious reasons. Funny how two of the companies were bought out already. Now let’s look at them in some detail:

  • D.vX This one is remarkable since it licensed the product it had nothing to do with (aka M$MPEG-4 adapted for non-ASF containers and MPEG-4 ASP). At least it seems hardly relevant now unless I dig out some old movies.
  • D*lby This one is mostly known (outside cinema equipment) for codec with several names: ATSC A/52, RealAudio DNET, ETSI TS 102 366, D*lby Digital and even something you can make out of letters A C and 3 (I heard rumours that it does not like its trademarks mentioned so I’d better avoid directly naming it). At least the last patents for that format have expired and support for it can be implemented freely. And it also owns a company that manages licensing of AAC. Fun fact is that patents for MPEG2 NBC are expired so I can implement AAC-LC decoder just fine but that does not stop them from licensing it. How do they do it? By refusing to license the separate parts and forcing a whole package of AAC-LC, HE-AACv1, HE-AACv2 and xHE-AAC onto you. I guess if the situation won’t change in twenty years all current stuff will expire but they’ll still license it along with Ultra-Enhanced-Hyper-Expanded-Radically-Extended High-Efficiency AAC (which will have nothing to do with all those previous formats).
  • DT$ A company similar to D*lby and its (former?) prime competition. Also known for single format with many extensions making it essentially a homebrew AAC. At least it seems to be exclusively DVD/Blu-ray format and I’m satisfied with Xine for playing the former and avoiding the latter completely.

And I want to talk a bit more about my RealAudio DNET decoder. Internally it’s called ts102366 for obvious reasons and I have just a primitive implementation for it (i.e. it seems to work and should handle multichannel fine but no extended features). The extension for more than 5.1 channels also seems to be HD-DVD/Blu-ray only so I don’t care, it’s quite rare in RealMedia format and other containers seem to contain it as contiguous stream so I’d need to introduce support for NAElementaryStream in demuxing code and also proper parser to split it into frames. Not worth the effort for me at this moment. Another fun fact is that bitstream comes in 16-bit words that can have any endianness. In my case I just had to detect the proper endianness from first two bytes and simply initialise bitstream reader in BE or LE16 mode depending on it (again, it’s funnier with DT$ format where you have three different bitstream reading modes and you might need two modes simultaneously in some cases; again, good thing I don’t have to care about that stuff). Also it’s still one of two codecs I currently have that support multichannel audio (Cook is the second of course and AAC will be third).
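The endianness detection can be sketched like this. It relies on the TS 102 366 sync word being 0x0B77, so a stream starting with bytes 0x0B 0x77 is big-endian and one starting with 0x77 0x0B has its 16-bit words swapped. The enum and function names are invented for the example, and the byte-swapping helper is just an alternative to what I actually do (switching the bitstream reader mode).

```rust
/// Byte order of the 16-bit words in the stream (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum WordOrder {
    BigEndian,
    LittleEndian16,
}

/// Guess the word order from the first two bytes of a frame.
fn detect_word_order(buf: &[u8]) -> Option<WordOrder> {
    match buf {
        [0x0B, 0x77, ..] => Some(WordOrder::BigEndian),
        [0x77, 0x0B, ..] => Some(WordOrder::LittleEndian16),
        _ => None, // not at a frame start or not this format at all
    }
}

/// Swap bytes in every 16-bit word; a trailing odd byte is copied as is.
fn swap_words(buf: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(buf.len());
    for pair in buf.chunks(2) {
        if pair.len() == 2 {
            out.push(pair[1]);
            out.push(pair[0]);
        } else {
            out.push(pair[0]);
        }
    }
    out
}
```

Selecting a reader mode avoids copying the frame, which is why I prefer it, but swapping into a temporary buffer is simpler if the bitstream reader only knows one byte order.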

And finally some words about Rust issues I had to deal with.

Rust as a language is more or less fine but compiler sucks. I’ve run into several issues while writing code.

First, I had a fixed array of Codebooks to initialise in RALF decoder (one of 15 codebooks, another one of 125 codebooks and yet another one of 10×11 codebooks). If I use simply mem::uninitialized() with filling it up it works fine. In debug mode. In release mode it segfaults at the end. Probably I should’ve used ptr::write() instead of assigning and it would work fine but I gave up and used a vector instead of an array even if it’s not as efficient. Obviously it’s all my fault and not Rust issue but still that was weird.
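For reference, here is how a fixed array of non-Copy objects can be filled without touching uninitialised memory at all. This is only a sketch: it uses std::array::from_fn, which appeared in the standard library well after this code was written, and Codebook and make_codebook are stand-ins, not the real RALF types.

```rust
/// Stand-in for a real codebook type (assumption for the example).
struct Codebook {
    entries: Vec<u16>,
}

/// Stand-in constructor; a real one would build the codebook from tables.
fn make_codebook(idx: usize) -> Codebook {
    Codebook { entries: vec![idx as u16; 4] }
}

/// One-dimensional case: every element is built by the closure in order.
fn build_codebooks() -> [Codebook; 15] {
    std::array::from_fn(|i| make_codebook(i))
}

/// The 10x11 case is the same thing nested once.
fn build_codebook_grid() -> [[Codebook; 11]; 10] {
    std::array::from_fn(|i| std::array::from_fn(|j| make_codebook(i * 11 + j)))
}
```

No element is ever dropped before it is written, which is the trap with assigning into mem::uninitialized() storage, and the result is still a plain array rather than a vector.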

Second, when I tried to create a generic codebook reader that would accept table of codes of any primitive type (u8, u16 or u32) I ran into funnier issue of Rust compiler spewing weird errors like “cannot convert u16 to u32 because it’s not a primitive type”. Obviously it’s my mistake and it’s caught by a tool (that is still not in stable) so the developers don’t care (yes, Luca even bothered to file an issue on that). Still, I’d rather have a clearer error message in that case (e.g. “… because it’s X and not a primitive type”).
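I suspect the error came from using an as cast on a generic type, which is only allowed between primitive types known at that point. A minimal sketch of one way to accept u8, u16 or u32 tables is to require lossless conversion via Into; the function name is hypothetical and this is not how the NihAV codebook reader is actually declared.

```rust
/// Widen a table of codes of any type that converts losslessly to u32.
/// Works for u8, u16 and u32; the name is made up for this example.
fn widen_codes<T: Copy + Into<u32>>(table: &[T]) -> Vec<u32> {
    // `c as u32` would be rejected here because T is generic,
    // while `.into()` is resolved through the trait bound.
    table.iter().map(|&c| c.into()).collect()
}
```

The trait bound also documents the intent: a u64 table would be rejected at compile time instead of being silently truncated.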

And finally, an example that is definitely rustc stupidity and not mine. Again, developers don’t consider this to be an issue but I do (and Luca seemed to agree with me since he opened an issue about it). Essentially, there is a thing called DCE (dead code elimination), so when compilers see that certain block won’t be executed they might print a warning and just check inside code for syntactic validity. Current rustc might ignore condition value and optimise code inside even if it clearly makes no sense (to the point where it crashed because of that on some nightly version, see the issue for details). And while you argue that one should not write such code, I had quite plausible use case for it: a macro that took 2- or 3-element array and did something to its values so if third value was present it had to do something special with it. But of course compilation failed because you tried to do if ARR.len() > 2 { a = ARR[2]; } with two-element array. But when I tried to check whether I got indexing correct by using large constants as indices, cargo check passed just fine—probably because const propagation did not go that deep inside my code (it was in a function called from a long chain in some sub-sub-sub-module and standalone example errors out fine). This feels quite unpolished to me.
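One way to sidestep the problem (a sketch, not what my macro does) is to avoid constant indexing and ask for the optional element through the slice method get(), which returns an Option. The helper name and the default value parameter are assumptions for the example.

```rust
/// Return the third element if the array has one, otherwise `default`.
/// Accepts both two- and three-element arrays through slice coercion.
fn third_or(arr: &[u8], default: u8) -> u8 {
    arr.get(2).copied().unwrap_or(default)
}

const TWO: [u8; 2] = [1, 2];
const THREE: [u8; 3] = [1, 2, 3];
```

Since get() is an ordinary bounds-checked call there should be no constant out-of-range index for the compiler to complain about, whatever the array length is.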

Oh, and final final fun thing: the calls like would still fail borrow check probably because they can’t (I guess) formalise function calling convention i.e. “if function is called then first its arguments are evaluated and copied if needed in certain order, then function address is evaluated and called with the arguments”. BTW you still have the situation like this:

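struct Foo { foo: u8 }
impl Foo {
    // increment the field and return its new value
    fn bar(&mut self) -> u8 { self.foo += 1; self.foo }
}

fn fee(a: u8, b: u8) {
    println!("{} {}", a, b);
}

fn main() {
    let mut foo = Foo { foo: 42 };
    // both arguments mutate the same object
    fee(foo.bar(), foo.bar());
}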
And if you don’t know what’s wrong here I’ll tell you: in C the order of argument evaluation is unspecified because back in the day there were very different calling conventions and thus a compiler might need to start evaluating from the last argument to the first in order to store them in order instead of the widespread pushing of arguments onto the stack in order. So depending on ABI the function would be called either as fee(43, 44) or as fee(44, 43).

Now I see two ways out of it: either detect such situation where the same object is mutably called several times and give an error or, which is better IMO, make formal calling convention so the code won’t be undefined. And fix borrow checker while doing that.

Overall, Rust is a nice experience so far since it allows structuring code much better but sometimes you hit such silly issues that spoil all the fun.

Anyway, next post should be about RealAudio Cook, the Opus of its era.

Some Notes on Saarland Railways

October 3rd, 2018

Since today is the state holiday (some time ago two Germanies united into one—which looks more and more like DDR for some reason) why not look at Zoidberg of German lands—Saarland? Well, you might have many reasons (first, it being Saarland) but today I’ve completed my voyage on all of their accessible railways and hence this post.

First, a bit of history. As you all remember, after World War II Germany was split into four occupation zones and while I haven’t heard anything in particular about British occupation zone, the rest of occupation forces were not behaving nicely at all: USA installed their military bases everywhere (and most of them are still there—at least it meant less military expenses for West Germany back in the day), USSR tried to convert its piece of Germany into a copy of itself (partly successfully, hopefully it will recover) and France was not satisfied with mere occupation and also tried to seize a part of Germany as its own but it bit off more than it could chew and so back in 1957 Saarland was reunited with the rest of Germany (and that day is the state holiday too but I doubt many think of 1st of January as of Saarland reunification day).

Saarland still honours France

Second, a bit of railway network overview. Essentially you can think about it as a cross: there’s a main East-West line going from Mannheim to Trier (or Alt-Chemnitz) via Homburg and Saarbrücken, there’s a North-South line going from Bad Kreuznach to Saarbrücken, there’s a line going South from Saarbrücken to France (and another one served by tram but more about it later), there is a branch Dillingen—Niedaltdorf, there’s a line from Rohrbach to Pirmasens, there’s line Trier—Perl—Metz that goes partly through Saarland and there are several parallel lines connecting Homburg and Saarbrücken. Let’s count: Homburg—Rohrbach—Saarbrücken (that’s what trains from Mannheim to Saarbrücken use), Homburg—Neunkirchen—Saarbrücken (part of it is Nahetalbahn to Bad Kreuznach), Homburg—Neunkirchen—Merchweiler—Saarbrücken (serviced by regional trains Homburg—Illingen and Saarbrücken—Lebach) and finally there’s Homburg—Neunkirchen—Lebach—Saarbrücken via tram line that goes all the way from Lebach to Saarbrücken to Saargemünd (or Sarreguemines as some people write it).

Yes, there’s a tram line in Saarland that essentially crosses half of it. And it’s impossible to confuse it since there’s only one tram line and one tram route in Saarland.

Also I’ve found mentions of three museum lines but looks like only one is functioning: Ottweiler—Schwarzerden line (or Ostertalbahn for short). And I’ve tried it as well. Unlike many other museum lines, this one uses diesel locomotives from the 1960s (but hopefully they’ll manage to rebuild the steam locomotive from the parts they have one day). It was what can be experienced in Ukrainian regional trains—going at about 30km/h while sitting on wooden benches and enjoying looking at the nature outside. At least they boast that they work in any weather (while other museum lines close in Autumn they keep running trains in winter too).

There are many weird things there I’d like to talk about but I’ll leave them to the time when I finish travelling on all railways of Rheinland-Pfalz (should be done next year unless they decide not to open Zellertalbahn again) but here are some of them for now.

First, the train service Saarbrücken—Lebach-Jabach. Fischbachtalbahn (Saarbrücken—Wemmetsweiler) and Primstalbahn from Wemmetsweiler to Illingen are electrified (in Illingen only track 41 is electrified, track 51 is not). Then only a bit of track at Lebach is electrified but about fourteen kilometres in-between are not. We had similar situation here with Bruhrainbahn (between Graben-Neudorf and Germersheim) being not electrified so train Karlsruhe—Mainz ran mostly on electrified rails but still had to be a diesel one. At least this was fixed in 2011 by electrifying the missing piece.

Second, it’s the only tram line in Germany I know that has exit directions repeated in French too.

And third, to make Saarland feel even more like Switzerland, they have the same cryptic booking system: when I bought a ticket from Saargemünd to Lebach it offered me to choose one of three or four possible alternatives—just like buying a rail ticket in Switzerland! Come to think of it, Swiss rail system is exactly like German regional system:

  • Choosing route is the same;
  • German general rail tickets have a whole day of validity (or more for longer distances). German regional tickets are valid for just a couple of hours after purchase—and same in Switzerland (unless it’s some snowy route that might be closed for days);
  • When I bought a ticket from Schaffhausen to Zürich (two different kantons) the ticket also listed zones—like some German regional tickets do;
  • Like with German regional trains, the type does not really matter. It may be S-Bahn, RegioBahn, RegioExpress or InterRegioExpress—the ticket is valid regardless. Same in Switzerland: the same ticket valid for any kind of train and trains change classes during the travel (i.e. train Basel—Chur was labelled as InterCity up to Zürich and InterRegio after that, the difference is only how many intermediate stops it makes);
  • And finally, the famous Swiss train punctuality. Well, it’s a known effect that regional trains have much better punctuality than long-distance ones (and all trains in Switzerland are essentially slow regional trains).

So despite all local jokes about Saarland being very backward place (some even call it “rear end of Germany”) it’s quite European place in some aspects. And remember that it has a real Schengen border (i.e. it borders with Luxembourg where town of Schengen known for some treaty is located).

Dingo Pictures: The Missing Masterpiece

September 29th, 2018

Originally I wanted to write about NihAV progress but some kind soul has uploaded the final missing piece of Dingo Pictures art collections so I have no other choice but to talk about it.

So, Arischa the Little Witch (…on the visit to the Magic Forest).

A Bit on Swedish Railway Network

September 22nd, 2018

I wanted to write this post for several months since in July I finally had a chance to travel on some of the important Swedish railways.

Well, as anybody knows, I love Sweden and railways. And Swedish railways too. And obviously I’d like to ride them all and recently I’ve moved much closer to that goal.

There are these important railways in Sweden (sorry if I forgot some but this list should cover the most important ones):

  • Ostkustbanan (Stockholm—Uppsala—Gävle—Sundsvall)
  • Ådalsbanan+Botniabanan (Sundsvall—Kramfors—Umeå)
  • Norra stambanan (Gävle—Ånge)
  • Stambanan genom övre Norrland (Ånge—Bräcke—Vännäs—Boden)
  • Malmbanan (Luleå—Boden—Kiruna—Narvik)
  • Mittbanan (Sundsvall—Ånge—Östersund—Storlien—Hell—Trondheim)
  • Inlandsbanan (Gällivare—Östersund—Orsa—Mora)
  • Dalabanan+Siljansbanan (Uppsala—Borlänge, Borlänge—Mora)
  • Bergslagsbanan (Gävle—Borlänge—Frövi)
  • Västra stambanan (Stockholm—Göteborg)
  • Södra stambanan (Stockholm—Malmö)
  • Mälarbanan (Stockholm—Västerås—Örebro)
  • Svealandsbanan (Stockholm—Eskilstuna—Arboga)
  • Värmlandsbanan (Laxå—Charlottenberg, further to Oslo)
  • Kust till kust-banan (Göteborg—Alvesta—Kalmar)
  • Västkustbanan (Lund—Göteborg)
  • Jönköpingsbanan (Nässjö—Falköping)

And I want to talk about those railways and my experience there.

NihAV: Some Progress to Report!

August 24th, 2018

Finally the large chunk is finished: NihAV has finally got support for RealVideo 3 and 4!

Since I’ve learned a great deal more about codecs since the last time I wrote RealVideo 3/4 decoder (and specifications for both were leaked—they have mistakes but still clarify some things), I was able to write a new decoder that also seems to reconstruct frames better.

Some words on the design: I’ve split it into several parts as usual—common RV3/4 code, RV3/4 DSP, RV3 bitstream parser, RV3 DSP and RV4 bitstream parser and DSP. That’s the approach I’ve been using before and I’ll probably use it in future decoders as well. The only more or less interesting thing is how I did weighted motion compensation: instead of temporary buffer I allocate 16×16 frame that I use for storing temporary results and which is used later to average results (since motion compensation routines in RealVideo 3 and 4 differ while weighted averaging is the same it makes sense to split it into separate operation).

And now for the juicy part: benchmarks and performance. I’ve tested one of the RealVideo 4 trailers (namely swordfish.rmvb) and avconv -threads 1 -cpuflags 0 decodes it in 15 seconds, nihav-tool needs almost 25.

NihAV: Progress Report

July 2nd, 2018

I’m still working (barely) on NihAV and I’ve managed to make my code decode both RealVideo 3 and 4. It’s not always correct, especially B-frames and some corner cases, but at least it produces a sane picture in most cases.

And this time I’d like to write about disadvantages of writing motion compensation functions in Rust instead of C.

#chemicalexperiments — Cream

June 18th, 2018

So it has come to this. Let’s talk about the stuff one usually finds in sweets: various kinds of cream (and my experience with it).

I can divide the cream I’ve encountered or made so far into three categories:

  1. Swedish cream;
  2. Lazy cream;
  3. Custards.

Swedish cream is very easy to make: whip cream, optionally sprinkle cinnamon on top. It’s found in virtually every Swedish cake and serves as a base for some other cream variants. In Germany it’s common to use Sahnesteif—essentially a mix of starch and dextrose—that makes whipped cream stay thick and not runny longer.

Lazy cream is essentially a mix of some dairy product with powdered sugar and maybe something else for flavour (I use lemon juice): it can be butter, mascarpone, quark or something else. You simply mix those two ingredients together and use immediately. I believe the other term for this kind of cream is butter-cream.

And custard is the trickiest one since you have to cook it. It’s essentially a mix of egg yolks and milk with some thickening agent (can be starch or less commonly gelatine). When making it you have to keep in mind that if you simply put yolks into the hot milk they’ll curdle and you’ll end up with a very runny omelette so you have to be extra careful and mix them (first you mix yolks with sugar and starch) by pouring a thin stream of one ingredient into another and mixing (some say you should first add some hot milk to yolks and then pour the mix back to milk, others claim it’s enough to pour yolks into milk). Afterwards you have to let it cool in a sealed container and maybe mix with whipped cream. It can be used in tarts, cakes, smaller pastry or eaten as is (preferably with something else though like berries or biscuits).

There’s a variation of it called Bavarian cream which you make by mixing yolks and milk, adding gelatine and mixing with whipped cream after it’s half-set (and then waiting even more hours until it’s fully set). The result is good as a standalone dessert but I heard it can be used in cakes too.

Overall I find all those cream varieties good but it’s better to eat them with something else and in moderation (or you’ll end up having my shape).

NihAV: progress report

June 10th, 2018

Well, since I had no incentive to work on NihAV and recently the weather is not very encouraging for any kind of intellectual activity there was almost no progress. And yet now I have something to write about: NihAV has finally managed to decode non-trivial (i.e. not fully black) RealVideo 3 I-frame properly (i.e. without any visible distortions). Loop filter is still missing but it’s a start. And it’s not a small feat considering one has to implement both coefficients decoding and intra prediction. So essentially motion vector juggling and motion compensation are all the things that are missing for P- and B-frames support. Maybe it will go faster from here (but most likely not).

And since doing that involved rewriting some C code into Rust here are some notes on how oxidising went:

  • match is a nice replacement for the cases when you have to partly remap values—in my case I had to adjust intra prediction directions in case top or left or bottom reference were missing and that means changing three or four values into other values, match looks more compact than several } else if (itype == FOO) { and does not lose readability either;
  • while in C foo = bar = 42; is a common thing, Rust does not allow this (I can understand why) and I’m surprised I ran into it only now (with intra prediction functions that assign the same calculated value to several output pixels at once);
  • loops in Rust are fine for basic use but when you need to deal with something more complex like for (i = 0; i < block_size; i += 4) or for (i = 99; i > 0; i--) you need either to write a simpler loop and remap indices inside or to remember it’s Rust and permute range in less intuitive ways like for i in (0..block_size).filter(|x| x&3 == 0) and for i in (1..99+1).rev(). While this works and even somewhat conveys the meaning it’s a bit unwieldy IMO;
  • and it might be a bit too esoteric but looks like I cannot easily write fn clip_u8(val: N) -> u8 that would take any primitive numeric type as input, do comparisons inside and return value clipped and converted to u8. The best answer on how to do it I found was “you can’t, it’s against Rust practices”. I don’t need it much and I care even less, so I’ll just mark it as a neutral language feature and forget about it.
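For the record, the last two points can be written more directly in Rust versions newer than the one I used here. The sketch below uses step_by() and an inclusive range for the loops, and a trait bound on Into for a partially generic clip_u8. The helper names are made up, and this clip_u8 covers only integer types that convert losslessly to i64 (so no u64, usize or floats).

```rust
/// Equivalent of `for (i = 0; i < block_size; i += 4)`.
fn stepped_indices(block_size: usize) -> Vec<usize> {
    (0..block_size).step_by(4).collect()
}

/// Equivalent of `for (i = 99; i > 0; i--)`.
fn countdown() -> Vec<i32> {
    (1..=99).rev().collect()
}

/// Clip any integer that fits into i64 to the 0..=255 range.
fn clip_u8<T: Into<i64>>(val: T) -> u8 {
    // After clamping the value is known to fit, so the cast is lossless.
    val.into().clamp(0, 255) as u8
}
```

The functions returning vectors are only there to make the loops testable; in real code the range expressions go straight into the for statement.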

And now the small but constantly irritating thing: arrays. While slices are nice and easy to use (including extracting sub-slices), in my area I often need a slice with arbitrary start and end bounds. To clarify my use case: quite often you need a piece of memory that’s addressable with both positive and negative indices and those make sense on certain interval.

One of such common arrays is clipping array which essentially takes input index and returns it clipped usually to 0-255 range. So you have part [-255..-1] filled with zeroes, [0..255] filled with values in the same range and [256..511] filled with 255. I repeat, such clipping table is very common and useful thing that’s currently not easy to implement in Rust.
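A sketch of what such a table looks like with the offset handled by hand follows; the type and constant names are mine and the range is the -255..=511 one described above, giving 767 entries.

```rust
/// Offset of index 0 inside the table (255 negative entries come first).
const CLIP_OFFSET: i32 = 255;
/// 255 entries below zero + 256 in range + 256 above 255.
const CLIP_SIZE: usize = 767;

/// Lookup table that clips values from -255..=511 to 0..=255.
struct ClipTable {
    table: [u8; CLIP_SIZE],
}

impl ClipTable {
    fn new() -> Self {
        let mut table = [0u8; CLIP_SIZE];
        for (i, entry) in table.iter_mut().enumerate() {
            // Map the storage position back to the logical index.
            let v = i as i32 - CLIP_OFFSET;
            *entry = v.max(0).min(255) as u8;
        }
        Self { table }
    }

    /// Panics if `idx` is outside -255..=511.
    fn clip(&self, idx: i32) -> u8 {
        self.table[(idx + CLIP_OFFSET) as usize]
    }
}
```

It works, but every access has to remember the offset, which is exactly the clunkiness I’m complaining about.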

Another less common case is the block of pixels we process may require information from its top, left and top-left neighbours—and those are addressed as src[-stride + i], src[-1 + stride*i] and src[-stride - 1]. Or a whole frame of GDI-related codec (no, not from Westwood) or even simple BMP/DIB that stores lines upside-down so after you process line 0 you have to move to line -1.

I currently deal with it by keeping an additional variable pointing to the current position in array that I use as a reference and from which I can subtract other numbers if needed, but it’s a bit clunky and error-prone. Since Rust checks indices on slice access I wonder if extending it to work with e.g. negative indices is possible. IIRC FORTRAN and Pascal allowed you to define an array starting with arbitrary index, it might be possible in Rust too.
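A wrapper type can hide the bookkeeping at least partially. Here is a minimal sketch, assuming a Vec-backed buffer and isize indices; OffsetBuf is a hypothetical name and not something from NihAV or rust-av.

```rust
use std::ops::{Index, IndexMut};

/// Buffer addressable with indices in `min_idx..=max_idx`,
/// where `min_idx` may be negative.
struct OffsetBuf<T> {
    data: Vec<T>,
    min_idx: isize,
}

impl<T: Clone + Default> OffsetBuf<T> {
    fn new(min_idx: isize, max_idx: isize) -> Self {
        assert!(max_idx >= min_idx);
        let len = (max_idx - min_idx + 1) as usize;
        Self { data: vec![T::default(); len], min_idx }
    }
}

impl<T> OffsetBuf<T> {
    /// Translate a logical index into a position in `data`.
    fn pos(&self, idx: isize) -> usize {
        let p = idx - self.min_idx;
        assert!(p >= 0, "index below the lower bound");
        p as usize // the upper bound is checked by Vec indexing
    }
}

impl<T> Index<isize> for OffsetBuf<T> {
    type Output = T;
    fn index(&self, idx: isize) -> &T {
        &self.data[self.pos(idx)]
    }
}

impl<T> IndexMut<isize> for OffsetBuf<T> {
    fn index_mut(&mut self, idx: isize) -> &mut T {
        let p = self.pos(idx);
        &mut self.data[p]
    }
}
```

With this, expressions like src[-stride - 1] work as long as stride is an isize, and the bounds are still checked on every access. It does not give sub-slices for free though, so it is not a full replacement for native support.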

Oh well, I’ll just keep using my approach meanwhile and waiting to see what rust-av does in this regard.

Rust: Lifetimes Sugar

May 27th, 2018

One of the Rust language features is explicit object lifetimes that help compiler correctly track memory usage and free objects without using garbage collector. A neat idea but it leads to lifetime specifiers being used everywhere including places where compiler should be smart enough to deal with them without explicit mentions in every place.

Maybe I’m using Rust wrong but in most of the cases I create objects that have no need for lifetime specifier or the objects that have the same lifetime for both their members and themselves. Thus I argue that in addition to generic lifetime specifier 'a (or whatever the name you give it) and obviously named 'static there should be 'self that specifies the lifetime to be exactly the same as the object itself.

So, instead of current:

struct Foo<'a> {
  myref: &'a [u8],
  subobj: Bar<'a>,
}

impl<'a> Foo<'a> {
  pub fn new(myref: &'a [u8], subobj: Bar<'a>) -> Self { ... }
}

it should be possible to write:

struct Foo {
  myref: &'self [u8],
  subobj: Bar,
}

impl Foo {
  pub fn new(myref: &'self [u8], subobj: Bar) -> Self { ... }
}

I am not sure whether compiler needs to perform some additional things in such objects compared to objects with no lifetime specifier but it should be easy to assign proper lifetime after parsing the structure definition and I’m pretty sure the compiler does something like this anyway.
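For comparison, here is a compilable version of the current form, plus the one bit of related sugar that later Rust editions did get: the anonymous lifetime '_ in impl headers. This is not the 'self proposal (the struct definition still needs its lifetime parameter), and Bar with its single field is just a stand-in so the example builds.

```rust
/// Stand-in for the sub-object type (assumption for the example).
struct Bar<'a> {
    data: &'a [u8],
}

struct Foo<'a> {
    myref: &'a [u8],
    subobj: Bar<'a>,
}

impl<'a> Foo<'a> {
    pub fn new(myref: &'a [u8], subobj: Bar<'a>) -> Self {
        Self { myref, subobj }
    }
}

// Methods that never name the lifetime can use the anonymous one
// instead of declaring `impl<'a>`.
impl Foo<'_> {
    pub fn total_len(&self) -> usize {
        self.myref.len() + self.subobj.data.len()
    }
}
```

So the impl blocks got a bit shorter but the declarations stay as verbose as before.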

And I see only these reasons why this has not been done yet:

  • Considerations for compiler simplicity (i.e. parsing process should be kept as simple as possible)—I still think it should be easy for compiler to recognize the lifetime definition by the time structure declaration parsing is over and it’s used externally (i.e. for objects using this one);
  • Considerations for language clarity and consistency (i.e. it’s immediately obvious when you look at the object that it deals with lifetimes but not with the proposed change). I’d argue that explicit lifetimes should be kept for complex cases only, when you have to juggle lifetimes from several complex sources, and the objects with references not outliving themselves should be fine;
  • Simple oversight (i.e. “we did not think of such simplification”) or developers’ bias (i.e. “we got used to writing lifetime specifiers everywhere that we didn’t think it annoys anybody”). You should be able to guess what I have to say about such argument.

So all in all I’d be happy to either hear why it cannot be done (beside the compatibility with the existing code) or see it implemented. But most likely this will be ignored (and I’m fine with that too).