What optimisation possibilities I miss in Rust

Since a certain friend of mine keeps asking what features I need in Rust and then forgets the answer, I decided to write it all down here. Hopefully it will become outdated sooner rather than later.

I’d like to start with some explanations and conditions. I develop a certain multimedia project, so I have certain common flows (e.g. processing 16×16 macroblocks in a frame) and I’d like to be able to optimise for them. Also, I do not like to use the nightly/unstable version of Rust, as unstable features may take an extremely long time to hit stable and they change in the process (as happened with asm!{} support, to give one example). And finally, I do not accept the answer “there’s a crate X for that”: out of design considerations I prefer to avoid external dependencies. Short explanation: those get out of control fast. My encoder and player projects depend only on my own crates for everything, but the player additionally pulls in the sdl2 dependency and suddenly it’s 33 crates instead of 19; IIRC with a newer version of the sdl2 crate the total number gets to fifty.

Anyway, here are the features I miss, with some explanations of why they are probably relevant not just to me.

Better data annotation

As I mentioned in the beginning, one of my common flows is processing macroblocks in a frame. That means dividing a frame plane into strips of 16 lines and doing something with a block at a certain offset inside that strip, using a step larger than the block width. So I’d like to be able to tell the compiler that it needs just one check before the per-line iterator. Since I access only the first N elements from M lines of size S, a data array of size (M-1)*S+N is enough for all such cases, so the loop can be unrolled and the checks for premature loop end and out-of-bounds access can be dropped.
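To make the (M-1)*S+N condition concrete, here is a minimal sketch of doing that single check by hand. The function name block_sum and its parameters are hypothetical, made up for illustration. Whether rustc actually drops the per-row checks after the up-front slicing is not guaranteed and would have to be verified in the generated code.

```rust
/// Sums an `n`-wide, `m`-high block that starts at `offset` in a plane
/// whose lines are `stride` elements apart (hypothetical helper).
fn block_sum(plane: &[u8], offset: usize, stride: usize, n: usize, m: usize) -> u32 {
    assert!(m > 0 && n <= stride);
    // The single up-front check: (M-1)*S+N elements must exist from `offset`.
    // This slicing panics here, once, if the plane is too short.
    let needed = (m - 1) * stride + n;
    let data = &plane[offset..][..needed];

    let mut sum = 0u32;
    for y in 0..m {
        // Every row lies inside `data` by construction; the remaining
        // checks are redundant, but eliding them is up to the optimiser.
        let row = &data[y * stride..][..n];
        for &px in row {
            sum += u32::from(px);
        }
    }
    sum
}
```

Note that the last block in a strip works even when the plane ends right after its final row, which is the case a plain "M full lines" requirement would reject.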

Okay, I understand that it’s a bit too tricky and .chunks_no_less_than() would be rather monstrous, but what about a common way to tell that a slice is long enough? It’s a related scenario: I have some part of the line and I’m working only with the first eight pixels. So what is the proper way to tell the compiler it’s long enough? I know at least three solutions:

  1. let _ = line[7]; (looks nice and simple, but does the compiler take the hint?);
  2. assert!(line.len() >= 8); (it’s a bit clearer in meaning but the same concern remains);
  3. let arrayref = <&[u8;8]>::try_from(&line[..8]).unwrap(); (the last time I checked, the conversion fails if the slice is longer than required, so the slicing is an additional check).
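For comparison, the three variants can be written as complete functions like this. The names sum8_index, sum8_assert and sum8_array are hypothetical, and the sketch only shows that all three compile and behave the same; it says nothing about which one the optimiser handles best.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

// Variant 1: touch the last needed element first.
fn sum8_index(line: &[u8]) -> u32 {
    let _ = line[7]; // panics if `line` has fewer than 8 elements
    (0..8).map(|i| u32::from(line[i])).sum()
}

// Variant 2: state the length requirement explicitly.
fn sum8_assert(line: &[u8]) -> u32 {
    assert!(line.len() >= 8);
    (0..8).map(|i| u32::from(line[i])).sum()
}

// Variant 3: convert to a fixed-size array reference. The `[..8]` slicing
// is needed because `try_from` only succeeds on a slice of exactly 8 elements.
fn sum8_array(line: &[u8]) -> u32 {
    let arrayref = <&[u8; 8]>::try_from(&line[..8]).unwrap();
    arrayref.iter().map(|&p| u32::from(p)).sum()
}
```

With the third variant the length becomes part of the type, so any code that receives arrayref no longer depends on the optimiser remembering an earlier check.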

And how to extend that to, say, the .chunks() iterator, to tell the compiler that the loop should run exactly N times and not merely “no more than N”? I’m too lazy to test, but I hope rustc is smart enough to make a constant loop when the input is a fixed-size array.
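One partial answer available in stable std is .chunks_exact(), which at least guarantees that every chunk has exactly the requested length. A small sketch on a fixed-size input follows; row_sums is a hypothetical name, and whether the loop is really turned into a constant, unrolled one is still an assumption that needs checking in the compiler output.

```rust
/// Sums each 8-pixel row of an 8x8 block stored as a fixed-size array.
fn row_sums(block: &[u8; 64]) -> [u32; 8] {
    let mut sums = [0u32; 8];
    // 64 / 8 = 8 chunks, each exactly 8 elements long; zipping with the
    // 8-element output array bounds the loop to eight iterations.
    for (sum, row) in sums.iter_mut().zip(block.chunks_exact(8)) {
        *sum = row.iter().map(|&p| u32::from(p)).sum();
    }
    sums
}
```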

Better assembly support

The alternative is to do various critical bits in inline assembly, but Rust disappoints there as well. Writing a single function with inline assembly is nice (even nicer than in C), but my needs at least require writing nearly the same function for different input sizes (e.g. performing the same block interpolation for 16×16 or 4×4 blocks). And writing such templates is inconvenient at best and next to impossible at worst.

First of all, it’s impossible to generate a new function name by pasting its base name and size together without an external crate (and the documentation for an experimental std::concat_idents states explicitly that it’s not usable for that purpose).
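In practice this means every generated name has to be spelled out by hand at the macro call site. A minimal sketch with plain Rust bodies instead of assembly; the macro avg_block_fn and the functions it generates are hypothetical examples, not code from the project.

```rust
// The template cannot build `avg_block_` + size by itself, so both the
// full function name and the size are passed in explicitly.
macro_rules! avg_block_fn {
    ($name:ident, $size:expr) => {
        /// Rounded average of two square blocks with a common stride.
        fn $name(dst: &mut [u8], a: &[u8], b: &[u8], stride: usize) {
            for y in 0..$size {
                let off = y * stride;
                for x in 0..$size {
                    let s = u16::from(a[off + x]) + u16::from(b[off + x]) + 1;
                    dst[off + x] = (s >> 1) as u8;
                }
            }
        }
    };
}

avg_block_fn!(avg_block_4, 4);
avg_block_fn!(avg_block_16, 16);
```

It works, but the name and the size are repeated in every invocation, and nothing stops them from disagreeing.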

Second, asm!{} currently does not support taking constants or function names as its parameters (both are being worked on in nightly but who knows when they’ll be in stable). You can somewhat work around it, but it’s very inconvenient and there may not be enough free registers to pass input constants.
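The workaround mentioned above looks roughly like the following sketch: the constant is passed as an ordinary register input, so it occupies a register for the whole block. The names STEP and add_step are hypothetical, the assembly is x86_64-only, and a plain Rust fallback is included so the example builds elsewhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

// Hypothetical constant that a template would ideally bake into the code.
const STEP: u64 = 16;

#[cfg(target_arch = "x86_64")]
fn add_step(x: u64) -> u64 {
    let mut out = x;
    // SAFETY: the instruction only reads and writes the two registers
    // named below; it touches neither memory nor the stack.
    unsafe {
        asm!(
            "add {out}, {step}",
            out = inout(reg) out,
            step = in(reg) STEP, // the constant costs a whole register
            options(pure, nomem, nostack),
        );
    }
    out
}

// Fallback so the sketch also compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_step(x: u64) -> u64 {
    x + STEP
}
```

With one constant this is harmless, but a block that needs several of them alongside its real inputs runs out of general-purpose registers quickly.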

Third, the current Rust macro system is not well-adapted for writing inline assembly templates. By that I mean I don’t know a good way to make macros e.g. insert some instructions into the asm!{} body depending on some external condition (nor to make use of e.g. the “.if {size} == 8” assembly directive inside the code). From what I heard, macro_rules! and asm! can’t play together well because of the way they are both implemented, so I don’t think this will ever be resolved (but being able to invoke a bit of internal assembly macro programming would be enough).

The sad thing is that writing portable x86 assembly is tricky thanks to the different ABIs, so stand-alone assembly needs some macro definitions in order to understand how the parameters are passed to a function and what registers should be saved if they’re touched. This immediately makes you wish to have a C compiler instead. With Rust it would be possible to work around those annoyances by making the Rust compiler deal with the proper function prologues and register names, and just write the assembly code you need. But the current shortcomings make it rather hard to write code that references external variables and functions, let alone templating. And don’t propose intrinsics: I’ve tried them as well and found them horrible.


And that’s about it. Since I can’t remember any other annoying things, it means they are not important. But not being able to tell the compiler that it should be able to optimise these specific parts of code is sad. Not being able to write them in assembly either is doubly so.
