Rust: Not So Great for Codec Implementing « Kostya's Boring Codec World

Rust: Not So Great for Codec Implementing

Disclaimer: obviously it’s my opinion, feel free to prove me wrong or just ignore.

Now that I should qualify as a zoidberg (slang name for a lowly Rust programmer who lives somewhere in a dumpster and who is also completely ignored, a perfect definition for me), I want to express my thoughts about my programming experience with Rust. After all, NihAV was restarted to find out how modern languages fare for my favourite task and there was about one language that was promising enough. So here’s a short rant about the aspects of this programming language that I found good and not so good.

Good things

Modern language features: standard library containers, generics, units and their visibility etc etc. And at least it looks like Rust won’t degrade into a metaprogramming language any time soon (that’s left for the upcoming Rust+=1 programming language);
Reasonable encapsulation: I mean both (sub)modules organisation and the fact that functions can be implemented just for some structure;
Powerful enums that can act both as plain C set of values and also as tagged objects, e.g. the standard Result enum has two values—Ok(result) and Err(error) where both result and error are two different user-defined types, so returned value can contain either while being the same type (Result);
More helpful error messages (e.g. it tries to suggest a correction for a mistyped variable name or explains an error in a bit more detail). Sure, Real Programmers™ don’t need that but it’s still nice;
No need for dependency resolving: you can have stuff in one module referencing stuff in another module and vice versa at the same time, same for no need
Traits (standard interfaces for objects) and the fact that operations are implemented as specific traits (i.e. if you need to have a + b with your custom object you can implement std::ops::Add for it and it will work). Also it’s nice to extend functionality of some object by making an implementation for some trait: e.g. my bitstream reader is defined in one place but in another module I made another trait for it for reading codebooks so I can invoke let val = bitread.read_codebook(&cb)?; later.
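To make the last point concrete, here is a minimal sketch of extending a bitstream reader with a trait defined elsewhere. The BitReader, Codebook, ReadError and CodebookReader types below are hypothetical stand-ins (the real NihAV definitions are not shown in this post), and the toy codebook just decodes unary codes; only the shape of the final call matches the read_codebook example above. It also shows the Result enum from the earlier point carrying either a value or an error.

```rust
// Hypothetical stand-ins; not the actual NihAV types.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    EndOfData,
    InvalidCode,
}

pub struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // current position in bits
}

impl<'a> BitReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        BitReader { data, pos: 0 }
    }

    /// Reads one bit, most significant bit of each byte first.
    pub fn read_bit(&mut self) -> Result<bool, ReadError> {
        let byte = *self.data.get(self.pos / 8).ok_or(ReadError::EndOfData)?;
        let bit = (byte >> (7 - (self.pos % 8))) & 1;
        self.pos += 1;
        Ok(bit == 1)
    }
}

/// Toy codebook: unary codes, the symbol is the number of 1 bits before a 0 bit.
pub struct Codebook {
    max_len: u32,
}

/// The extension trait; it could live in a different module than BitReader.
pub trait CodebookReader {
    fn read_codebook(&mut self, cb: &Codebook) -> Result<u32, ReadError>;
}

impl<'a> CodebookReader for BitReader<'a> {
    fn read_codebook(&mut self, cb: &Codebook) -> Result<u32, ReadError> {
        let mut sym = 0;
        while self.read_bit()? {
            sym += 1;
            if sym > cb.max_len {
                return Err(ReadError::InvalidCode);
            }
        }
        Ok(sym)
    }
}

fn main() -> Result<(), ReadError> {
    let data = [0b1101_0000u8];
    let mut bitread = BitReader::new(&data);
    let cb = Codebook { max_len: 8 };
    let val = bitread.read_codebook(&cb)?;
    println!("decoded symbol: {}", val);
    Ok(())
}
```

The reader itself knows nothing about codebooks; the new method becomes available wherever the trait is in scope.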

Unfortunately, it’s not all rosy and peachy: Rust has some things that irritate me. Some of them follow from the advantages (i.e. you pay for many features with compilation time) and others come from language design or implementation complexity.

Irritating things that can probably be fixed

Compilation time is too large IMO. While the similar code in Libav is recompiled in less than a second, NihAV (test configuration) is built in about ten seconds. And any time above five seconds is irritating to wait. I understand why it is so and I hope it will be improved in the future but for now it’s irritating;
And, on the similar note, benchmarks. While overall built-in testing capabilities in Rust are good (file it under good things too), the fact that benchmarking is available only for ~~limbo~~ nightly Rust is annoying;
No control over allocation. On one hoof I like that I can not worry about it, on the other hoof I’d like to have an ability to handle it.
Poor primitive types functionality. If you claim that Rust is a systems programming language then you should care more about primitive types than just relying on the as keyword. If you care about systems programming and safety you’d have at least one or two functions to convert a type into a smaller one (e.g. i16/u16 -> u8) and/or check whether the result fits. That’s one of the main annoyances when writing codecs: you often have to convert a result into a byte with range clipping;
Macros system is lacking. It’s great for code but if you want to use macros to have more compact data representation: tough luck. For example, in Indeo3 codebooks have sequences like (a,b), (-a,-b), (b,a), (-b,-a) which would be nice to shorten with a macro. But the best solution I saw in Rust was to declare the whole array in a macro using token tree manipulation for proper submacro expansion. And I fear it might be a similar story with implementing motion compensation functions where macros are used to generate the required functions for specific block sizes and operations (simple put or average). I’ve managed to work around it a bit in one case with lambdas but it might not work so well for more complex motion compensation functions;
Also the tuple assignments. I’d like to be able to assign multiple variables from a tuple but it’s not possible now. And maybe it would be nice to be able to declare several variables with one let;
There are many cases where compiler could do the stuff automatically. For example, I can’t take a pointer to const but if I declare another const as a pointer to the first one it works fine. In my opinion compiler should be able to generate an intermediate second constant (if needed) by itself. Same for function calling—why does bitread.seek(bitread.tell() - 42); fail borrow check while let pos = bitread.tell() - 42; bitread.seek(pos); doesn’t?
Borrow checker and arrays. Oh, borrow checker and arrays.

This is probably the main showstopper for implementing complex video codecs in Rust effectively. Rust is anti-FORTRAN in a sense that FORTRAN was all about arrays and could operate arrays safely while Rust safely prevents you from operating arrays.

Video codecs usually operate on planes and there you’d like to operate with different chunks of the frame buffer (or plane) at the same time. Rust does not allow you to mutably borrow parts of the same array even when it should be completely safe like let mut a = &mut arr[0..pivot]; let mut b = &mut arr[pivot..];. Don’t tell me about ChunksMut, it does not allow you to work with them both simultaneously. And don’t tell me about the Bytes crate: it should not be a separate crate, it should be a core language functionality. As a result I have to resort to using indices inside the frame buffer and Rc<RefCell<...>> for frames themselves. And only dream about being able to invoke mem::swap(&mut arr[idx1], &arr[idx2]);.

Update: so there’s slice::split_at_mut() which does some of the things I want, thanks Tomas for pointing it out.
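For the record, here is a rough sketch of how slice::split_at_mut() can carve a frame buffer into planes that are all mutable at the same time. The split_planes helper and the tightly packed 4:2:0 layout (luma, then two quarter-size chroma planes) are assumptions made for this illustration, not NihAV code. The last lines use the standard slice swap() method for exchanging two elements in place.

```rust
/// Hypothetical helper: splits a tightly packed YUV 4:2:0 buffer into three
/// mutable planes. Assumes `width` and `height` are even.
fn split_planes(
    buf: &mut [u8],
    width: usize,
    height: usize,
) -> (&mut [u8], &mut [u8], &mut [u8]) {
    let luma_size = width * height;
    let chroma_size = (width / 2) * (height / 2);
    assert!(buf.len() >= luma_size + 2 * chroma_size);
    // Each split yields two non-overlapping mutable slices.
    let (luma, rest) = buf.split_at_mut(luma_size);
    let (cb, rest) = rest.split_at_mut(chroma_size);
    let (cr, _) = rest.split_at_mut(chroma_size);
    (luma, cb, cr)
}

fn main() {
    let (w, h) = (4, 2);
    let mut frame = vec![0u8; w * h + 2 * (w / 2) * (h / 2)];

    // All three planes are borrowed mutably at once.
    let (luma, cb, cr) = split_planes(&mut frame, w, h);
    luma.fill(16);
    cb.fill(128);
    cr.fill(128);

    // Swapping two elements of one slice needs no pair of borrows.
    frame.swap(0, 8);
    println!("{:?}", frame);
}
```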

And it gets even more annoying when I try to initialise an array of codebooks for further use. The codebook structure does not implement Clone because there’s no good reason for it to be cloned or copied around, but when I initialise an array of them I cannot simply declare it and fill the contents in a loop, I have to resort to unsafe { arr = mem::uninitialized(); for i in 0..arr.len() { ptr::write(&mut arr[i], Codebook::new(...)); } }. I know that if there’s an error creating a new element the compiler won’t be able to ensure that it drops only already initialised elements but it’s still a problem of the compiler not being smart enough yet. A certain somebody had an idea of using a generator to initialise arrays but I’m not sure even that will be implemented any time soon.
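For comparison, a sketch of the same initialisation without unsafe, assuming a toolchain recent enough to provide std::array::from_fn (it did not exist when this post was written). The Codebook type here is a hypothetical placeholder that deliberately does not implement Clone.

```rust
/// Hypothetical codebook type; note there is no `Clone` implementation.
struct Codebook {
    entries: Vec<u16>,
}

impl Codebook {
    fn new(size: usize) -> Self {
        Codebook { entries: (0..size as u16).collect() }
    }
}

/// Builds each element by calling the closure with its index.
fn make_codebooks() -> [Codebook; 4] {
    std::array::from_fn(|i| Codebook::new(1 << i))
}

fn main() {
    let cbs = make_codebooks();
    for (i, cb) in cbs.iter().enumerate() {
        println!("codebook {} has {} entries", i, cb.entries.len());
    }
}
```

This does not cover constructors that can fail, which is the harder part of the complaint above.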

And speaking about cloning, why does compiler refuse to generate Clone trait for a structure that has a pointer to function?

And that’s why C is still the best language for systems programming: it still lets you do what you mean (the problem is that most programmers don’t really know what they mean) without many magical incantations. Sure, it’s very good to have many common errors eliminated by design but when you can’t do basic things in a simple way then what is it good for?

Annoying things that cannot be fixed

type keyword. Since it’s a keyword it can’t be used as a variable name and many objects have type, you know. And it’s not always reasonable to give a longer name or rewrite using enum. Similar story with ref but I hardly ever need it for a variable name and ref_<something> works even better. Still, it would be better if language designers picked typedef instead of type;
Not being able to combine if let with some other condition (those nested conditions tend to accumulate rather fast);
Sometimes I fear that compilation time belongs to this category too.

Overall, Rust is not that bad and I’ll keep developing NihAV using it but keep in mind it’s still far from being perfect (maybe about as far as C but in a different direction).

P.S. I also find the phrase “rewrite in Rust” quite stupid. Rust seems to be significantly different from other languages, especially C, so while “Real Programmers can write FORTRAN program in any language” it’s better to use new language features to redesign interfaces and make new overall design instead of translating the same mistakes from the old code. That’s why NihAV will lurch where somebody might have stepped before but not necessarily using the existing roads.

This entry was posted on Monday, July 31st, 2017 at 2:06 pm and is filed under Rust, Useless Rants.

15 Responses to “Rust: Not So Great for Codec Implementing”

Tomas Sedovic says:

August 1, 2017 at 6:04 am

Unless I’m misunderstanding here, you can safely split an array into two simultaneous mutable borrows:

https://doc.rust-lang.org/std/primitive.slice.html#method.split_at_mut

It’s a method, not a custom syntax, but it’s in the standard library.
whitequark says:

August 1, 2017 at 6:33 am

> No control over allocation

Fallible allocation support for standard collections is being worked on by the libs team *this very minute*.

> If you care about systems programming and safety you’d have at least one or two functions to convert type into a smaller one (e.g. i16/u16 -> u8) and/or check whether the result fits.

Definitely!

use std::convert::TryFrom;
println!("{:?} {:?}", u8::try_from(10i32), u8::try_from(1000u32));

Not stabilized yet but it will be soon.
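As a small sketch of both conversions asked for in the post (clipping and a fit check), assuming a toolchain where TryFrom and Ord::clamp are available in stable Rust, which was not the case when this comment was written. The clip_u8 and checked_u8 names are made up for the example.

```rust
use std::convert::TryFrom;

/// Clips a signed intermediate value to the 0..=255 range of a byte.
fn clip_u8(val: i16) -> u8 {
    // After clamping, the `as` cast cannot truncate.
    val.clamp(0, 255) as u8
}

/// Checked narrowing: `None` when the value does not fit into a byte.
fn checked_u8(val: i16) -> Option<u8> {
    u8::try_from(val).ok()
}

fn main() {
    println!("{} {} {}", clip_u8(-5), clip_u8(100), clip_u8(300));
    println!("{:?} {:?}", checked_u8(200), checked_u8(1000));
}
```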

> Macros system is lacking. It’s great for code but if you want to use macros to have more compact data representation—tough luck.

You could do that easily with procedural macros, to be stabilized sometime this year, I think. More powerful than C’s macros, although perhaps less elegant for this particular case.

> Also the tuple assignments. I’d like to be able to assign multiple variables from a tuple but it’s not possible now. And maybe it would be nice to be able to declare several variables with one let;

Sure is:

let (foo, mut bar) = (1, 2);

> Same for function calling: why does bitread.seek(bitread.tell() - 42); fail borrow check while let pos = bitread.tell() - 42; bitread.seek(pos); doesn’t?

Non-lexical borrows aka MIR borrowck. This is an area of active work right now.

> Not being able to combine if let with some other condition (those nested conditions tend to accumulate rather fast);

https://github.com/rust-lang/rfcs/issues/929?
Kostya says:

August 1, 2017 at 7:02 am

But it still looks fitting for my goals. Thanks!
Kostya says:

August 1, 2017 at 7:11 am

Hopefully you see why most of it was in the “hopefully can be fixed” category. It’s nice to hear that it’s being worked on but I’m not using unstable features on principle.

As for let, I meant let foo; let bar; if baz { (foo, bar) = get_tuple(42); } else { foo = 4; bar = 2; }
Generally useless but might eliminate some pointless temporary variables in some cases.
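A sketch of exactly that form, assuming a newer toolchain in which destructuring assignment has since been stabilised (it was not available when this thread was written). The get_tuple body and the pick wrapper are invented for the example.

```rust
/// Hypothetical stand-in for the `get_tuple` mentioned above.
fn get_tuple(x: i32) -> (i32, i32) {
    (x, x + 1)
}

fn pick(baz: bool) -> (i32, i32) {
    // Declared first, assigned later.
    let foo;
    let bar;
    if baz {
        // Assigns both existing variables from one tuple.
        (foo, bar) = get_tuple(42);
    } else {
        foo = 4;
        bar = 2;
    }
    (foo, bar)
}

fn main() {
    println!("{:?} {:?}", pick(true), pick(false));
}
```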
Luca Barbato says:

August 1, 2017 at 7:43 am

Meanwhile I opened https://github.com/rust-lang/rfcs/issues/2092 to discuss my idea to have closures to fill arrays.

Let’s see if it gets enough traction 🙂
PEPP says:

August 1, 2017 at 7:50 am

Does this work for you?

let (foo, bar) = if baz { get_tuple(42) } else { (4,2) };
Kostya says:

August 1, 2017 at 7:57 am

It does, but I want to assign a tuple value to some independent variables as a tuple (and not at creation time too).
Phlosioneer says:

August 1, 2017 at 10:32 am

> And speaking about cloning, why does compiler refuse to generate Clone trait for a structure that has a pointer to function?

IIRC this is only true for some closures. There are three types of closures: FnOnce, FnMut, and Fn.

FnOnce(data) function pointers are, as the name implies, only intended to be run once. Therefore, clone makes no sense.
FnMut(&mut data) function pointers can be used multiple times. They take an owning reference to their data, so they cannot be cloned.
Fn(&data) function pointers can be used multiple times, as long as the data still exists. They can be cloned, as long as the lifetime is preserved – therefore, the lifetime parameter must be explicit. CURRENTLY, however, there is a bug where fn(&'a data) implements Copy but not Clone. (see https://github.com/rust-lang/rust/issues/24000). You can manually implement Clone for your type and Copy the function pointer; for now, #[derive(Clone)] isn’t up to the task.

See also:
https://users.rust-lang.org/t/is-fnonce-mut-equals-to-fnmut/10024/3
Kostya says:

August 1, 2017 at 11:04 am

I meant
struct Foo { bar: fn (&[i16]), }

So probably the bug you’re talking about.
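A sketch of the manual workaround suggested above: implement Clone by hand and simply copy the function pointer. The return type on bar and the sum function are additions for this example only, so that the result of calling through the clone can be observed.

```rust
struct Foo {
    // The original field is `fn(&[i16])`; the return type is added here
    // purely for demonstration.
    bar: fn(&[i16]) -> i32,
}

impl Clone for Foo {
    fn clone(&self) -> Self {
        // Function pointers are Copy, so the field can be copied directly.
        Foo { bar: self.bar }
    }
}

/// Example function to store in the struct.
fn sum(data: &[i16]) -> i32 {
    data.iter().map(|&x| i32::from(x)).sum()
}

fn main() {
    let a = Foo { bar: sum };
    let b = a.clone();
    println!("{} {}", (a.bar)(&[1, 2, 3]), (b.bar)(&[4, 5]));
}
```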
Manishearth says:

August 1, 2017 at 11:14 am

> For example, I can’t take a pointer to const but if I declare another const as a pointer to the first one it works fine.

You probably want to use `static`, not `const`. Rust’s `const` is more like C’s #define, it defines a compile time constant. Rust’s `static` is a general constant global variable. (There’s also the rarely-used unsafe static mut for a mutable global variable).

> Not being able to combine if let with some other condition

Use `match` with a pattern guard. If let is sugar over a match.
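A minimal sketch of that suggestion; the describe function and its strings are made up for illustration. The guard after the pattern plays the role of the extra condition that if let alone cannot express here.

```rust
/// Combines a pattern match with an additional condition via a guard.
fn describe(val: Option<i32>, limit: i32) -> &'static str {
    match val {
        // Pattern plus condition in one arm, no nested `if` needed.
        Some(v) if v > limit => "above limit",
        Some(_) => "within limit",
        None => "missing",
    }
}

fn main() {
    println!("{}", describe(Some(10), 5));
    println!("{}", describe(Some(3), 5));
    println!("{}", describe(None, 5));
}
```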

> And only dream about being able to invoke mem::swap(&mut arr[idx1], &arr[idx2]);.

Slices have a `swap(idx1, idx2)` method. You could also use split_at_mut to construct this safely.

> on the other hoof I’d like to have an ability to handle it.

You can? You’re free to call malloc and free when you want.

There is work going on to make the stdlib datastructures have better APIs for dealing with this directly (and exposing the stdlib allocator).

> And it gets even more annoying when I try to initialise an array of codebooks for further use

https://github.com/Manishearth/array-init

Not everything has to be in the stdlib. This is why `unsafe` exists.

> Also the tuple assignments.

`let (a,b) = (1,2)`

> Poor primitive types functionality.

Not that it can help you yet, but TryFrom is coming.

> Macros system is lacking.

Not that it can help you yet, but proper procedural macros are coming.

> Same for function calling: why does bitread.seek(bitread.tell() - 42); fail borrow check while let pos = bitread.tell() - 42; bitread.seek(pos); doesn’t?

Not that it can help you yet, but non-lexical lifetimes are coming.
Manishearth says:

August 1, 2017 at 11:27 am

Oh, sorry, I didn’t realize that you were looking for multi-var mutation. No, that doesn’t work, you can only declare and initialize multiple variables at once.

FWIW the `fn(&u8)` not implementing clone thing is a bug which probably will get fixed at some point.

> Don’t tell me about ChunksMut, it does not allow you to work with them both simultaneously.

Also, ChunksMut totally does let you work with things simultaneously. In fact, Rust’s trait system is not expressive enough to be able to have the streaming analog of the Iterator trait. Not only is ChunksMut an iterator which can be iterated whilst holding on to previously yielded references, but this is true for *everything that implements Iterator*.

For the StreamingIterator equivalent (e.g. when operating on a stream of data and you want to be able to deallocate the backing store of previously yielded data before going to the next) you’d need the ability to say `trait StreamingIterator { type Item<'a>; fn next<'a>(&'a mut self) -> Self::Item<'a>; }`. The `type Item<'a>` syntax is something that comes from the “generic associated types” RFC that has been accepted but is not yet in Rust. It enables you to say “I have an associated type with a lifetime, and I intend to use that lifetime in my trait signatures”.

It is of course possible to implement a blocking ChunksMut or in general a streaming iterator in Rust, it’s just not (yet) possible to _abstract_ over all of these with a trait like Iterator.

But this means that anything that implements Iterator can be called multiple times without having to relinquish hold of previously yielded values.
Kostya says:

August 3, 2017 at 1:33 am

> You probably want to use `static`, not `const`. Rust’s `const` is more like C’s #define, it defines a compile time constant. Rust’s `static` is a general constant global variable. (There’s also the rarely-used unsafe static mut for a mutable global variable).

It was a bit confusing to me indeed but hopefully I’ll learn the difference. Also it was more like let var = &othermod::CONST; not working while declaring temporary variable helped. It’s my fault anyway.
Kostya says:

August 3, 2017 at 1:42 am

> Also, ChunksMut totally does let you work with things simultaneously.

My problem with ChunksMut is their unwieldiness (.next().unwrap() looks better hidden in a loop 😉) but the real complications come from splitting data at non-regular intervals (IIRC it’s possible with a lambda function defining the range somehow but it’s not nice). I usually split the video buffer as “NxM luma plane, then two N/2xM/2 chroma planes and then an optional NxM alpha plane”. So .split_at_mut() should work fine and intuitively, unlike chunking.
Ingvar says:

August 3, 2017 at 1:21 pm

It sounds more like you want a proper parser which would allow you to define your own rules while still returning slices of the original collection. Generic stdlib methods can’t cover all the needs of parsers, so I’d suggest taking a look at nom.
Kostya says:

August 3, 2017 at 2:34 pm

Thanks for the suggestion but I prefer to use basic language functionality (built-in features and the std crate) wherever possible. I don’t want it to degenerate into a typical node.js project (but neither do I want it to become a monolithic monster like so many other projects; hopefully I’ll find a perfect balance).

And I shan’t look at nom in principle, I leave that task to rust-av developers.