So while russia is trying to commit political suicide by recognizing its war officially and making it a criminal offence to evade it (or to disagree with the official course in general), here’s a text I have wanted to write for a long time but finished only recently.
In general I’m eager to look at any computer architecture and see how multimedia software can be run on it (even if I’m no Måns). For a long time my primary development machine was PowerPC-based (before I could buy a decent x86_64-based laptop), I played with ARMv7 and NEON years before the Raspberry Pi was created (but who remembers the Beagle Board nowadays?), I had two laptops with Chinese MIPS inside (and tried optimising for Loongson 2 SIMD too), and I own an ARM64-based box as well (and did some work on it). I’d like to try RISC-V hardware, but the state of RISC-V gives me no excitement, and here I’ll try to explain why.
Before anything else I want to mention that RISC-V is essentially a continuation of MIPS, and MIPS had so many versions and implementations that it’s impossible to make a joke about it: no matter how you try, some MIPS CPU like that exists already, except maybe MIPS-NG. This is annoying.
Now I should describe my use case. I’m positive that RISC-V may have good applications in the microcontroller world thanks to the simplicity of its core version, and the ISA being under a free license is a good thing overall. My needs are the opposite: I work on multimedia software so I’m more interested in high-performance CPUs (I’d say 800MHz or more and multi-core if possible) with SIMD (the vector extension falls under this definition as well).
I can’t talk about technical deficiencies of the ISA since I haven’t tried it yet, but on a higher level RISC-V has three problems: fragmentation, fragmentation and fragmentation.
When RISC-V was designed, it was made too flexible, which causes the same annoyance as ARM board support: there are too many possible components to pick from, so hardware vendors build ARM SoCs out of a very wide variety of components that are very annoying to support. RISC-V has the same situation with instruction sets: there’s your core set, nice and compact. Do you also want to perform multiplication? That’s an extra extension. Do you want to use bit-counting tricks (very useful for working with variable-length codes)? That’s an extra extension. Do you want to use SIMD? That’s a can of worms (more about it later).
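To illustrate why bit counting matters for variable-length codes, here is a minimal sketch in Python of decoding one unsigned Exp-Golomb code (the kind used in H.264 headers) with a single leading-zero count. The function names are made up for this example, and the 32-bit window is an assumption. On RISC-V the matching instruction (clz) comes from the Zbb extension and is not part of the base set.

```python
def count_leading_zeros(window, width=32):
    """Number of zero bits before the first set bit of a `width`-bit word."""
    if window < 0 or window.bit_length() > width:
        raise ValueError("window does not fit in the given width")
    return width - window.bit_length()


def read_exp_golomb(window, width=32):
    """Decode one unsigned Exp-Golomb code from the top bits of `window`.

    Returns (value, bits_consumed). Hypothetical helper for illustration:
    a real decoder would refill the window from a bitstream.
    """
    zeros = count_leading_zeros(window, width)
    length = 2 * zeros + 1
    if window == 0 or length > width:
        raise ValueError("code does not fit in the window")
    # The code is `zeros` zero bits, a one bit, then `zeros` suffix bits.
    # Read as an integer, those `length` bits equal value + 1.
    code = window >> (width - length)
    return code - 1, length
```

Without a leading-zero count the prefix has to be scanned bit by bit, which is exactly the kind of loop a multimedia decoder wants to avoid.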
Okay, suppose I’ve determined which extensions I want. Now how can I map that to a CPU and/or detect whether the CPU my code runs on has the required extensions? On ARM you have three profiles so you know what to expect from each of them; on MIPS the extensions were usually implemented as vendor-specific co-processor instructions whose presence could be tested (i.e. whether a certain instruction faulted or not); x86 has the CPUID instruction with various capabilities listed. I’ve tried looking into the RISC-V ISA specification to see how you can detect the presence of various extensions but it’s not immediately clear. Of course you can point out that I can easily find it out from the CPU model: I just need to see that it’s e.g. RISC-V RV64GCVWTFBBQ and map that alphabet soup to the existing extensions. But how would a binary compiled for a CPU with certain extensions behave on a CPU lacking some of them? I don’t know and that’s part of the problem.
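Mapping the alphabet soup to extensions can at least be sketched. The Python below is a rough illustration, not a complete parser: `parse_isa` and `missing_extensions` are hypothetical names, version suffixes such as `2p1` are not handled, and it assumes the usual convention of `rv`, the register width, single-letter extensions, then multi-letter names (starting with Z, S or X) separated by underscores. On Linux a string of this shape is reported in the `isa` line of /proc/cpuinfo.

```python
# "G" is shorthand for the general-purpose bundle.
G_EXPANSION = ["i", "m", "a", "f", "d", "zicsr", "zifencei"]


def parse_isa(isa):
    """Split an ISA string such as "rv64gcv_zba_zbb" into (xlen, extensions)."""
    s = isa.strip().lower()
    if not s.startswith("rv"):
        raise ValueError("not a RISC-V ISA string: %r" % isa)
    s = s[2:]
    # The register width comes first: 32, 64 or 128.
    digits = ""
    while s and s[0].isdigit():
        digits += s[0]
        s = s[1:]
    if digits not in ("32", "64", "128"):
        raise ValueError("unexpected register width in %r" % isa)
    first, *rest = s.split("_")
    exts = []
    for i, ch in enumerate(first):
        if ch in "zsx":
            # Assumption: a multi-letter name may follow the single letters
            # directly, so the remainder of this chunk is one extension name.
            exts.append(first[i:])
            break
        if ch == "g":
            exts.extend(G_EXPANSION)
        else:
            exts.append(ch)
    exts.extend(name for name in rest if name)
    return int(digits), exts


def missing_extensions(isa, required):
    """Return the required extension names that the ISA string lacks."""
    _, present = parse_isa(isa)
    return [name for name in required if name.lower() not in present]
```

For example, `missing_extensions("rv64imac", ["m", "v", "zbb"])` reports `v` and `zbb` as absent. This only answers the question of what a CPU claims to support; it says nothing about what happens when a binary executes an instruction the CPU lacks.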
And now for SIMD. It’s another case where it hasn’t taken off yet and is already a confusing mess. There’s the P extension for traditional-style SIMD that nobody cares about, and there’s the V extension for the original style of SIMD operating on variable-length vectors, just like Seymour Cray intended. Now there are boards with RISC-V chips that implement the V extension. The catch is that most of them use the XuanTie C906 core that implements V extension version 0.7 or so (while the current, ratified version is 1.0). Those versions are mostly but not exactly compatible and raise the same questions: how can I detect which version of the V extension is available, and how will a binary compiled for a CPU with Vv0.7 behave on a CPU with Vv1.0? Of course this is not as bad as the variety of SIMD instructions on x86 and AMD64 (especially thanks to Intel’s brilliant strategy of AVX512 fragmentation), but not much time has passed for RISC-V either.
So, none of the issues I listed here is beyond fixing with some simple albeit radical changes: introduce sane profiles, do something about the extension zoo and especially the vector extensions. Maybe SiFive will finally produce a nice 64-bit RISC-V chip with the proper, ratified V extension which I can buy and use. Then I shall be able to really evaluate RISC-V for its advantages and weaknesses. Before that I’ll pass because it looks like doing work that you have to redo later (like migrating from Rust 2018 edition to Rust 2021 edition but worse).
Interesting analysis. I always get scared when I think of RISC ASM because of the whole pipelining issue that I learned about in Computer Organization class (i.e., the data might not be available when you expect it).
On the multimedia front, more realistically, I would expect any RISC-V SOC to come packaged with dedicated silicon to do 4K HEVC @ 120 fps without any main CPU involvement, just like basically every chip does nowadays.
Nowadays RISC mostly boils down to having dedicated instructions for loading/storing instead of allowing memory operands for different instructions. And data may not be available no matter what the computer architecture is (except for the Turing machine of course).
As for those dedicated hardware decoders – of course they exist but they’re even less uniform in their interfaces, trickier to interact with and have various limitations too. Plus I prefer to implement my own decoders anyway 😉
I don’t know much about RISC-V, but I do know that there are profiles: https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc
I do feel that it’s too early to target RISC-V with any significant thought, with all the changes/developments going on, but perhaps things will stabilize over time.
Having written a bunch of AVX512 code, I don’t actually feel the ‘fragmentation’ is that big of an issue. I don’t think they needed to split things up as much as they did (and in some ways, didn’t do enough splitting, e.g. making AVX512F the base, as opposed to AVX512F *or* AVX512VL), but I mostly find that you just check for the stuff you’re using, and let the compiler verify whether you’re correct.
> Nowadays RISC mostly boils down to having dedicated instructions for loading/storing instead of allowing memory operands for different instructions
Unless you want atomics =P
Yes, there are profiles (in proposal state) but they’re confusing – you can’t immediately tell the difference between RVA20 and RVA22 sets. ARM at least has profiles defined by the purpose – applications, real-time and microcontrollers. Also those defined profiles look like they tend to become alphabet soup as well with names like RVA24S64MT.
As for the atomics, on RISC it’s usually still a few dedicated instructions for load/store plus a fence, while x86 has the LOCK prefix for various instructions.