rustc_codegen_cranelift (cg_clif) is an alternative backend for rustc that I have been working on for the past two years. It uses the Cranelift code generator. LLVM is optimized for output quality at the cost of compilation speed, even when optimizations are disabled. Cranelift, by contrast, is optimized for compilation speed while producing executables that are almost as fast as those LLVM produces with optimizations disabled. This has the potential to reduce rustc's compilation times in debug mode.
I recently looked back at the notes for the design meeting (meeting proposal) about integrating cg_clif into rustc. I noticed that several of the challenges listed there have since been solved, so I decided to give an overview of the achievements of the past six months and of the current challenges.
Achievements in the past six months
Fixing an ABI incompatibility for proc-macros (see next section), combined with several small fixes to the 128-bit integer support, made it possible to compile rustc using cg_clif.
- issue #743: Compile rustc using cg_clif
- commit cd684e3: Fix saturated_* intrinsics for 128bit ints
- commit ef4186a: Use Cranelift legalization for icmp.i128
- commit 8d639cd: Test signed 128bit discriminants
- commit e87651c: Add test for SwitchInt on 128bit integers
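To give an idea of what these 128-bit fixes cover, here is a minimal sketch of the kind of code they concern: saturating arithmetic, comparisons, and a match on a 128-bit integer (which becomes a SwitchInt in MIR). The function names are made up for illustration and are not taken from the cg_clif test suite.

```rust
/// Saturating arithmetic on 128-bit integers clamps at the type's bounds
/// instead of wrapping around.
fn saturating_examples() -> (i128, i128, u128) {
    (
        i128::MAX.saturating_add(1), // stays at i128::MAX
        i128::MIN.saturating_sub(1), // stays at i128::MIN
        0u128.saturating_sub(1),     // stays at 0
    )
}

/// A signed 128-bit comparison; the backend has to take both 64-bit halves
/// into account.
fn less_than(a: i128, b: i128) -> bool {
    a < b
}

/// A `match` on a `u128`. One of the arms uses a constant that does not fit
/// in 64 bits.
fn classify(x: u128) -> &'static str {
    match x {
        0 => "zero",
        0x1_0000_0000_0000_0000 => "two to the 64th",
        u128::MAX => "max",
        _ => "other",
    }
}
```

A backend that only looked at the low 64 bits would confuse classify(1 << 64) with classify(0), which is why dedicated tests for this are useful.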
Proc-macro support has been implemented by fixing an ABI incompatibility.
- #1068: Pass ByRef values at fixed stack offset for extern “C”
- wasmtime#1559: SystemV struct arguments
The new style asm! inline assembly and global_asm! have been implemented on Linux by compiling a separate object file using an assembler and linking the main object file for the codegen unit and the assembly object file together. On macOS linking both object files together gives a linker error. Linking both object files together is necessary as rustc expects a single object file for each codegen unit.
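For readers who have not used the new style asm! yet, here is a small sketch of what such an invocation looks like. It assumes a toolchain on which asm! is available as std::arch::asm (at the time of writing it was still behind a feature gate) and an x86_64 target; add_five is a made-up name, and a plain Rust fallback is included for other targets.

```rust
/// Adds 5 to `input` using two x86_64 instructions written with `asm!`.
#[cfg(target_arch = "x86_64")]
fn add_five(input: u64) -> u64 {
    use std::arch::asm;

    let output: u64;
    unsafe {
        asm!(
            "mov {out}, {inp}", // copy the input into the output register
            "add {out}, 5",     // add the constant
            out = out(reg) output,
            inp = in(reg) input,
        );
    }
    output
}

/// Fallback with the same (wrapping) behaviour for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn add_five(input: u64) -> u64 {
    input.wrapping_add(5)
}
```

With the approach described above, the two instructions inside asm! are handed to an external assembler rather than to Cranelift.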
The cpuid x86 instruction is now emulated using code that pretends the current CPU is an Intel CPU with SSE and SSE2 support. This fixes ppv-lite86 and by extension c2-chacha and rand. It is not yet possible to use the inline assembly support as core::arch uses llvm_asm! for the cpuid invocation. I didn’t implement this as it is currently being replaced with
- #1070: Emulate cpuid
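The idea behind the emulation can be sketched as follows. This is not the code from #1070; the struct, the function names and the exact set of leaves handled are assumptions made for illustration. What it relies on is the documented cpuid layout: leaf 0 returns the vendor string in ebx, edx and ecx, and leaf 1 reports SSE and SSE2 in bits 25 and 26 of edx.

```rust
/// Register values returned by one `cpuid` invocation (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CpuidResult {
    eax: u32,
    ebx: u32,
    ecx: u32,
    edx: u32,
}

const SSE_BIT: u32 = 1 << 25; // leaf 1, edx
const SSE2_BIT: u32 = 1 << 26; // leaf 1, edx

/// Pretends to be an Intel CPU that supports SSE and SSE2 and nothing else.
fn emulated_cpuid(leaf: u32) -> CpuidResult {
    match leaf {
        // Leaf 0: highest supported leaf in eax, vendor string in ebx:edx:ecx.
        0 => CpuidResult {
            eax: 1,
            ebx: u32::from_le_bytes(*b"Genu"),
            edx: u32::from_le_bytes(*b"ineI"),
            ecx: u32::from_le_bytes(*b"ntel"),
        },
        // Leaf 1: feature flags.
        1 => CpuidResult { eax: 0, ebx: 0, ecx: 0, edx: SSE_BIT | SSE2_BIT },
        // Everything else: report nothing.
        _ => CpuidResult { eax: 0, ebx: 0, ecx: 0, edx: 0 },
    }
}

/// Reassembles the 12-byte vendor string from a leaf 0 result.
fn vendor_string(r: CpuidResult) -> String {
    let mut bytes = Vec::with_capacity(12);
    bytes.extend_from_slice(&r.ebx.to_le_bytes());
    bytes.extend_from_slice(&r.edx.to_le_bytes());
    bytes.extend_from_slice(&r.ecx.to_le_bytes());
    String::from_utf8_lossy(&bytes).into_owned()
}
```

A crate that does runtime feature detection would then see an Intel CPU and pick its SSE2 code path.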
Stdarch has been changed to use constify on all x86 intrinsics that use rustc_args_required_const. This was necessary to support simd_extract based intrinsics.
- stdarch#876: Constify all x86 rustc_args_required_const intrinsics
- issue #669: Support simd_insert platform intrinsic
Fixing linking with lld and sysroot and executable size
I had assumed that the sysroot and executables are much bigger for cg_clif than for cg_llvm because of missing optimizations. While fixing linking with lld I discovered that for executables most of the difference is caused by cg_clif not using per function sections. Enabling them significantly reduces the size of executables, at the cost of significantly slowing down the linker. For this reason I put it behind the CG_CLIF_FUNCTION_SECTIONS env var.
- #1083: Fix lld
- wasmtime#2212: Fix relocated readonly data in custom sections
- wasmtime#2218: cranelift-object: Support per function sections
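As a rough sketch of what per function sections mean: instead of placing every function in one shared .text section, each function gets a section of its own, following the usual ELF .text.<symbol> naming convention. The helper names below are hypothetical, and whether cg_clif only checks that the env var is set or also looks at its value is an assumption here.

```rust
use std::env;

/// Returns true when the opt-in env var is set (assumption: any value counts).
fn per_function_sections_enabled() -> bool {
    env::var_os("CG_CLIF_FUNCTION_SECTIONS").is_some()
}

/// Picks the section a function's machine code is placed in.
fn text_section_name(symbol: &str, per_function: bool) -> String {
    if per_function {
        // One section per function, e.g. ".text.my_function".
        format!(".text.{}", symbol)
    } else {
        // All functions share a single section.
        ".text".to_string()
    }
}
```

A typical reason this helps with size is that a linker can discard whole sections that are never referenced, which is not possible when all functions share one section. The extra sections also give the linker more work, which matches the slowdown mentioned above.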
rust#77170 changed the MIR of <Box<F> as FnOnce>::call_once such that it doesn’t need an alloca anymore. 27a46ff removed the hack that worked around the missing alloca support for this.
- commit 27a46ff: Rustup to rustc 1.44.0-nightly (45d050cde 2020-04-21)
Box<dyn FnOnce> respect self alignment
Rust test suite
There have been significant improvements in the number of passing rustc tests, with the previously mentioned #1068 fixing 82 tests. Except for ABI incompatibilities, all miscompilations seem to be fixed. There are some unimplemented features, but those are not very important for most use cases.
- issue #381: Make rustc test suite pass
Many intrinsics remain unimplemented.
- issue #171: std::arch SIMD intrinsics
There are many remaining ABI incompatibilities. I will need to rework cg_clif to reuse
- #10: C abi compatability
Cleanup during stack unwinding on panics
Cranelift currently doesn’t have support for cleanup during stack unwinding.
- wasmtime#1677: Support cleanup during unwinding
Atomic instructions are currently emulated using a global lock. This is very inefficient and only works when pthreads is available. The new style backends for Cranelift support native atomic instructions. There are several missing features that need to be implemented before I can switch cg_clif to the new style backends.
- wasmtime#2077: Implement Wasm Atomics for Cranelift/newBE/aarch64.
- wasmtime#2149: This patch fills in the missing pieces needed to support wasm atomics…
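To illustrate why the global lock approach is inefficient, here is a minimal sketch of an atomic fetch-add emulated that way. It uses std::sync::Mutex as a stand-in for the pthread mutex, and the names are made up; it is not the code cg_clif generates.

```rust
use std::sync::Mutex;

/// A single lock shared by every emulated atomic operation in the program.
static GLOBAL_ATOMIC_LOCK: Mutex<()> = Mutex::new(());

/// Emulated atomic fetch-add: returns the old value and stores old + val.
///
/// Safety: `ptr` must be valid for reads and writes, and all other accesses
/// to it must go through functions that take the same global lock.
unsafe fn emulated_fetch_add(ptr: *mut u32, val: u32) -> u32 {
    // Every atomic operation on every address has to wait for this lock.
    let _guard = GLOBAL_ATOMIC_LOCK.lock().unwrap();
    let old = unsafe { ptr.read() };
    unsafe { ptr.write(old.wrapping_add(val)) };
    old
}
```

Two threads that update completely unrelated atomics still contend on the same lock, and each operation pays for a lock and an unlock. A native atomic instruction avoids both.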
The plan for integration with rustc was to use git subtree. This git command currently has a bug for which a fix has not yet been upstreamed. It would be nice if for example git submodule could be used for the time being instead.
- rust-clippy#5565: git subtree crashes: can’t sync rustc clippy changes into rust-lang/rust-clippy
- compiler-team#270: Integration of the Cranelift backend with rustc
While there have been several PRs by other people like @osa1, @vi, @spastorino and @CohenArthur, I am the only person who has contributed more than a few changes to cg_clif.
How can I help?
The easiest way to help is by trying to compile and run any project and reporting any issues. You could also try to fix one of the above issues or any other issues in the issue tracker. They are not easy though. Contributing to Cranelift will also help with cg_clif.
I would like to thank each and every person that has supported me while working on cg_clif for the past two years, whether by contributing, donating or simply mentioning cg_clif.
I would also like to thank @eddyb and @cfallin for reviewing a draft of this post.