How to prevent function calls from being optimized away?

Question

How can I ensure that a function with no side effects gets executed and doesn't get optimized away in stable Rust?

Is there an attribute combination I could use, or must I call another function with side effects? In the case where a function call is necessary, does the Rust standard library provide a cheap function that is guaranteed not to be optimized away?

Why would you NOT want a function call with no side effect to be optimized away? If it's useless, good riddance! — Matthieu M., Mar 19 '17 at 19:19
For the same reasons underlying https://stackoverflow.com/questions/7083482/how-to-prevent-gcc-from-optimizing-out-a-busy-wait-loop ; Side-effect-less functions are not always without side effects. Constant-time functions (e.g. busy-waits) are a prominent example of this. Space-bar-activated heater programs are another. — Doe, Mar 19 '17 at 21:46
I know, the question is more what is **your** usecase? For example, achieving constant-time functions or busy-waiting are two very different goals. Constant-time functions in general require assembly. — Matthieu M., Mar 20 '17 at 07:44
busy-wait loops and constant-time functions, while different in goals, need the same basic feature: to control compiler assumptions. As for specifics of my usecase, ignore it. I argue that focusing on the asker's task rather than the question at hand runs counter to the reasons behind this site's existence, and is a genuine disservice to those who come later expecting an answer (something I've been bitten by many times with rust questions). I think rust users should aim to have questions and answers of quality that you expect when searching for C. I'm tired of asker-specific workarounds. — Doe, Mar 20 '17 at 08:43
*I think rust users should aim to have questions and answers of quality that you expect when searching for C. I'm tired of asker-specific workarounds.* => However, sometimes there just is a MUCH better answer for a specific problem than there is for a generic problem. For constant time functions, as used in cryptography, attempting to divine the number of clock cycles that the resulting optimized assembly would get is just plain unfeasible (and brittle), which is why constant time functions are written from assembly blocks and manually optimized. — Matthieu M., Mar 20 '17 at 09:45
What about developers who want to implement busy-loops and space-bar-heaters in rust? I understand where you're coming from - I genuinely do, but I didn't ask how to implement cryptographic constant-time functions. People who inevitably end up on this page won't be looking for this information either (well, now they might, seeing that this discussion devolved towards crypto). The page I linked to above demonstrates what people like me seek: A question with a nice direct answer. Again, I appreciate what you and the rust community are trying to do, but I believe it's causing more harm than good. — Doe, Mar 20 '17 at 10:08
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/138539/discussion-between-doe-and-matthieu-m). — Doe, Mar 20 '17 at 13:20

Leonora Tindall · Answer 1 · 2017-03-20T18:31:13.890

#[no_mangle] will currently do this, but that may change.

#[no_mangle]
pub fn do_what_i_say_dammit(x: i64) -> i64 { x*x }

To clarify (from that post):

My mental model is that symbols are owned by rustc by default (e.g., if the symbol is private, rustc can emit a differently-typed "arg-promoted" symbol instead of the expected one, as long as it handles it correctly), and #[no_mangle] transfers ownership of the symbol to the programmer.

Now, because ownership is transferred to the programmer, rustc's unspecified symbol mangling scheme can't be used, so the symbol is left unmangled.

Having an unmangled rustc-owned symbol makes pretty much no sense (you can't actually use it because the compiler owns it) - so #[no_mangle] implies #[linker_owned]. There is no loose #[linker_owned] because nobody implemented it.

Edit: Here's a simple example.

It doesn't seem to force a call at all. See is.gd R6i36H — Doe, Mar 19 '17 at 22:13

score 2 · Answer 2 · answered Mar 19 '17 at 22:28

There is test::black_box() (link to old docs) which is still unstable (as is the whole test crate). This function takes a value of an arbitrary type and returns the same value again. So it is basically the identity function. "Oh well, now that's very useful, isn't it?" you might ask ironically.

But there is something special: the value which is passed through is hidden from LLVM (the thing doing nearly all optimizations in Rust right now)! It's truly a black box as LLVM doesn't know anything about a piece of code. And without knowing anything LLVM can't prove that optimizations won't be changing the program's behavior. Thus: no optimizations.

How does it do that? Let's look at the definition:

pub fn black_box<T>(dummy: T) -> T {
    // we need to "use" the argument in some way LLVM can't
    // introspect.
    unsafe { asm!("" : : "r"(&dummy)) }
    dummy
}

I'd be lying if I were to pretend I understand this piece of code completely, but it goes something like this: we insert empty inline assembly (not a single instruction) but tell Rust (which tells LLVM) that this piece of assembly uses the variable dummy. This makes it impossible for the optimizer to reason about the variable. Stupid compiler, so easy to deceive, muhahahaha! If you want another explanation, Chandler Carruth explained the dark magic at CppCon 2015.

So how do you use it now? Just use it for some kind of value... anything that goes through black_box() needs to be calculated. How about something like this?

black_box(my_function());

The return value of my_function() needs to be calculated, because the compiler can't prove it's useless! So the function call won't be removed. Note however, that you have to use unstable features (either the test crate or inline asm to write the function yourself) or use FFI. I certainly wouldn't ship this kind of code in a production library, but it's certainly useful for testing purposes!
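The `asm!("" : : "r"(&dummy))` form quoted above is the old LLVM-style inline assembly syntax, which current compilers no longer accept. A rough equivalent using the `asm!` macro that was stabilized in Rust 1.59 might look like the sketch below. The names `asm_black_box` and `square` are made up for illustration, and the code only builds on architectures that stable inline assembly supports (x86-64 and AArch64, among others).

```rust
use std::arch::asm;

/// Hypothetical stand-in for a function without side effects.
fn square(x: u64) -> u64 {
    x * x
}

/// Sketch of a black box written with the stable `asm!` syntax.
fn asm_black_box<T>(dummy: T) -> T {
    unsafe {
        // The template is only an assembler comment, so no instruction is
        // emitted. Passing the address as an input operand, without the
        // `nomem` or `readonly` options, means the compiler has to assume
        // the assembly may read or write the value behind the pointer.
        asm!(
            "/* {0} */",
            in(reg) &dummy as *const T as *const u8,
            options(nostack, preserves_flags),
        );
    }
    dummy
}

fn main() {
    // The result of square() is routed through the black box, so the
    // optimizer cannot treat it as unused.
    let result = asm_black_box(square(7));
    println!("{result}");
}
```

This only hides the value from the optimizer. It says nothing about how many instructions the call itself takes after inlining.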

trying to compile `asm!(..)` directly, gives `expected token: ','` .. any idea why? — d9ngle, Jul 08 '23 at 19:50

score 2 · Answer 3 · answered Nov 14 '21 at 21:58

As long as std::hint::black_box is still unstable, you could implement something similar (but probably less efficient) yourself using std::ptr::read_volatile. As with std::hint::black_box, calling this with an otherwise unused value should prevent that value from being optimized away:

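fn black_box<T>(dummy: T) -> T {
    unsafe {
        // Volatile read: the compiler must assume the value is observed.
        let ret = std::ptr::read_volatile(&dummy);
        // read_volatile made a bitwise copy, so skip dummy's destructor
        // to avoid a double drop for non-Copy types.
        std::mem::forget(dummy);
        ret
    }
}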
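A usage sketch for the volatile-read approach follows, using a self-contained variant of the wrapper that also calls `std::mem::forget` so that non-Copy values such as `String` are not dropped twice. The names `volatile_black_box` and `busy_work` are hypothetical and exist only for this example.

```rust
use std::ptr;

/// Variant of the wrapper that is also sound for types with destructors.
fn volatile_black_box<T>(dummy: T) -> T {
    unsafe {
        // Bitwise copy through a volatile read, which the optimizer
        // is not allowed to elide.
        let ret = ptr::read_volatile(&dummy);
        // The copy now owns the value, so the original must not be dropped.
        std::mem::forget(dummy);
        ret
    }
}

/// Hypothetical side-effect-free function that would otherwise be a
/// candidate for removal when its result is unused.
fn busy_work(n: u32) -> u32 {
    (0..n).fold(0u32, |acc, i| acc.wrapping_add(i))
}

fn main() {
    let total = volatile_black_box(busy_work(1_000));
    println!("{total}");

    // Works with heap-owning types as well.
    let s = volatile_black_box(String::from("still valid"));
    println!("{s}");
}
```

The volatile read forces the value to be materialized in memory, which is the likely source of the "less efficient" caveat above.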

score 1 · Answer 4 · answered Jul 09 '23 at 05:40

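Since Rust 1.66, core::hint::black_box is available on stable. core::ptr::read_volatile also works, but it must be called inside an unsafe block:

core::hint::black_box(&dummy)
unsafe { core::ptr::read_volatile(&dummy) }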
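As a minimal sketch of the stable route (Rust 1.66 or later), the example below passes both the input and the result of a pure function through `std::hint::black_box`. The function `square` is a hypothetical placeholder. The standard library documents `black_box` as best-effort, so this discourages the optimizer from removing the call rather than strictly guaranteeing it.

```rust
use std::hint::black_box;

/// Hypothetical pure function whose call we want to keep.
fn square(x: i64) -> i64 {
    x * x
}

fn main() {
    // Hiding the input discourages constant folding of square(12).
    let input = black_box(12);
    // Hiding the output discourages removal of the otherwise unused result.
    let result = black_box(square(input));
    println!("{result}");
}
```

Wrapping only the result is often enough to keep the computation, but with a constant argument the compiler may still fold it into a single constant, which is why the input is wrapped too.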

d9ngle
