I wrote a very rudimentary piece of code that checks whether a number is prime, once in Rust (compiled to WASM) and once in JavaScript, to benchmark arithmetic performance.

I fully expected Rust/WASM to blow away JavaScript. In all the other arithmetic benchmarks I've done, Rust/WASM has the edge over JavaScript or at least matches it. In this test, however, JavaScript heavily outperforms WASM, and I don't have an explanation for why.

Rust Code:

use wasm_bindgen::prelude::*;

pub fn calculate_is_prime_rs(number: u64) -> bool {
    if number == 1 {
        return false;
    }
    if number == 2 {
        return true;
    }
    for i in 2..number {
        if number % i == 0 {
            return false;
        }
    }
    return true;
}

#[wasm_bindgen]
pub fn bench_rs(max: u64) -> u64 {
    (1..=max).map(|n| calculate_is_prime_rs(n) as u64).sum()
}

JavaScript code:

function calculateIsPrime(number) {
    if (number === 1) {
        return false;
    }
    if (number === 2) {
        return true;
    }
    for (let i = 2; i < number; i++) {
        if (number % i === 0) {
            return false;
        }
    }
    return true;
}

function bench_js(max) {
    let tot = 0;
    for (let n = 1; n <= max; n++) {
        tot += calculateIsPrime(n);
    }
    return tot;
}

let max = 200000;
console.log(`Amount of primes under ${max} is ${bench_js(max)}`);

Basic sample project: https://github.com/Mcluky/Stack-Overflow-Rust-Wasm-Performance-Example

Things I've already checked/done:

  • I made sure to always pass the --release flag when building the Rust code.
  • I ran the Rust code natively on my machine, where it is a lot faster than both JS and WASM, so I suspect the problem is related to the WASM target.
  • I replaced the integer type (u64) with a floating-point type (f64) in Rust, since JavaScript uses floats, but got similar results.
  • I repeated the test over many iterations to make sure the results are consistent.
  • I tried a while loop instead of for ... in in the Rust version, in case the range loop wasn't as well optimized as you'd think.
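
For the native comparison mentioned in the list above, a minimal timing sketch could look like the following. It assumes plain `std::time::Instant` timing and a best-of-N measurement. `is_prime`, `bench_native`, and `time_best_of` are hypothetical names used only for illustration, and the prime test is a copy of the function from the question so that the block is self-contained.

```rust
use std::time::{Duration, Instant};

/// Copy of the trial-division test from the question (hypothetical short name).
fn is_prime(number: u64) -> bool {
    if number == 1 {
        return false;
    }
    if number == 2 {
        return true;
    }
    for i in 2..number {
        if number % i == 0 {
            return false;
        }
    }
    true
}

/// Counts the primes in 1..=max, like the WASM export but without wasm_bindgen.
fn bench_native(max: u64) -> u64 {
    (1..=max).map(|n| is_prime(n) as u64).sum()
}

/// Runs `f` `runs` times; returns the last result and the fastest wall-clock time.
fn time_best_of<F: FnMut() -> u64>(runs: u32, mut f: F) -> (u64, Duration) {
    let mut best = Duration::MAX;
    let mut result = 0;
    for _ in 0..runs {
        let start = Instant::now();
        result = f();
        best = best.min(start.elapsed());
    }
    (result, best)
}
```

Calling `time_best_of(5, || bench_native(200_000))` from `main` in a `--release` build and printing the returned pair gives a native baseline to set against the browser timings. Taking the fastest of several runs is only one possible choice; a median would work as well.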
Luke
  • Thanks for the hint. Unfortunately the while loop doesn't noticeably improve the performance :/ It would have surprised me if it had. – Luke Jul 26 '22 at 09:46
  • This is most likely an issue with how you do the benchmarking, and so you should provide that code as well. For example, if the benchmarking code is in JS and you call the wasm function from JS, crossing the wasm-JS boundary twice on every iteration could easily dwarf the runtime of the function itself. – glennsl Jul 26 '22 at 09:57
  • @T.J.Crowder "_How are you doing that?_" -> There might of course be better ways to test this but I do two things. I individually test the method multiple times and I also run it in an iteration over a given amount of times so it runs for a couple of seconds or even minutes. Same Browser, same system,... – Luke Jul 26 '22 at 10:00
  • How are you benchmarking? with a stopwatch? counting in your head? an hourglass? and what are the results. Not in "this is slower than that" .. but actual figures from your benchmarks – Jaromanda X Jul 26 '22 at 10:02
  • I am aware that crossing the bridge might produce some overhead, that's why I am letting it run for multiple seconds or even minutes. But I get results where JavaScript is consistently double the speed (30s compared to 60s). You could almost measure this with an hourglass ;) I don't think crossing the bridge is causing this big of a difference. I also don't experience this issue in the other arithmetic benchmarks that I am doing... It's really just this one test. Unfortunately, I can't share the entire code since it's work related but I can share a basic gist later. – Luke Jul 26 '22 at 10:21
  • @JaromandaX I scraped together a quick example (I didn't bother to remove the template stuff, I hope that's alright): https://github.com/Mcluky/Stack-Overflow-Rust-Wasm-Performance-Example I still get results where JavaScript is about double the speed (see my comment on the answer from orlp) – Luke Jul 26 '22 at 13:46

1 Answer

I cannot reproduce your results on a Ryzen Threadripper 2950X running Windows 10. I added the following functions:

#[wasm_bindgen]
pub fn bench_rs(max: u64) -> u64 {
    (1..=max).map(|n| calculate_is_prime_rs(n) as u64).sum()
}

function bench_js(max) {
    let tot = 0;
    for (let n = 1; n <= max; n++) {
        tot += calculateIsPrime(n);
    }
    return tot;
}

I then compiled with wasm-pack build --release --target web and evaluated in Google Chrome:

> console.time("rs"); console.log(bench_rs(BigInt(200000))); console.timeEnd("rs");
17984n
rs: 6015.033935546875 ms

> console.time("js"); console.log(bench_js(200000)); console.timeEnd("js");
17984
js: 6017.426025390625 ms

And in Firefox:

> console.time("rs"); console.log(bench_rs(BigInt(200000))); console.timeEnd("rs");
17984n
rs: 6076ms - timer ended

> console.time("js"); console.log(bench_js(200000)); console.timeEnd("js");
17984
js: 6074ms - timer ended
orlp
  • Thank you very much for the effort you made to measure the performance as well! Unfortunately I can't replicate your results. I set up a very basic project with the same setup that you have, but I still get results where JS is on average almost twice as fast as Rust/WASM (in my case JS: ~8s, Rust: ~15s on Linux, Xeon 8260, 4 cores). I scraped together a quick example (I didn't bother to remove the template stuff, I hope that's alright): https://github.com/Mcluky/Stack-Overflow-Rust-Wasm-Performance-Example – Luke Jul 26 '22 at 13:42
  • Can you see something obvious that I might have done wrong? Your results seem oddly close to each other... Do you mind checking if you're not actually calling JS twice? – Luke Jul 26 '22 at 13:44
  • @Mcluky I have downloaded and run your code, with exactly the same results: https://i.imgur.com/pgDIP8v.png – orlp Jul 26 '22 at 13:49
  • What the... https://imgur.com/a/WCXFWaw I really don't understand this... – Luke Jul 26 '22 at 13:52
  • Do you mind sharing what Rust version you are on? – Luke Jul 26 '22 at 14:09
  • @Mcluky Rust 1.62.1 – orlp Jul 26 '22 at 14:25
  • FWIW, I also get essentially identical times for the Rust and JS versions, in my case about 5600ms for either version whether I run it in Chrome's V8 or Firefox's SpiderMonkey. Linux x64. – T.J. Crowder Jul 26 '22 at 15:19
  • Do you have any idea as to what the reason for this might be...? I've now tried it on multiple machines (windows and linux) and even built the project on a windows computer with the same version of rust. I'm really clueless at this point... Can maybe one of you send me their binary wasm file so that I can compare it to mine? In theory, it should be the same right? – Luke Jul 26 '22 at 16:34
  • @Mcluky - [Here's my .wasm converted to .wat](https://pastebin.com/zfe5YbYE) by [this online tool](https://webassembly.github.io/wabt/demo/wasm2wat/). Perhaps if you convert yours to .wat, comparing them may help. Hope you figure it out! – T.J. Crowder Jul 27 '22 at 08:14
  • @T.J.Crowder Thank you very much for providing the wasm file! Not surprisingly, the content is exactly identical to what I'm getting. What makes things even more interesting, though, is that someone reported even better results for Rust/WASM on an M1 Mac: https://twitter.com/iambenwis/status/1552311602094809089?t=rwBa9MOOFgVSbRpS4CC8Gg&s=19 I think this might actually be heavily CPU-architecture dependent, but I'll test this out a bit more. – Luke Jul 27 '22 at 18:38
  • As another data point for the CPU architecture difference, on my desktop PC which has an AMD CPU, I get nearly identical times for JS and WASM. However, on my laptop which has an Intel Xeon CPU, WASM takes twice as long as JS. Interestingly, the WASM time varies significantly on the Intel CPU, sometimes taking more than *three* times as long as JS. The JS execution time only has a variation of a few hundred milliseconds. – Herohtar Jul 27 '22 at 19:42
  • @Herohtar - Interesting. My "basically the same" results above are on an AMD Ryzen 7 PRO running Linux. I tried on an Intel Core i7 machine running Windows and got wildly different numbers -- about 4600-4900ms for JavaScript, but 12800-15300ms for Wasm. – T.J. Crowder Jul 28 '22 at 09:20
  • @Mcluky - See above. But also, the github code you provided is using `u64`, not `f64`. You said you'd tried `f64`, but did you do it everywhere (including the loop vars)? It seems to involve a fair number of changes. (I tried, but ran into Rust issues I don't know how to solve, because I don't do Rust/Wasm [yet!].) And similarly, did you try a `while` rather than the `sum`? (I'm basically just trying to make things as like-for-like as possible.) :-) – T.J. Crowder Jul 28 '22 at 09:42
  • All of that said: you want idiomatic Rust code to perform well on Intel CPUs. :-) – T.J. Crowder Jul 28 '22 at 09:43
  • @T.J.Crowder It's a bit difficult for me to test code changes right now because I only have access to Intel CPU's right now. I'm trying to organize/rent different kinds of CPUs and run all my arithmetic benchmarks again. I suspect that the results may vary for each of them as well. I'm considering writing a blog post as soon as I have a few results together because I have not seen this documented so far. I will keep you updated! – Luke Jul 28 '22 at 10:22
  • @T.J.Crowder Can you maybe reopen the question so that I can summarize my results and answer it? That way, future developers who stumble upon this might find an explanation more easily. – Luke Jul 28 '22 at 10:24
  • @Mcluky - *"It's a bit difficult for me to test code changes right now because I only have access to Intel CPU's right now."* That seems okay, since that seems to be where the problem is. *"...so that I can summarize my results and answer it"* Have you found an answer then? (I can't unilaterally reopen the question, I can just vote to reopen it. Which I just tried to do but the site says I've already done that. Weird.) – T.J. Crowder Jul 28 '22 at 10:55
  • @Mcluky can you summarize your results please? – WPWoodJr Feb 07 '23 at 21:11
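
The like-for-like variant asked about in the comments above (`f64` everywhere, including the loop variables, and a `while` loop instead of `sum`) could look roughly like the sketch below. It is not taken from the linked repository: `calculate_is_prime_f64` and `bench_rs_f64` are hypothetical names, and the `#[wasm_bindgen]` attribute is left out so the block compiles on its own. It would need to be added to `bench_rs_f64` to export it the same way as `bench_rs`.

```rust
/// Trial division using only f64 values, mirroring the JavaScript version.
pub fn calculate_is_prime_f64(number: f64) -> bool {
    if number == 1.0 {
        return false;
    }
    if number == 2.0 {
        return true;
    }
    // The loop counter is an f64 too, as in JS where every number is a double.
    let mut i = 2.0;
    while i < number {
        if number % i == 0.0 {
            return false;
        }
        i += 1.0;
    }
    true
}

/// Counts primes in 1..=max with a plain while loop instead of map/sum.
pub fn bench_rs_f64(max: f64) -> f64 {
    let mut tot = 0.0;
    let mut n = 1.0;
    while n <= max {
        if calculate_is_prime_f64(n) {
            tot += 1.0;
        }
        n += 1.0;
    }
    tot
}
```

With an `f64` parameter the export would presumably be callable with an ordinary JavaScript number rather than a `BigInt`, which also removes one difference between the two call sites. Whether this changes the timings on Intel CPUs is not established by this thread.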