My prime number sieve is extremely slow even with --release

Question

I have looked at multiple answers online to the same question but I cannot figure out why my program is so slow. I think it is the for loops but I am unsure.

P.S. I am quite new to Rust and am not very proficient in it yet. Any tips or tricks, or any good coding practices that I am not using are more than welcome :)

math.rs

pub fn number_to_vector(number: i32) -> Vec<i32> {
    let mut numbers: Vec<i32> = Vec::new();

    for i in 1..number + 1 {
        numbers.push(i);
    }

    return numbers;
}

user_input.rs

use std::io;

pub fn get_user_input(prompt: &str) -> i32 {
    println!("{}", prompt);

    let mut user_input: String = String::new();

    io::stdin().read_line(&mut user_input).expect("Failed to read line");

    let number: i32 = user_input.trim().parse().expect("Please enter an integer!");

    return number;
}

main.rs

mod math;
mod user_input;

fn main() {
    let user_input: i32 = user_input::get_user_input("Enter a positive integer: ");
    let mut numbers: Vec<i32> = math::number_to_vector(user_input);
    numbers.remove(numbers.iter().position(|x| *x == 1).unwrap());
    let mut numbers_to_remove: Vec<i32> = Vec::new();

    let ceiling_root: i32 = (user_input as f64).sqrt().ceil() as i32;

    for i in 2..ceiling_root + 1 {
        for j in i..user_input + 1 {
            numbers_to_remove.push(i * j);
        }
    }

    numbers_to_remove.sort_unstable();
    numbers_to_remove.dedup();
    numbers_to_remove.retain(|x| *x <= user_input);

    for number in numbers_to_remove {
        if numbers.iter().any(|&i| i == number) {
            numbers.remove(numbers.iter().position(|x| *x == number).unwrap());
        }
    }

    println!("Prime numbers up to {}: {:?}", user_input, numbers);
}

Well, for one you're doing a big ol' `O(m * n)` style loop since `numbers` is a vector. I'd make `numbers_to_remove` a set, and filter `numbers` using that. — AKX, Sep 11 '22 at 20:36
Come to think of it, I don't see a reason for `numbers` to be a real `vec` at all... — AKX, Sep 11 '22 at 20:37
Since the code works as expected, I think this question would be a better match would be [Code Review SE](https://codereview.stackexchange.com/). There are several problems with the code, but they all stem from trying to store a vector of primes and then removing non-primes. You should instead use a `Vec` to store, for each number, whether it is prime or not. — apilat, Sep 11 '22 at 20:48
Thank you AKX and @apilat for your comments! I will try the bool method instead! As I said I am new to Rust and am just using what I know from other languages such as C++ and Java! So my code might not be very rustonic ;) — figgyfarts, Sep 11 '22 at 21:14
@figgyfarts Your approach would be slow in C++ and Java as well. The problem is in the algorithm, not in the choice of language. Look up sieve of Eratosthenes on wikipedia for a description of the actual algorithm. — user4815162342, Sep 11 '22 at 21:32
from the algorithmic point of view, I believe that changing your inner loop to something like `for j in i..(user_input/i) + 1 {` should make your code much faster, drastically bringing down its complexity too. — Will Ness, Sep 13 '22 at 13:23
yeah, it wasn't enough. the code did run faster, but still [was quadratic](https://ideone.com/z0gSeY), [empirically](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth). the `numbers_to_remove` removal loop looks like it's to blame. I don't know Rust; in C it's easy to do in linear time by advancing along both arrays simultaneously, and writing out the remaining numbers into a new array, the result. or using linked lists where the removal is actually O(1). (the `retain` line isn't needed, with the `N/i` fix, too) — Will Ness, Sep 13 '22 at 19:49
@AKX I've posted an answer which fixes the two loops; the resulting code now runs at algorithmically correct complexity, exhibiting N^1.05 growth rate from 1 to 2 million upper limit. check it out if you feel like it. :) — Will Ness, Sep 13 '22 at 23:45
@apilat even without changing the types, the code can be fixed with relatively small changes. see my answer. :) — Will Ness, Sep 13 '22 at 23:46
@user4815162342 the approach was fine actually, just its coding had to be fixed -- the removal of composites was quadratic, and the enumeration of composites was n^1.5 , whereas both can be linear, as I show in [my answer](https://stackoverflow.com/a/73710084/849891) (well, maybe the enumeration is n log n, I dunno, but that's the inherent math). the fixed code [runs at a fair speed](https://ideone.com/p2sxGl) and a very nice empirical complexity of just `N^1.05`, in the range of 1M...2M upper limit. Yes, typical C++ code with bit array would be much faster in absolute terms. — Will Ness, Sep 15 '22 at 08:13
@WillNess I don't disagree with what you said, I'd just like to point out that "typical C++ code with bit arrays" would as well be typical Rust code. Rust is a systems language, not a pure-fp one, and there is no requirement to solve problems with iterators, filters, etc. We have arrays as well and are not afraid to use them. :) — user4815162342, Sep 15 '22 at 08:28
@user4815162342 all right, this same approach can be coded with that just as well. the algorithm will stay the same. my point was, it _is_ actually a sieve, and pretty close to the Eratosthenes' (i.e. much better than e.g. trial division). so the problem wasn't with the algo, but with the specifics of coding it up. but I guess you've already agreed with that. :) — Will Ness, Sep 15 '22 at 09:02
@WillNess I do agree. The flip side of that coin is that sometimes people call various elegant but inefficient sieves "sieve of Eratosthenes" even when the actual algorithm is very different. This especially happens in the FP world, so much that there's a [whole paper](https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf) devoted to debunking these claims. I haven't studied the OP's code in detail, so I went ahead and assumed it's another example of that confusion. Since you _have_ studied it and determined it's actually pretty close to the real SE, I have no reason to disbelieve you. — user4815162342, Sep 15 '22 at 09:09
@user4815162342 funny you should [mention that paper](https://stackoverflow.com/questions/73710706/can-we-improve-upon-this-primes-sieve-code-from-sicp/73710707#comment130166768_73710707)... :) — Will Ness, Sep 15 '22 at 09:11

Will Ness · Answer 1 · 2022-09-15T08:01:07.097

There's two main problems in your code: the i * j loop has wrong upper limit for j, and the composites removal loop uses O(n) operations for each entry, making it quadratic overall.

The corrected code:

fn main() {
    let user_input: i32 = get_user_input("Enter a positive integer: ");
    let mut numbers: Vec<i32> = number_to_vector(user_input);
    numbers.remove(numbers.iter().position(|x| *x == 1).unwrap());
    let mut numbers_to_remove: Vec<i32> = Vec::new();
    let mut primes: Vec<i32> = Vec::new();     // new code
    let mut i = 0;                             // new code

    let ceiling_root: i32 = (user_input as f64).sqrt().ceil() as i32;

    for i in 2..ceiling_root + 1 {
        for j in i..(user_input/i) + 1 {       // FIX #1:  user_input/i
            numbers_to_remove.push(i * j);
        }
    }

    numbers_to_remove.sort_unstable();
    numbers_to_remove.dedup();
    //numbers_to_remove.retain(|x| *x <= user_input);   // not needed now
    
    for n in numbers {                         // FIX #2:
        if n < numbers_to_remove[i] {          //   two linear enumerations
            primes.push(n);              
        }
        else {
            i += 1;                            //   in unison
        }
    }
    
    println!("Last prime number up to {}: {:?}", user_input, primes.last());
    println!("Total prime numbers up to {}: {:?}", user_input, 
         primes.iter().count());
}

Your i * j loop was actually O( N^1.5), whereas your numbers removal loop was actually quadratic -- remove is O(n) because it needs to move all the elements past the removed one back, so there is no gap.

The mended code now runs at ~ N^1.05 empirically in the 10⁶...2*10⁶ range, and orders of magnitude faster in absolute terms as well.

Oh and that's a sieve, but not of Eratosthenes. To qualify as such, the is should range over primes, not just all numbers.

Anwsonwsymous · Answer 2 · 2022-09-13T16:56:18.130

0

As commented AKX you function's big O is (m * n), that's why it's slow.

For this kind of "expensive" calculations to make it run faster you can use multithreading.

This part of answer is not about the right algorithm to choose, but code style. (tips/tricks)

I think the idiomatic way to do this is with iterators (which are lazy), it make code more readable/simple and runs like 2 times faster in this case.

fn primes_up_to() {
    let num = get_user_input("Enter a positive integer greater than 2: ");
    let primes = (2..=num).filter(is_prime).collect::<Vec<i32>>();

    println!("{:?}", primes);
}

fn is_prime(num: &i32) -> bool {
    let bound = (*num as f32).sqrt() as i32;
    *num == 2 || !(2..=bound).any(|n| num % n == 0)
}

Edit: Also this style gives you ability easily to switch to parallel iterators for "expensive" calculations with rayon (Link)

Edit2: Algorithm fix. Before, this uses a quadratic algorithm. Thanks to @WillNess

edited Sep 13 '22 at 16:56

answered Sep 11 '22 at 23:48

Anwsonwsymous

94
6

1

range expression in primes_up_to function should be `2..=num` – Amin Cheloh Sep 12 '22 at 07:25
Thank you @anwsonwsymous! I will try out this way of doing it and also look into rayon because that looks very interesting! – figgyfarts Sep 12 '22 at 11:26
@figgyfarts you can switch to rayon pretty easy, just add `to_par_iter()` after ranges eg. `(2..*num)` and `(2..=num)`. It should look like `(2..num).to_par_iter()` and so on... – Anwsonwsymous Sep 12 '22 at 14:18
@Anwsonwsymous Ok I added it and got a massive performance increase! Isn't the 2..=num supposed to be for the "let primes = (2..num).filter" part in primes_up_to? – figgyfarts Sep 12 '22 at 15:19
@Anwsonwsymous rayon is incredible! I'm going to use it to speed up all my projects that do number crunching :) – figgyfarts Sep 12 '22 at 15:20
@figgyfarts yes add that equal sign in `let primes = (2..=num)` part – Anwsonwsymous Sep 12 '22 at 16:47
1

looks like your `is_prime` uses a quadratic algorithm, testing each `num` by all numbers below it. only those not above `num`'s square root are actually needed there. – Will Ness Sep 12 '22 at 23:43
@WillNess As I mentioned this is more about style not algorithm. But you are right thanks! – Anwsonwsymous Sep 13 '22 at 15:34
oh-kay, now it runs in N^1.5, approximately. which is much better, and it *is* the limit for trial division, but sieves can be much better. :) by the way, I've fixed the OP code, now it runs at N^1.05, from 1000000 to 2000000 upper limit. (also, about 300x faster for 100000 than before). I've posted an answer with the fixed code -- the main problem was the numbers removal loop. – Will Ness Sep 14 '22 at 01:31

My prime number sieve is extremely slow even with --release

2 Answers2