1

I've written a prime finder in C++ that writes primes into a .txt file, but it crashed after finding the 102,144,001st prime (2,084,058,601). It also stores the found primes in a vector, so that it only has to divide each candidate by the primes up to its square root instead of by every smaller number. It resizes the vector that stores the primes after every (10,240n+1)th prime found, and 102,144,001 is of the form 10,240n+1, so it crashed while resizing. I use a vector of unsigned __int64, so it was using about 780 MB when it crashed, but I have 8 GB of RAM. Should I use vector.reserve()? (I don't want another crash after 50 minutes...)

user1792131
  • 32-bit or 64-bit operating system? – Vaughn Cato Nov 23 '12 at 22:43
  • 7
    Try using a deque, it'll probably be much faster as well and I suspect memory allocator limits. – jthill Nov 23 '12 at 22:46
  • 1
    If you know up ahead that you are using huge amounts of memory, allocate all at once, e.g. use `vector::reserve`. The benefit of `reserve` over `resize` is that the behavior of `back` and `push_back` still makes sense instead of maintaining the current bound manually. – pmr Nov 23 '12 at 22:47
  • 1
    You can do some tests where you just store arbitrary values in your vector to see how much it can hold so you won't spend as much time by actually finding the primes. – Vaughn Cato Nov 23 '12 at 22:48
  • @pmr, I store the primes in the vector, so I can't allocate all at once, since I don't know how many primes will be between a and b, but reserve is still a good idea! – user1792131 Nov 23 '12 at 22:56
  • 64bit system, but how do you compile? The default of Visual Studio is 32-bit compilation! – CygnusX1 Nov 23 '12 at 23:01
  • [Note that the sieve of Eratosthenes that you seem to be using is not the most efficient prime-finding algorithm.](http://en.wikipedia.org/wiki/Sieve_of_Atkin) Might be of interest if you're calculating such large amounts of them. – leftaroundabout Nov 23 '12 at 23:07
  • @CygnusX1 I think it's 64bit, because I can use numbers bigger than 2^32-1, and ~780mb is far less than 2 or 4GB. I use Visual studio 2012 to compile. And I've compiled in release mode. – user1792131 Nov 23 '12 at 23:08
  • 1
    You do not need to store that many numbers. Because it is enough to check up to the square root of N, primes up to 1e9 would be enough for every American. – Pavel Radzivilovsky Nov 23 '12 at 23:08
  • @leftaroundabout, I've heard of the Sieve of Atkin, but I'm a beginner, so I wanted to start with something easy. I don't use the Sieve of Eratosthenes; I use a simple algorithm that checks every (second) number for primality (it's not a sieve), and it just stores the primes in an array so it won't have to divide by every number (it is a lot faster (still slow) than dividing by every number). – user1792131 Nov 23 '12 at 23:16
  • 1
    @PavelRadzivilovsky it's a brilliant idea, thanks! – user1792131 Nov 23 '12 at 23:17
  • __int64 is 64-bit by definition in both 32-bit and 64-bit builds, and works on a 32-bit CPU. Reallocate a few times and you're going to get memory fragmentation in a 32-bit exe, so running out of address space at 700 MB or less is quite possible. For reference, with a 32-bit exe the DLLs load into the process address space, leaving you with 1.4 - 1.5 GB if it's all allocated with no resizing. Do a 64-bit compilation and all of these problems go away. – camelccc Nov 24 '12 at 00:01

3 Answers

3

Well, available memory is not the same as usable memory.

cppreference says that a vector's elements are stored contiguously. Therefore, it's not a question of how much RAM you have, but of the largest contiguous block of memory available while the program is running. Since a vector must be contiguous, it cannot expand beyond the largest available section of contiguous memory.

As a result you may also run into problems when the vector requires resizing. If the vector is adjacent to a large block of contiguous memory it can expand into, then great; however, if the vector is in too small a block, then it must copy itself to a new location... this takes time.

The deque container may be a better choice as it does not require contiguous storage locations. This allows you to utilize more of your available RAM and avoids costly copy operations during resizing.

(If you're worried about whether the standard guarantees that a vector is contiguous (thus potentially leading to the OP's problems), reading this and this may help shed light on the matter.)

Richard
1

There are only 4730 primes less than the square root of 2084058601. If you are running out of memory while storing that small amount of data, you're doing something wrong.

I recently wrote a prime generator and used it to sum the first billion primes. It uses a segmented Sieve of Eratosthenes, and adds additional sieving primes at the beginning of each segment. Memory use is constant for the size of the bit array and grows slowly for the sieving primes. You might want to take a look at it.

user448810
0

Honestly, I doubt that this is your problem. On a 64-bit OS you should have plenty of address space, so you should only run into problems when you run out of swap.

I would guess your problem is a fencepost error. The fact that you even think you need to resize is suspect; plain push_back should be more than enough.

The details of your crash are probably worth looking at. You might also want to look at what exceptions std::vector throws when it cannot allocate memory. Another concern is that I/O can be much slower than computation; if you know to the last digit where you crashed but not how you crashed, maybe you are doing too much printing.

Yakk - Adam Nevraumont