
For one of my applications I need to generate a vector of size 2^35 (the size of my RAM is 96 GB, so this vector can easily fit into RAM).

#include <cstdlib>   /* rand, srand */
#include <ctime>     /* time */
#include <cmath>     /* pow */
#include <vector>

using namespace std;

int main ()
{
  int i;

  /* initialize random seed: */
  srand (time(NULL));

  vector<int> vec;
  do {
     i = rand() % 10 + 1;
     vec.push_back(i);
  } while ((vec.size()*sizeof(int)) < pow(2,35));

  return 0;
}

However, I notice that my do-while loop runs seemingly forever. One possible reason is that the range of vec.size(), which is long unsigned int, is much smaller than the number of elements inserted, i.e. pow(2,35), which I think sends it into an infinite loop. I may be wrong; please correct me if so. In any case, can someone tell me how I can insert more than pow(2,35) numbers into vec?

gcc version: 4.8.2

Torsten Marek
Steg Verner
  • Are you sure it's going to infinity and not just taking very long? You could split that vector into smaller ones if `vec.size()` is really the problem. – Lord Zsolt Apr 23 '15 at 11:38
  • Is `sizeof(size_t)` on your machine 4? Otherwise, I'd expect it to take a while to insert a few billion elements... – Barry Apr 23 '15 at 11:39
  • Better to reserve space in that case. – Jarod42 Apr 23 '15 at 11:39
  • pow(2,35) is a double, so the comparison doesn't really make sense. vec is probably doing a lot of reallocating; try allocating the entire vector before pushing back elements. – Jaciq Apr 23 '15 at 11:39
  • @LordZsolt I am afraid I can not split. My application requires the entire vector (in one chunk) to be stored in RAM (it is possible given its size). Also why would inserting a lot of values in vector take too long...I am a little curious about this? – Steg Verner Apr 23 '15 at 11:40
  • vector is guaranteed to be contiguously allocated in memory, so when it grows beyond its current bounds it has to be reallocated, and this process is then repeated many times, the vector growing each time. – dutt Apr 23 '15 at 11:41
  • @Jarod42 How can I reserve space? – Steg Verner Apr 23 '15 at 11:41
  • If you know the number of elements in advance you should propably use [reserve](http://en.cppreference.com/w/cpp/container/vector/reserve) to avoid reallocating. And what is the value of std::numeric_limits::max()? – Philipp Lenk Apr 23 '15 at 11:42
  • @StegVerner: `vec.reserve(n)`. – Jarod42 Apr 23 '15 at 11:42
  • @dutt Ok how can I allocate it a very large chunk say of size 2^34 at once..so that it does not seem to be in an infinite loop..please guide – Steg Verner Apr 23 '15 at 11:42
  • Based on what you said in your description, you need a vector of size 2^35, and 2^35 = 2^3 * 2^32 ≈ 8 * 4 billion = 32 billion. So you want 32 billion entries. If each is a 4-byte int, you need around 128 GB of RAM. But in your code you set 2^35 as the limit of memory you want to use, already taking into account that an entry takes 4 bytes and not 1. Which one is right? – Fabio says Reinstate Monica Apr 23 '15 at 11:43
  • @StegVerner use the reserve call as previously mentioned by others, then you don't need reallocation. – dutt Apr 23 '15 at 11:45
  • @PhilippLenk Its 1.18973e+4932 – Steg Verner Apr 23 '15 at 11:45
  • @FabioTurati Even if I set limit to 2^34 still the code is taking infinite time – Steg Verner Apr 23 '15 at 11:47
  • Everything takes time. If you break it down to assembly level, you're looking at 5 operations (without counting operations for MOD, rand() and push_back()). At a 3.5Ghz processor, that's one minute. Now add to that rand() which is at least 10 operations long. That's nearly 3 minutes. You also got the MOD. You can also add another 5-10 operations for the loop condition. We're at 6 minutes already and I haven't even considered what push_back is doing. – Lord Zsolt Apr 23 '15 at 11:49
  • You could check the maximum size with `vec.max_size()`. – xuma202 Apr 23 '15 at 11:49
  • What is `sizeof(vec.size())`? If the answer is `4`, then that is 32 bits, and your attempt to multiply `vec.size()*sizeof(int)` will be constrained to 32 bits and can therefore never be greater than 2^35. You may need to force your program to 64 bits: `-m64` to g++, if I remember correctly. – Aaron McDaid Apr 23 '15 at 11:49
  • @AaronMcDaid The value of sizeof(vec.size()) is 8. – Steg Verner Apr 23 '15 at 11:52
  • If size is fixed, using a regular array may buy you some speed. It will still take some time, though. On multiple cores, you can also divide the work in threads to improve performance. – Not a real meerkat Apr 23 '15 at 11:58
  • "the vector can easily fit" - no it can't. Assuming 4 bytes per `int`, the final array will be 2^37 bytes, or 128GB. Then you multiply by `sizeof(int)` for some mysterious reason, so you're actually trying to allocate 512GB. By not reserving enough capacity first, you might need twice as much memory due to fragmentation. So it should easily fit into 1TB, but not 96GB. – Mike Seymour Apr 23 '15 at 12:01
  • @MikeSeymour Actually, the loop breaks when `vec.size() * sizeof(int) >= 2^35` <=> `vec.size() >= 2^35 / sizeof(int) = 2^33` (most likely), so he only needs `2^33 * sizeof(int) B = 2^35 B` (i.e. less than 34GB). – Baum mit Augen Apr 23 '15 at 12:34
  • Have you tried using `reserve()`, as others have said? And also, have you tried profiling a little? Which means: add a counter, increment it every time you complete a `push_back()`, and print something every 100 million times (or, go all the way and print a timestamp). Then, just check the rate at which you see each of these messages. I expect they will be quite fast at first, but then they will slow down. This would be without reserving. If, instead, you `reserve()`, I expect the rate will stay the same, but the problem might be that you can't get such a huge contiguous block of memory. – Fabio says Reinstate Monica Apr 23 '15 at 12:34
  • Have you tried putting an output statement (like writing the number of iterations done every 1e5 iterations or so to `std::cerr`) to determine if you have an actual infinite loop or if it just takes very long? 2^33 is a pretty big number. – Baum mit Augen Apr 23 '15 at 13:01
  • Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/76065/discussion-on-question-by-steg-verner-storing-many-elements-in-stdvector-c). – Taryn Apr 23 '15 at 13:27
  • @FabioTurati Thanks a lot. How can I find out whether reserve() has allocated the amount of memory I asked it to allocate, i.e. 34 GB? – Steg Verner Apr 23 '15 at 13:31
  • @StegVerner: What do you mean? It will have done. Because you asked it to and that's what it does. – Lightness Races in Orbit Apr 23 '15 at 14:24

3 Answers

3

I'll try to address some of your problems in a simple solution:

The first problem you have is space. Since you only need numbers from 1-10, an int8_t will serve you much better than an int.

The second is speed. std::vector does a lot of allocations and reallocations under the hood. Since you have a fixed size, in my opinion there's no need to use it. Knowing this, we'll use a simple array and threads to improve performance.

Here's the code:

#include <array>
#include <random>
#include <thread>
#include <cstdint>
#include <memory>
#include <chrono>

// Since you only need numbers from 1-10, a single byte will work nicely.
const uint64_t size = UINT64_C(0x800000000); // Exactly 2^35
typedef std::array<int8_t, size> vec_t;

// start is first element, end is one-past the last. This is a template so we can generate multiple functions.
template<unsigned s>
void fill(vec_t::iterator start, vec_t::iterator end) {
    static const int seed = std::chrono::system_clock::now().time_since_epoch().count()*(s+1);
    static std::default_random_engine generator(seed);
    static std::uniform_int_distribution<int> distribution(1,10); // char types such as int8_t are not valid distribution types
    for(auto it = start; it != end; ++it) {
        *it = static_cast<int8_t>(distribution(generator));  // generates a number in the range 1..10
    }
}

int main() {
    auto vec = std::unique_ptr<vec_t>(new vec_t());

    // Each will have its own generator and distribution.
    std::thread a(fill<0>, vec->begin(), vec->begin() + size/4);
    std::thread b(fill<1>, vec->begin() + size/4, vec->begin() + size/2);
    std::thread c(fill<2>, vec->begin() + size/2, vec->begin() + (size/4)*3);
    std::thread d(fill<3>, vec->begin() + (size/4)*3, vec->end());
    a.join();
    b.join();
    c.join();
    d.join();
    return 0;
}
Not a real meerkat
  • `int8_t vec[size];` This won't work (on most systems). You usually cannot get more than 8 MB this way. Using `std::vector` is the right thing to do, if you `reserve` the space before you start filling it, the number of allocations is exactly one (which is smaller than *"a lot"*). – Baum mit Augen Apr 23 '15 at 12:47
  • using a heap allocated `std::array` (+`scoped_ptr`) would be best imho. `reserve` does not work that well with threads, as `push_back` is not thread-safe. – smerlin Apr 23 '15 at 12:54
  • Prefer std::array to c-style array. – evan Apr 23 '15 at 12:54
  • ***No*** array is needed or appropriate here, neither the `arr[]` nor the `std::array` nor the `new[]` style. `std::vector` itself is not the problem here. – Baum mit Augen Apr 23 '15 at 12:57
  • @BaummitAugen: `vector` is not optimal. You cannot use push_back in threads. You would need to set the size before you use the threads, but in doing that you would need to specify values for the `vector` elements, which would cause unnecessary `memset` calls. – smerlin Apr 23 '15 at 13:01
  • @smerlin Alright, this might be correct. You could try things like [this](http://stackoverflow.com/questions/18669296/c-openmp-parallel-for-loop-alternatives-to-stdvector), but yeah, one would have to measure the overhead. Your solution might be good if you use parallel code. Maybe one should measure how long `std::vector::resize` takes too. I don't know. – Baum mit Augen Apr 23 '15 at 13:09
  • I edited the code to use the heap. If you allocate a `std::array` on the stack, its elements will be allocated on the stack as well. So now the code allocates the `std::array` on the heap instead. – smerlin Apr 23 '15 at 13:23
  • @smerlin that indeed looks much better. I was actually wondering why std::array couldn't use a custom allocator. It doesn't need one, of course :D. – Not a real meerkat Apr 23 '15 at 13:31
  • Oops, last comment was nonsense. You should seed your generators, though. – Baum mit Augen Apr 23 '15 at 16:00
  • @CássioRenan I think [`std::random_device`](http://en.cppreference.com/w/cpp/numeric/random/random_device) would be a better seed. Mersenne twister engines for example take quite a while to diverge for similar seeds. – Baum mit Augen Apr 23 '15 at 22:54
2

Why can't you use the constructor?

std::vector<int> vec ( number_of_elements );

That way you'll have the memory reserved; then you can randomize the elements using std::generate or something similar.

Dino
  • I believe it's faster to use vec.reserve and then push_back/emplace_back, as you're going to have to modify every existing value anyway (loop through the vector all over again), since the ctor default-constructs the type (unless you pass it a value as the second arg, but this doesn't make a difference). – miguel.martin Apr 23 '15 at 13:24
  • @miguel.martin How can I find out whether reserve() has allocated the amount of memory I asked it to allocate, i.e. 34 GB? – Steg Verner Apr 23 '15 at 13:41
  • How can I find out whether the constructor has allocated the amount of memory I asked it to allocate, i.e. 34 GB? – Steg Verner Apr 23 '15 at 13:42
  • @StegVerner I'm pretty sure it would throw an exception if the constructor from my answer fails. You can call 'capacity' to get the number of elements that the vector can store without reallocation. – Dino Apr 23 '15 at 14:13
  • One thing to keep in mind with `std::vector::reserve` is that it might try to allocate more memory than you told it to. If you try to reserve e.g. 34GiB and have 40GiB memory available, it could still fail, because under the hood `std::vector` might try to allocate `48GiB` of memory. Therefore `std::vector` is not the best tool if you are limited by memory. – smerlin Apr 23 '15 at 14:41
1

Update

As Baum mit Augen has highlighted, this post doesn't really answer the question, because on the OP's platform condition 4 doesn't hold (sizeof(std::size_t) is actually 8). However, I leave this post here to highlight an issue that might occur when porting the code.

Original post

One problem that I see is the following. Let's assume (most platforms fulfill these assumptions) that

1) vec.size() returns std::size_t (not guaranteed);

2) the type of a sizeof expression is std::size_t (guaranteed);

3) std::size_t is an unsigned integer type (guaranteed);

4) sizeof(std::size_t) == 4 (not guaranteed);

5) CHAR_BIT == 8 (not guaranteed).

(Recall that CHAR_BIT is the number of bits in a char.)

Therefore, the type of vec.size()*sizeof(int) is std::size_t and its maximum value is 2^(sizeof(std::size_t)*CHAR_BIT) - 1 == 2^32 - 1 < 2^32 < 2^35. Therefore, vec.size()*sizeof(int) is always smaller than 2^35.

Baum mit Augen
Cassio Neri
  • http://stackoverflow.com/questions/29822249/storing-many-elements-in-stdvector-c#comment47771391_29822249 The size of the integer type that `vector::size` returns is 8 on his system. – Baum mit Augen Apr 23 '15 at 12:39
  • @BaummitAugen Fair enough, I hadn't read all the comments (it should probably be made clear in the post). In any case, I believe my answer is still relevant because it highlights issues one may encounter when trying to write portable code. – Cassio Neri Apr 23 '15 at 12:45