
I read in other answers that there's no limit imposed by the C++ compiler on the maximum size of a std::vector. I am trying to use a vector for one purpose, and need it to hold 10^19 items.

typedef struct{
  unsigned long price, weight;
}product;


//inside main
unsigned long long n = 930033404565174954;
vector<product> psorted(n);

The program breaks on the last statement. If I try resize(n) instead of initializing with n, the program also breaks, with the message:

vector<T> too long
std::length_error at memory location

I need to sort the data according to price after putting it in the vector. What should I do?

Pheonix
  • Why are we using `typedef struct` in C++? Perhaps someone has a lot of unlearning to do? :-) – Kerrek SB Jan 21 '12 at 18:20
  • 930033404565174954 is a lot bigger than 10^9. – Oliver Charlesworth Jan 21 '12 at 18:20
  • Check `sizeof(std::vector::size_type)`. That tells you the maximum size. – Kerrek SB Jan 21 '12 at 18:21
  • @OliCharlesworth sorry, typo, its 10^19.. edited. – Pheonix Jan 21 '12 at 18:22
  • @KerrekSB please point me to some guides, they dont teach much in my school. thanks – Pheonix Jan 21 '12 at 18:26
  • Your question is confusing. Are you asking about how much data you can store in a std::vector (which is what most of your question talks about) or how to sort 10 [Exa](http://en.wikipedia.org/wiki/Exa-)-items worth of data? Also, I'm curious as to where these 10^19 items are being *stored*. That's a lot of hard disks. – Nicol Bolas Jan 21 '12 at 18:28
  • [8 * 10^19 bytes ~= 7 x the estimated information content of all human knowledge](http://www.wolframalpha.com/input/?i=8+%2A+10%5E19+bytes). – R. Martinho Fernandes Jan 21 '12 at 18:30
  • @Pheonix: Do you really have 10^19 unique products? There aren't that many grains of sand on Earth, let alone enough storage on your computer. Tell us what that number means and you may get ideas for a better approach. – Drew Dormann Jan 21 '12 at 18:31
  • @DrewDormann please see my comment in Nicol's answer. thanks – Pheonix Jan 21 '12 at 18:33
  • @KerrekSB thanks Kerrek, i will definitely read it. – Pheonix Jan 21 '12 at 18:33
  • @Pheonix: A quick sanity-check should have told you that either you misunderstood the contest question or trying to brute-force a solution would be impractical. – Blastfurnace Jan 21 '12 at 18:53
  • Is it wrong that I actually laughed out loud when I saw 10^19? – Nic Foster Jan 21 '12 at 21:56
  • I see that one of your questions wasn't answered... "i read in other answers that theres no limit imposed by c++ compiler maximum size of std::vector." But that's right, actually. The _compiler_ doesn't impose limits like that: this example compiles! So the statement is correct; you only run into limits at runtime. – Mr Lister Jun 24 '12 at 11:25

2 Answers


std::vector does have limits on how much stuff it can carry. You can query this with std::vector::max_size, which returns the maximum size you can use.

10^19 items.

Do you have 10^19 * sizeof(product) memory? I'm guessing that you don't have ~138 Exabytes of RAM. Plus, you'd have to be compiling in 64-bit mode to even consider allocating that much. The compiler isn't breaking; your execution is breaking for trying to allocate too much stuff.

Nicol Bolas
  • how should i be sorting the data then ? – Pheonix Jan 21 '12 at 18:21
  • @Pheonix: If you're asking about how to sort this data, then you should explain where you're getting it from, since you clearly have more data than you do RAM. Your question's title is misleading. – Nicol Bolas Jan 21 '12 at 18:23
  • an identity mapreduce in hadoop would do it. Even on a single box this will parallelise the sort as much as possible, potentially making the sort faster. – Tom Hennigan Jan 21 '12 at 18:27
  • "I'm guessing that you don't have ~138 Exabytes of RAM." Well, I am guessing that you don't even have 138EB of disk space. – Jan 21 '12 at 18:28
  • @Nicol : its a contest question so i cannot ask to much in detail... price and weight are generated using a recursive equation of the form Pi = (A*Pi-1 + B) MOD C + 1 ... (i, i-1 are subscript) A, B, C are different for Price and weight. – Pheonix Jan 21 '12 at 18:31
  • @Pheonix: I'm pretty sure they expect you to use the recursive algorithm to figure out what order the things should go in from *just* the algorithm. Instead of writing them out to more RAM than actually *exists* and then trying to sort them with a quicksort. – Nicol Bolas Jan 21 '12 at 18:34
  • @Pheonix: Divide and conquer. – dalle Jan 21 '12 at 18:41

Others have already told you what the problem is. One possible solution is to use the STXXL library, which is an implementation of the STL designed for huge, out-of-memory datasets.

However, 10^19 8-byte items is 80 million TB. I'm not sure anyone has a disk that large...

Also, assuming a generous disk bandwidth of 300MB/s, this would take 8000 years to write!

Oliver Charlesworth