
I want to create a huge array of ints, 128 x 18,000,000, which is larger than the max array/vector size of C++ as presented in Is there a max array length limit in C++?.

One conceptual way is to store a pair of ints as a long, where the first 32 bits hold the first integer and the last 32 bits hold the second. The question is how to do that. Since each time I will use both of the integers comprising the long, I need something fast and efficient. How do I take one long and split it into the two ints, and how do I store 2 ints as a long?
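
Something along these lines is what I have in mind (a rough, untested sketch of the packing idea, casting through unsigned so I never shift a negative value):

#include <cstdint>

// Rough sketch: pack the first int into the high 32 bits and the
// second int into the low 32 bits of one 64-bit value.
inline uint64_t pack(int32_t first, int32_t second) {
    return (static_cast<uint64_t>(static_cast<uint32_t>(first)) << 32)
         | static_cast<uint32_t>(second);
}

// And split it back into the two original ints.
inline void unpack(uint64_t both, int32_t &first, int32_t &second) {
    first = static_cast<int32_t>(both >> 32);
    second = static_cast<int32_t>(both & 0xFFFFFFFFu);
}

Is something like this reasonable, or is there a faster way?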

Original allocation:

int nof1=64;
int nof2=18000000;
int *hugeArray;
int size = 2 * nof1 * nof2;
hugeArray = new int[size];

I have 16 GB of RAM and 64-bit Ubuntu 12.04 with gcc, so main memory is not an issue. I also have access to a 32 GB PC with the same OS, so there is really no need to worry about RAM.

Any suggestions would be appreciated. Thanks in advance.

Alexandros
  • Do you really need to keep them all in memory simultaneously? I'm not sure your long-trick will save any space. [Here's](http://stackoverflow.com/a/7027524/645270) how you do it though. – keyser Aug 29 '13 at 18:49
  • Yes I do. The long trick just needs half the array entries, which is below the C++ limit. – Alexandros Aug 29 '13 at 18:52
  • It would be easier to sort two `int` arrays and then merge them in the end (into a `vector` even if you insist). – jxh Aug 29 '13 at 18:53
  • @AlexandrosE. You won't save any memory. There's no per entry additional memory requirement in an array. – agbinfo Aug 29 '13 at 18:54
  • It is not for saving memory. I cannot create this large array, with that kind of entries – Alexandros Aug 29 '13 at 18:55
  • I really think you should think about why you need them all in an array. Other than that, try the long thing. – keyser Aug 29 '13 at 18:56
  • When I am up to 64x18,000,000 having one huge array instead of 64 arrays of 18,000,000 or vice versa is much much faster. – Alexandros Aug 29 '13 at 18:57
  • Did you just try dynamically allocating a regular array, and pass in pointers to `std::sort()`? The standard doesn't place any limitations on the value of `(last - first)`. – jxh Aug 29 '13 at 19:01
  • @jxh: Check again, the standard places similar limits on `(last-first)` as it does on `vector::size_type` – Mooing Duck Aug 29 '13 at 19:06
  • @MooingDuck: So, does *integer type* mean `int`? – jxh Aug 29 '13 at 20:55
  • @jxh: `vector::size_type` is guaranteed to hold any non-negative value of `vector::difference_type`, which is identical to `vector::iterator::difference_type`, which defaults to `std::allocator::difference_type`, which is `ptrdiff_t`, which is "the difference of [two pointers to elements of the same array object]". – Mooing Duck Aug 29 '13 at 21:39
  • @Jxh: rereading the comments, I think I misunderstood the context of your previous comment. I don't think we actually disagree on anything here. – Mooing Duck Aug 29 '13 at 21:44

2 Answers


You do realize that there are limits on the size of an int, right? Namely, on many machines it holds values between -2,147,483,648 and 2,147,483,647 (-2.1B to 2.1B). And 2 * 64 * 18000000 is 2,304,000,000 (2.3B), which is too large: the multiplication overflows int, which is undefined behavior, and on a typical two's-complement machine it silently wraps around to -1,990,967,296 rather than the value you meant. This is the only problem I see with what you are trying to do. To hold that size in memory, you'll have to use a different type. I suggest size_t, which is designed to hold sizes of objects in memory (convenient, eh?), and you'll have to be certain to use a 64-bit build.

Once you fix that, hugeArray = new int[size]; can still fail, depending on the limits of your operating system and hardware. If that happens, you must redesign your program to use less memory, period.
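
If you want to detect that failure yourself rather than letting the exception propagate, something like this works (just a sketch; the nothrow form returns a null pointer instead of throwing std::bad_alloc):

#include <cstddef>
#include <cstdio>
#include <new>

int main() {
    std::size_t size = 2 * std::size_t(64) * 18000000; // 2,304,000,000, fits easily in a 64-bit size_t
    int *hugeArray = new (std::nothrow) int[size];     // null on failure instead of throwing
    if (hugeArray == 0) {
        std::fprintf(stderr, "could not allocate %zu ints\n", size);
        return 1;
    }
    // ... fill, sort, etc. ...
    delete[] hugeArray;
    return 0;
}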

Also, int nof2=18,000,000; is incorrect: commas are not digit separators in C++. As a declaration it won't even compile, and in a plain assignment like nof2 = 18,000,000; the comma operator takes over: 18 is assigned to nof2, and the two octal-zero literals 000 are evaluated and discarded. Don't put commas in numbers when coding C++.

size_t nof1 = 64;
size_t nof2 = 18000000;
size_t size = 2 * nof1 * nof2;
std::vector<int> hugeArray(size);
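
When you access it, do the index arithmetic in size_t as well. A sketch (the helper name and the row-major layout are just my assumptions about how you arrange the 128 x 18,000,000 matrix):

#include <cstddef>
#include <vector>

// Element (row, col) of a conceptual (2 * nof1) x nof2 matrix stored
// row-major in the flat vector. All arithmetic is done in size_t,
// so the computed offset cannot overflow an int.
inline int &element(std::vector<int> &a, std::size_t nof2,
                    std::size_t row, std::size_t col) {
    return a[row * nof2 + col];
}

So, e.g., element(hugeArray, nof2, 127, 17999999) = 42; writes the very last entry.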

To the original question: "128 x 18,000,000, which is larger than the max array/vector size of C++"; that assumption is false. The first answer in the question you linked to reads: "The... limit... is set by the restrictions of the size type used to describe an index in the array". On your machine, that is WAAAAAAY bigger than INT_MAX. That answer never claims there is a limit of INT_MAX. The only limit is size_t, which has the same limits as the hardware; the only time size_t isn't big enough is when the CPU couldn't handle numbers that big.

Mooing Duck
  • Yes. That is why, by using longs, I would need only 64 x 18,000,000 entries, which is smaller than INT_MAX. That is the point of the question. – Alexandros Aug 29 '13 at 19:18
  • @AlexandrosE.: Instead of doing it an insane way, why not do it the sane way that I suggested? – Mooing Duck Aug 29 '13 at 19:19
  • Sorry, I thought arrays had the limit and not that the problem was the data_type of the index. So, I suppose whenever I access the array I must use size_t as the index type. Right? – Alexandros Aug 29 '13 at 19:29
  • It does not work with arrays; I get std::bad_alloc. I do not know about vector. Replacing my array with a vector is an issue. I will see what happens. – Alexandros Aug 29 '13 at 20:03
  • @AlexandrosE.: If you're getting `std::bad_alloc`, and you're compiling as 64 bit mode, then you're hosed. That means the _operating system_ cannot handle that much memory in one block, and there's nothing you can do about it. You'll have to find a way to do this using less memory. Sorry. – Mooing Duck Aug 29 '13 at 20:23
  • Your OS may complain about a single large block -- per C++ rules, this has to be a *contiguous* memory block. Can you split it up into 10 blocks of 1.8 million each, or 18 blocks of 1 million each? IOW, is there a pressing reason the entire array has to fit "as is" in memory? – Jongware Aug 29 '13 at 21:57
  • I assumed he didn't have a large number of other allocations and deallocations going on before this one, but I suppose fragmentation is a potential problem. – Mooing Duck Aug 29 '13 at 22:02
  • On my 32GB PC, it works with arrays as well. So, probably fragmentation might be an issue and I should assign earlier to "reserve" this space. @Mooing Duck. I have already upvoted your answer and now I am accepting it. Thanks bro. – Alexandros Aug 30 '13 at 11:59

As stated in Is there a max array length limit in C++?, you have two limits: whether your hardware has enough memory for the array, and the range of the largest integer type used for the size, which is fine in your case, because even for a 32-bit int, UINT_MAX is larger than 128 * 18,000,000.
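
You can let the compiler check that arithmetic for you; a small sketch, assuming a 32-bit int and C++11 static_assert:

#include <climits>

// 128 * 18,000,000 = 2,304,000,000: too big for a 32-bit signed int,
// but comfortably inside a 32-bit unsigned int.
static_assert(128LL * 18000000LL > INT_MAX, "overflows signed int");
static_assert(128ULL * 18000000ULL <= UINT_MAX, "fits in unsigned int");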

CS Pei
  • I have 16 GB of RAM and a 64-bit OS, so no problem there. – Alexandros Aug 29 '13 at 18:54
  • The three limits of concern are really the maximum size limit of the allocator, the maximum size limit imposed by the compiler, and the effective size limit imposed by the OS on your machine. The lowest of these three is the limit on how much you can allocate. He's probably running into the allocator/compiler limit. Also, INT_MAX is smaller than 128*18000000. – Mooing Duck Aug 29 '13 at 19:10
  • 18,000,000 x 128 x 4 bytes (per int) = 8789 MB. So there is plenty of RAM left on my Linux OS. – Alexandros Aug 29 '13 at 19:13
  • @MooingDuck, you are right, INT_MAX (32bit) is smaller than 128*18000000. but UINT_MAX is large enough. – CS Pei Aug 29 '13 at 19:18