
I'm doing some Project Euler exercises and I've run into a scenario where I want arrays which are larger than 2,147,483,647 elements (the upper limit of int in C#).

Sure, these are large arrays, but for instance, I can't do this:

// fails, size exceeds int.MaxValue
bool[] BigArray = new bool[2147483648];

// also fails, cannot convert uint to int
ArrayList BigArrayList = new ArrayList(2147483648);

So, can I have bigger arrays?

EDIT: It was for a Sieve of Atkin, you know, so I just wanted a really big one :D

Ian G
  • If you are trying to create such an array to solve a Project Euler problem, then I think you've chosen a poor solution strategy for the problem. (Dunno if you can create bigger arrays on x64 though; hopefully someone will give a real answer to your .NET question.) – Brian Feb 21 '09 at 20:34
  • Yes, I know that is the case (re:strategy) but I was just shocked when I hit the limit! – Ian G Feb 21 '09 at 20:36
  • I asked the same question before and could not get a complete answer; I hope you get an answer to overcome this problem: http://stackoverflow.com/questions/494923/numbers-that-exceeds-basic-types-in-c – Canavar Feb 21 '09 at 20:36
  • @ScarletGarden: thanks. annoying isn't it :D – Ian G Feb 21 '09 at 20:39
  • For the record, I did some really small tweaks and the solution comes in around 6 secs; it was problem 187 – Ian G Feb 22 '09 at 00:29
  • The "uint to int" error is because you're initialising with int.MaxValue+1, rather than int.MaxValue. – Zooba Feb 22 '09 at 01:14
  • Why are you trying to use ArrayList? System.Collections.Generic.List is much better in this context, especially since you won't need boxing. – Hosam Aly Mar 10 '09 at 08:59
  • @Hosam, yes thanks, I know that *now* :D – Ian G Mar 10 '09 at 11:53

6 Answers


Anytime you are working with an array this big, you should probably try to find a better solution to the problem. That being said, I'll still attempt to answer your question.

As mentioned in this article, there is a 2GB limit on any single object in .NET, and that holds for x86, x64 and IA64 alike:

As with 32-bit Windows operating systems, there is a 2GB limit on the size of an object you can create while running a 64-bit managed application on a 64-bit Windows operating system.

Also, if you define an array too big on the stack, you will get a stack overflow. If you define the array on the heap, the runtime will try to allocate it all in one big contiguous block. It would be better to use an ArrayList, which grows dynamically on the heap. This will not let you get past the 2GB limit, but will probably let you get closer to it.

The limits will only be looser if you are using an x64 or IA64 architecture and operating system; x64 or IA64 gives you a 64-bit address space to allocate from instead of a 32-bit one.

If you are not able to allocate the array list all at once, you can probably allocate it in parts.
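For example, something like this (a rough sketch of my own; the block size is arbitrary) allocates the storage in fixed-size parts so that no single object comes anywhere near the 2GB cap:

```csharp
using System.Collections.Generic;

// Rough sketch: grow storage one fixed-size block at a time instead of
// asking for one huge contiguous allocation. Block size is arbitrary.
class ChunkedBools
{
    const int BlockSize = 16 * 1024 * 1024; // 16M flags per block
    readonly List<bool[]> _blocks = new List<bool[]>();

    public ChunkedBools(long length)
    {
        long blocks = (length + BlockSize - 1) / BlockSize;
        for (long i = 0; i < blocks; i++)
            _blocks.Add(new bool[BlockSize]);
    }

    // Index with a long, then split it into (block, offset).
    public bool this[long index]
    {
        get { return _blocks[(int)(index / BlockSize)][(int)(index % BlockSize)]; }
        set { _blocks[(int)(index / BlockSize)][(int)(index % BlockSize)] = value; }
    }
}
```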

Using an ArrayList and adding one object at a time on an x64 Windows 2008 machine with 6GB of RAM, the largest size I could grow the ArrayList to was 134,217,728 elements. So I really think you have to find a better solution to your problem that does not use as much memory, perhaps writing to a file instead of using RAM.

Brian R. Bondy
  • but I can't do this: ArrayList BigArrayList = new ArrayList(2147483648); either – Ian G Feb 21 '09 at 20:37
  • "stack overflow": I understand the array being on the stack if it's a local variable, but are you saying that the **contents** of an array are allocated on the stack as well (instead of on the heap)? – ChrisW Feb 21 '09 at 21:05
  • I agree. This would be a heap limitation, not stack. – recursive Feb 21 '09 at 21:07
  • I've allocated 2.5 GB in one block on Linux x86. – Joshua Feb 21 '09 at 21:26
  • ChrisW, I agree, I clarified. – Brian R. Bondy Feb 21 '09 at 21:29
  • Joshua: Linux is not Windows, and it is not .NET. ;) This answer only says that 1) 32-bit Windows imposes a 2GB limit on 32-bit processes by default, and 2) Even in 64-bit, .NET imposes a 2GB limit on any single object. Linux doesn't limit 32-bit processes to 2GB. – jalf Feb 21 '09 at 23:52

The array limit is, as far as I know, fixed as Int32 even on 64-bit, and there is a cap on the maximum size of a single object. However, you can have a nice big jagged array quite easily.

Worse: because references are larger on x64, for reference-type arrays you actually get fewer elements in a single array.
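A jagged array holding more than int.MaxValue elements in total might look like this (a sketch; the row size is arbitrary, and each inner array stays far below the per-object cap):

```csharp
// Sketch: more than int.MaxValue booleans in total, split across rows
// so that each inner array stays far below the single-object limit.
const int rowSize = 100000000;        // 100M elements per row (arbitrary)
long total = 3000000000L;             // 3 billion flags wanted
int rows = (int)((total + rowSize - 1) / rowSize);

bool[][] big = new bool[rows][];
for (int r = 0; r < rows; r++)
    big[r] = new bool[rowSize];

// element i is then addressed as:
// big[(int)(i / rowSize)][(int)(i % rowSize)]
```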

See here:

I’ve received a number of queries as to why the 64-bit version of the 2.0 .Net runtime still has array maximum sizes limited to 2GB. Given that it seems to be a hot topic of late I figured a little background and a discussion of the options to get around this limitation was in order.

First some background; in the 2.0 version of the .Net runtime (CLR) we made a conscious design decision to keep the maximum object size allowed in the GC Heap at 2GB, even on the 64-bit version of the runtime. This is the same as the current 1.1 implementation of the 32-bit CLR, however you would be hard pressed to actually manage to allocate a 2GB object on the 32-bit CLR because the virtual address space is simply too fragmented to realistically find a 2GB hole. Generally people aren’t particularly concerned with creating types that would be >2GB when instantiated (or anywhere close), however since arrays are just a special kind of managed type which are created within the managed heap they also suffer from this limitation.


It should be noted that in .NET 4.5 the object size limit can optionally be removed with the gcAllowVeryLargeObjects flag; however, this doesn't change the maximum number of elements per dimension. The key point is that arrays of a custom type, or multi-dimensional arrays, can now exceed 2GB in memory size.
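To opt in, the flag goes in the application's configuration file (element name as documented on MSDN; it only takes effect on 64-bit):

```xml
<configuration>
  <runtime>
    <!-- Lifts the 2GB per-object limit on 64-bit; the element count
         per dimension is still capped near 2^31. -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```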

Marc Gravell
  • Very interesting, if this is true, I wonder what the justification is for the existence of the Array.LongLength property. – Tamas Czinege Feb 21 '09 at 20:50
  • It is presumably needed to get at elements between 1GB and 2GB (assuming byte[]), since int is signed and they didn't want to use uint due to CLS compliance. – Marc Gravell Feb 21 '09 at 21:11

You don't need an array that large at all.

When your method runs into resource problems, don't just look at how to expand the resources, look at the method also. :)

Here's a class that uses a 3 MB buffer to calculate primes using the sieve of Eratosthenes. The class keeps track of how far you have calculated primes, and when the range needs to be expanded it creates a buffer to test another 3 million numbers.

It keeps the found prime numbers in a list, and when the range is expanded the previous primes are used to rule out numbers in the buffer.

I did some testing, and a buffer around 3 MB is most efficient.

public class Primes {

   private const int _blockSize = 3000000;

   private List<long> _primes;
   private long _next;

   public Primes() {
      _primes = new List<long>() { 2, 3, 5, 7, 11, 13, 17, 19 };
      _next = 23;
   }

   private void Expand() {
      bool[] sieve = new bool[_blockSize];
      foreach (long prime in _primes) {
         for (long i = ((_next + prime - 1L) / prime) * prime - _next;
            i < _blockSize; i += prime) {
            sieve[i] = true;
         }
      }
      for (int i = 0; i < _blockSize; i++) {
         if (!sieve[i]) {
            _primes.Add(_next);
            for (long j = i + _next; j < _blockSize; j += _next) {
               sieve[j] = true;
            }
         }
         _next++;
      }
   }

   public long this[int index] {
      get {
         if (index < 0) throw new IndexOutOfRangeException();
         while (index >= _primes.Count) {
            Expand();
         }
         return _primes[index];
      }
   }

   public bool IsPrime(long number) {
      while (_primes[_primes.Count - 1] < number) {
         Expand();
      }
      return _primes.BinarySearch(number) >= 0;
   }

}
Guffa
  • Efficiency-wise, I think it would be more efficient to have your block size aligned to some power of 2 (e.g. 3 MB == 3*1024*1024), because it would make memory management a little easier for the OS (e.g. because your memory is divided evenly into pages). – Hosam Aly Mar 10 '09 at 08:56
  • Would it not be more efficient to use bit sets instead of boolean arrays? It could save much space at the very least. – Hosam Aly Mar 10 '09 at 10:54
  • @HosamAly: This is not important since we're in managed space. – mafu Mar 19 '12 at 16:31
  • @HosamAly: Saving space is not an issue (a few megs is nothing) but the access cost of bit arrays is pretty huge. Keep it simple really applies here. – mafu Mar 19 '12 at 16:34

I believe that even within a 64 bit CLR, there's a limit of 2GB (or possibly 1GB - I can't remember exactly) per object. That would prevent you from creating a larger array. The fact that Array.CreateInstance only takes Int32 arguments for sizes is suggestive too.

On a broader note, I suspect that if you need arrays that large you should really change how you're approaching the problem.

Jon Skeet
  • nice, I was hoping I'ld get a response from you :D – Ian G Feb 21 '09 at 20:40
  • In one question you need to get primes up to 50 billion, but the effective way is to use the Sieve of Eratosthenes, which forces you to declare an array with such an index: http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes – Canavar Feb 21 '09 at 20:43
  • I would argue that at that point it *isn't* an effective way. – Jon Skeet Feb 21 '09 at 22:57

I'm very much a newbie with C# (i.e. learning it this week), so I'm not sure of the exact details of how ArrayList is implemented. However, I would guess that as you haven't defined a type for the ArrayList example, the array would be allocated as an array of object references. With 2,147,483,648 references at 4 or 8 bytes each, this might well mean you are actually trying to allocate 8-16GB of memory depending on the architecture.

  • Good point, booleans take up 4 bytes in .NET and, so, 2 GB of booleans is 8 GB total. The ArrayList class is implemented as an array internally which re-allocates a new (larger) array as needed to accommodate larger sizes: http://msdn.microsoft.com/en-us/library/system.collections.arraylist.aspx – Mike Rosenblum Feb 22 '09 at 02:22
  • Actually, it uses a lot more than that. In a bool array each bool only uses one byte, but in an ArrayList each bool uses 16 bytes: each reference is 4 bytes, and each object boxing a bool has two internal pointers plus 4 bytes for the bool. So an ArrayList with 2 billion booleans uses 32 GB of memory. – Guffa Feb 22 '09 at 04:27
  • @Guffa - or worse *again* on x64, since references are bigger ;-p – Marc Gravell Feb 22 '09 at 10:03
  • @Mark - correct, I just wanted to keep it simple, and comment space is limited. :) – Guffa Feb 25 '09 at 17:19
  • I've used Java for years, so I thought this might be the case. Nice to have my suspicion clarified. Abstraction layers are very useful, but sometimes we do need to know the implementation details :-) – Jason Waring Feb 26 '09 at 22:31

According to MSDN, an array of bytes cannot be longer than 2,147,483,591 elements. For .NET prior to 4.5 this was also the memory limit for an array. In .NET 4.5 with gcAllowVeryLargeObjects this maximum length is unchanged for single-byte types, but for other element types the maximum element count is 2,146,435,071.

This is the code for illustration:

using System;
using System.Runtime.InteropServices;

static class Program
{
    static void Main()
    {
        // -----------------------------------------------
        // Pre .NET 4.5, or gcAllowVeryLargeObjects unset
        const int twoGig = 2147483591; // max byte[] length; "magic number" from .NET

        var type = typeof(int);          // element type to use
        var size = Marshal.SizeOf(type); // element size in bytes
        var num = twoGig / size;         // max element count for that type

        var arr20 = Array.CreateInstance(type, num);
        var arr21 = new byte[num];

        // -----------------------------------------------
        // .NET 4.5 with x64 and gcAllowVeryLargeObjects set
        var arr451 = new byte[2147483591];
        var arr452 = Array.CreateInstance(typeof(int), 2146435071);
        var arr453 = new byte[2146435071]; // another magic number
    }
}
Anton K