1

I want to manage the boolean state (on/off) of up to 10^18 items. What is the most memory and computationally efficient way of doing this in Java? I can't create an array of booleans this large as the primitive type of the size of an array is int, and the same goes for the BitSet class.

E.g:

long numSwitches = Long.MAX_VALUE;
boolean[] switches = new boolean[numSwitches];

gives me a compilation error: Incompatible types, Reguired: int, Found: long

JMM
  • 3,922
  • 6
  • 39
  • 46
  • You can only use `int`s as array size. If you need more, create more arrays or use a dynamic datastructure. – Turing85 Apr 08 '17 at 20:56
  • Possible duplicate of http://stackoverflow.com/questions/3040864/java-sparse-bit-vector (assuming there are not really 10^17 trues) – kennytm Apr 08 '17 at 20:56
  • 10^18 is approx. 2^60 so indeed way too much. A memory mapped file would do; a ByteBuffer or such. – Joop Eggen Apr 08 '17 at 21:01
  • Are there any patterns to the bits, eg long runs? Are they mostly on or off? – Bohemian Apr 08 '17 at 21:17
  • 1
    This looks like an XY problem. What are you really trying to achieve? Because your number of bits represents around 125 million gigabytes. It's clearly not possible to store all that state, unless maybe you're google. But then you wouldn't ask here. – JB Nizet Apr 08 '17 at 21:19
  • @JBNizet I actually did see a Senior Engineer at Google comment on someone else's post earlier today, and it'd be interesting to ask him about it – Jacob G. Apr 08 '17 at 21:20
  • 3
    Moving that amount of data around is [faster by truck than over a network](https://aws.amazon.com/snowmobile/) – Bohemian Apr 08 '17 at 21:25
  • You can only do it implicitly, encoding the bools in some other way than just storing them all. Of course, that means there must be some pattern in them, otherwise it cannot help. ZDDs can efficiently encode many interesting patterns, for example. – harold Apr 09 '17 at 10:38

1 Answers1

3

I honestly don't think something like this is possible without the use of a supercomputer. 10^18 is a very big number, it's actually relatively close to the number of atoms in a single grain of salt! If each atom is 1 bit, then the total memory will be an upwards of 125 petabytes (125,000 terabytes)!

Even if you wish to access a bit sequentially (random access will require you to load it all into memory) via the use of some sort of stream, it would still take a ridiculous amount of time.

Unless only a few bits compared to the total amount will be enabled at a time (a few billion, which would still require gigabytes of memory), I don't see any plausible way to solve this unfortunately. However, it's still very interesting to think about.

Jacob G.
  • 28,856
  • 5
  • 62
  • 116