0

I came across this question in cracking the coding interview book.

Question: Given an input file with four billion non-negative integers, provide an algorithm to generate an integer which is not contained in the file. Assume you have 1 GB of memory available for this task.

My question: Why can't we use BitSet instead of Byte[] ? Won't that simplify things ?

Code:

 long numberOflnts = ((long) Integer.MAX_VALUE) + I;
 byte[] bitfield = new byte [(int) (numberOflnts / 8)];
 void findOpenNumberQ throws FileNotFoundException {
 Scanner in = new Scanner(new FileReader("file.txt"));
 while (in.hasNextlntQ) {
 int n = in.nextlnt ();
 /* Finds the corresponding number in the bitfield by using
 * the OR operator to set the nth bit of a byte
 * (e.g., 10 would correspond to the 2nd bit of index 2 in
 * the byte array). */
 bitfield [n / 8] |= 1 « (n % 8);
 }

 for (int i = 0; i < bitfield.length; i++) {
 for (int j = 0; j < 8; j++) {
 /* Retrieves the individual bits of each byte. When 0 bit
 * is found, finds the corresponding value. */
 if ((bitfield[i] & (1 « j)) == 0) {
 System.out.println (i * 8 + j);
 return;
 }
 }
 }
}

Follow up: What if you have only 10 MB of memory? Assume that all the values are distinct.

Community
  • 1
  • 1
Walt
  • 1,426
  • 2
  • 17
  • 30
  • 2
    http://stackoverflow.com/questions/7153659/find-an-integer-not-among-four-billion-given-ones – assylias Jan 14 '15 at 10:36
  • This is a problem from "programming pearls" isnt it? – Mukul Goel Jan 14 '15 at 10:37
  • Why can't we use BitSet instead of Byte[] ? – Walt Jan 14 '15 at 10:38
  • 3
    @Walt `BitSet` is Java specific. The question is not. You're also missing the point of the question. – Kayaman Jan 14 '15 at 10:54
  • Oh ok. So, I think we can very well use BitSet. What do I do with this question now ? Delete it ? – Walt Jan 14 '15 at 11:39
  • With `util.BitSet`, bit indices are limited by 2^31, so it doesn't matter whether a billion is 10^12 or 10^9: assuming `that all the values are distinct`, not all can be used with `util.BitSet`. – greybeard Jan 14 '15 at 12:28

1 Answers1

1

The question does allow for alternative solutions. Java's BitSet can work but there are a couple of hidden traps:

  • The Java VM will need some memory. So you may run out of memory.
  • The BitSet is backed by an array. Java arrays use 32bit signed int as indexes, so you effectively have 2^31 entries. Since each is a 64bit long, that is enough.
  • When bits are added, the set grows. Eventually, the Java code needs to allocate a new array for the new bits. If you're not careful, you can run out of memory in this step. The fix is to create the array with 2^32 bits from the beginning.
Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820