0

I am attempting to create a text file that contains 1 billion randomly-ordered non-repeating numbers. I have created the following code but I run out of memory long before completing (heap full). I am looking for suggestions or code corrections on how I can go about creating this txt file.

private int maxSize = 1000000000;
private int minimum = 1;
try {
        ArrayList<Integer> arrayList = new ArrayList<Integer>();
        for(int i = minimum - 1; i < maxSize; i++){
            arrayList.add( i);
}
numlist.close();


// shuffle 10 times for true mix up

for(int j = 0; j < 10; j++){
            Collections.shuffle(arrayList);
}

BufferedWriter numlist = new BufferedWriter(new FileWriter("randomNumbersNoRepeats.txt"));
for(int i = minimum - 1; i < maxSize; i++){
    System.out.println(i);
    numlist.write(i + ",");
}

numlist.close();
} catch (Exception e) {
     System.out.println("Error in creating writer new bufferWriter"
        + " for randomNumbersNoRepeats.txt");
     }
user2297683
  • 406
  • 1
  • 5
  • 15
  • 3
    There's probably better ways than creating an ArrayList with 1bn entries. How about a random access file; write 1-1bn, and then shuffle it, avoiding loading the whole thing into memory. – Steve Smith Mar 08 '17 at 14:56
  • 2
    With a good random number generator, shuffling one time should be no different from (probably even better than) shuffling 10 times. – kennytm Mar 08 '17 at 15:00
  • Use an `int[]`; reimplement `Collections.shuffle` for `int[]`. – Andy Turner Mar 08 '17 at 15:05
  • @AndyTurner, can’t the asker just use `Collections.shuffle(Arrays.asList(hisOrHerIntArray))`? Provided that the array fits in memory, of course. – Ole V.V. Mar 08 '17 at 15:16
  • @OleV.V. no, because `Arrays.asList(int[])` has type `List`. The point is that `int` is smaller than `Integer`, not that `Integer[]` is smaller than `ArrayList` (because it isn't, appreciably). – Andy Turner Mar 08 '17 at 15:17
  • @kennytm, not better, but you are right, nor poorer either. No point in shuffling more than once. – Ole V.V. Mar 08 '17 at 15:17
  • Of course, @AndyTurner. No way to avoid a different implementation. Only requires a few lines of code, though. – Ole V.V. Mar 08 '17 at 15:19

3 Answers3

3

Declare:

private static final int maxSize = 1_000_000_000;
static int[] array = new int[maxSize];

Fill the array with non-repeating numbers, for example array[i] = i + 1;.

Write a shuffling method that uses Fisher–Yates shuffle algorithm. This does not take many lines of code, so I don’t think I need to do it for you.

Run your program with something like -Xmx10G VM argument. This will make sure that enough heap space for the array is allocated.

Thanks go to Andy Turner for the inspiration. Edit: Andy continues to inspire: To initialize and shuffle the array in one go, you may use the inside-out algorithm described in the same Wikipedia article. In Java:

    Random r = new Random();
    for (int i = 0; i < maxSize; i++) {
        int j = r.nextInt(i + 1);
        if (j != i) {
            array[i] = array[j];
        }
        array[j] = i + 1;
    }
Ole V.V.
  • 81,772
  • 15
  • 137
  • 161
  • 1
    Perhaps worth noting that the "inside out" method, described on that Wikipedia article, would be perfect here. – Andy Turner Mar 08 '17 at 17:26
0

Mabye you could extend the Stack class to create a stack that pops at a random location.

Below is a example I put together:

public class ExtendedStack<E> extends Stack
{

    public static void main(String[] args)
    {
        ExtendedStack stack = new ExtendedStack<Integer>();

        for (int i = 0; i < 10; i++)
        {
            stack.push(i);
        }

        Integer random = (Integer) stack.popRandom();
        System.out.println(random);
    }

    public synchronized E popRandom()
    {
        E obj;
        int len = size();
        int randomLocation = randomLocation(len);

        obj = (E) elementAt(randomLocation);
        removeElementAt(randomLocation);

        return obj;
    }

    private int randomLocation(int len)
    {
        return new Random().nextInt(len + 1);
    }

}

For more information on the stack in Java:

https://docs.oracle.com/javase/7/docs/api/java/util/Stack.html

semywow
  • 66
  • 2
0

One thing to know about ArrayList is that it automatically increases its allocated space by 50% every time it runs out of space when the add() function is called. If you allow it to grow this way, you will end up with a large amount of unused allocated space. In this example, it will run out of space at 839k elements increase by 50% to 1.26 billion elements. That space for an additional 0.26 billion entries is wasted! So you could use about 20% less memory by declaring your array capacity at the time of creation with

ArrayList<Integer> arrayList = new ArrayList<Integer>(1000000000);

Having said that, you still have a billion ints which is going to use something like 3GB of memory. That's a large heap. Check out this thread on how to increase your heap size. Then try only creating arraylist with its declared capacity to make sure you now have enough memory available for its full size.

Lastly you could remove entries from the ArrayList after they are written to free up space as you go

for(int i = minimum - 1; i < maxSize; i++){
System.out.println(arraylist.get(0));
numlist.write(arraylist.get(0) + ",");
arraylist.remove(0);
}

(btw your original code writes to file the iteration variable and not the values stored in the ArrayList.) I don't think buffered writer should use up too much of your memory but this step may be useful if you have additional operations after this piece of code.

Community
  • 1
  • 1
James
  • 1,180
  • 1
  • 7
  • 8