2
    int[][] arr = new int[10000][10000];

    for (int x = 0; x < 10000; x++) {
        for (int y = 0; y < 10000; y++) {
            arr[x][y] = 5;
        }   
    } 

Obviously, I get an OutOfMemoryError, so what would be the best data structure to use to hold that amount of data (It would have to be similar to a 2d array [row x column]). I would also need to search through it and change elements. So what data structure would be the best for this scenario?

EDIT: To Clarify: -All the elements in the array HAVE to be ints. -All the elements in the array will be different values. -i dont have to use a two dimensional array...I was wondering if there is any BETTER data structure to use to store 100 million integers, rather than a two dimensional array, so that it WOULDN'T give me a OutOfMemoryError, because there has to be a better data structure out there with a good space complexity??

Adp
  • 253
  • 3
  • 11
  • 2
    Could you explain a little more about what you're trying to do, and why you need to hold so many integers in such a manner? – Tyler Jan 05 '15 at 02:13
  • 2
    you cant allocate that much memory , you would have to partially store in main memory and rest on hard disk like root of B-tree or BST in main memory and rest of the tree in hard disk, depending upon element to be searched you would move the appropriate block to main memory – advocateofnone Jan 05 '15 at 02:15
  • Sparse array http://stackoverflow.com/questions/12626135/memory-efficient-sparse-array-in-java – gknicker Jan 05 '15 at 02:18
  • @ElliottFrisch I believe it will be 10 billions – PM 77-1 Jan 05 '15 at 02:20
  • 1
    10000 * 10000 = 100 000 000 = 100 million – lockstock Jan 05 '15 at 02:21
  • 1
    Does it have to be ints? Could you use shorts or some other smaller data type? – kirbyquerby Jan 05 '15 at 02:27
  • To Clarify: -All the elements in the array HAVE to be ints. -All the elements in the array will be different values. -i dont HAVE to use a two dimensional array...I was wondering if there is any BETTER data structure to use to store 100 million integers, rather than a two dimensional array?? – Adp Jan 05 '15 at 05:07

3 Answers3

9

For in-memory storage, a 2D primitive array is about as small as you can get.

For the absolute minimum, you could use a 1D primitive array and do the indexing math yourself.

A Java int is 4 bytes long. A hundred million ints is about 400 Mb. With today's machines, you'll likely have sufficient RAM.

You do need to be sure that your JVM has sufficient heap space to hold this. You can use the command-line argument -Xmx to set the maximum amount of heap space - e.g., -Xmx768m sets the maximum heap size to 768 Mb.

Andy Thomas
  • 84,978
  • 11
  • 107
  • 151
2

If you can/want to put it all in memory, just increase your heap with -Xmx<size>

If not, you could use a memory mapped file. You should get decent performance with this approach - if the system has enough memory the file can be completely cached in memory by the OS.

int w = 1000;
int h = 1000;

Path tempFile = Files.createTempFile(null, null);
try (FileChannel fc = (FileChannel)Files.newByteChannel(tempFile, StandardOpenOption.DELETE_ON_CLOSE, StandardOpenOption.READ, StandardOpenOption.WRITE))
{
    MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0L, w * h * Integer.BYTES);
    IntBuffer buf = mbb.asIntBuffer();

    for (int x = 0; x < w; x++)
    {
        for (int y = 0; y < h; y++)
        {
            buf.put((x * w + y), 5);
        }
    }
}

This approach will work fine as long as your size is less than 2 billion elements. If you need more, you can memory map each row separately:

int w = 10000;
int h = 10000;

Path tempFile = Files.createTempFile(null, null);
try (FileChannel fc = (FileChannel)Files.newByteChannel(tempFile, StandardOpenOption.DELETE_ON_CLOSE, StandardOpenOption.READ, StandardOpenOption.WRITE))
{
    //Initialize rows
    IntBuffer[] rows = new IntBuffer[h];
    for (int r = 0; r < h; r++)
    {
        rows[r] = fc.map(FileChannel.MapMode.READ_WRITE, (long)w * r * Integer.BYTES, w * Integer.BYTES).asIntBuffer();
    }

    //Process data
    for (int x = 0; x < w; x++)
    {
        for (int y = 0; y < h; y++)
        {
            rows[y].put(x, 5);
        }
    }
}
prunge
  • 22,460
  • 3
  • 73
  • 80
1

You have two ways:

  • Set -Xmx*** (for example -Xmx512 to declare 0,5 GB, to this array you need 382 MB)
  • Use file with random access, for example:

    public static void main(String[] args) throws IOException {
        BigIntegerArray arr = new BigIntegerArray(10000*10000);
        for (int x = 0; x < 10000; x++) {
            for (int y = 0; y < 10000; y++) {
                arr.set(x*10000 + y, 5);
                //    if( arr.get(x*10000 + y) == 5)
                //        System.out.println(":D");
    
            }   
        }
        arr.dispose(); // If you dispose, you cannot use this.
        arr=null;
    }
    
    public static class BigIntegerArray {
        private RandomAccessFile raf;
        private File f;
        private int size;
        public BigIntegerArray(int size) throws IOException {
            this.size=size;
            try {
                f = File.createTempFile("bigarray", ".tmp");
            } catch (IOException e) {
                f = new File("bigarray"+Long.toString(System.currentTimeMillis(),36)+"_"
                        +Long.toString(System.nanoTime(),36)+".tmp");
            }
            f.deleteOnExit();
            f.createNewFile();
            raf = new RandomAccessFile(f, "rw");
            raf.setLength(size*4);
        }
        public void set(int n, int num) throws IOException {
            if(size<=n)
                throw new ArrayIndexOutOfBoundsException("size ("+size+") <= "+n);
            raf.seek(n*4);
            raf.writeInt(num);
        }
        public int get(int n) throws IOException {
            if(size<=n)
                throw new ArrayIndexOutOfBoundsException("size ("+size+") <= "+n);
            raf.seek(n*4);
            return raf.readInt();
        }
        // Dispose (delete) a file
        public void dispose() {
            f.delete();
        }
    }
    
barwnikk
  • 950
  • 8
  • 14