25

I'm not too concerned about time efficiency (the operation will be rare), but rather about memory efficiency: Can I grow the array without temporarily having all the values twice?

Is there a more efficient way to grow a large array than creating a new one and copying over all the values? Like, concatenating it with a new one?

What about having fixed-size arrays stored in another array and reallocate / copy that top-level one? Would that leave the actual values in place?

I'm aware of ArrayList, but I need a lot of control about accessing the array and the access needs to be very fast. For instance, I think I prefer a[i] to al.get(i).

The main reason why I care about this is that the array in question (or a number of such arrays) might very well occupy a large enough portion of main memory that the usual strategy of creating a double sized copy before discarding the original might not work out. This may mean that I need to reconsider the overall strategy (or up my hardware recommendations).

Hanno Fietz
  • 30,799
  • 47
  • 148
  • 234

12 Answers12

29

The best way to have a dynamically resizing 'array' or list of items is to use an ArrayList.

Java has already built in very efficient resizing algorithms into that data structure.

But, if you must resize your own array, it is best to use System.arraycopy() or Arrays.copyOf().

Arrays.copyOf() can most simply be used like so:

int[] oldArr;
int newArr = Arrays.copyOf(oldArr, oldArr.length * 2);

This will give you a new array with the same elements as the old array, but now with room to spare.

The Arrays class in general has lots of great methods for dealing with arrays.

Also

It is important to make sure that you aren't just growing your array by one element each time an element is added. It is best to implement some strategy where you only have to resize the array every once in a while. Resizing arrays is a costly operation.

jjnguy
  • 136,852
  • 53
  • 295
  • 323
  • Well, the growing strategy would most likely be the classic "double each time". – Hanno Fietz Sep 15 '09 at 13:55
  • I amended my post to clarify that I'm not concerned about the time it takes to grow the array, but rather about the extra memory that's required by the operation. If this has to be 100% of the array size, I can't grow arrays but have to add arrays. – Hanno Fietz Sep 15 '09 at 14:00
  • +1 for a very good answer to the more sloppy first revision of my question! – Hanno Fietz Sep 15 '09 at 14:10
  • 1
    I'll keep this one here though, it still has some good information. – jjnguy Sep 15 '09 at 15:57
18

Is there a more efficient way to grow a large array than creating a new one and copying over all the values? Like, concatenating it with a new one?

No. And probably there is no language, that guarantees growing an array will always take place without copying. Once you allocate the space for the array and do something else, you most likely have other objects in memory right after the end of the array. At that point, it's fundamentally impossible to grow the array without copying it.

What about having fixed-size arrays stored in another array and reallocate / copy that top-level one? Would that leave the actual values in place?

You mean have an array of arrays and treat it as one large array consisting of a concatenation of the underlying arrays? Yes, that would work (the "faking it by doing indirection" approach), as in Java, Object[][] is simply an array of pointers to Object[] instances.

mikyra
  • 10,077
  • 1
  • 40
  • 41
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • Thanks for commenting on my indirection idea, that's exactly what I was wondering. – Hanno Fietz Sep 15 '09 at 14:03
  • 13
    Really minor side note on the topic of "no language does this"... C will do exactly this--resize an array without copying the data--so long as the realloc'd memory does not fill the entire block and\or no blocks lie beyond the allocated block. With a large block allocator or fastest fit allocator this is not uncommon depending on the size and alignment of your array. With best fit allocators, naturally the memory will be moved to a different block most of the time, however it still eliminates the "double copy" as the memory may be moved a segment at a time if the implementation supports it. – Lisa Mar 11 '11 at 03:33
6

Arrays are constant-size, so there is no way to grow them. You can only copy them, using System.arrayCopy to be efficient.


ArrayList does exactly what you need. It's optimized much better than any of us could do, unless you devote a considerable time to it. It uses internally System.arrayCopy.


Even more, if you have some huge phases where you need the list to grow/reduce, and others where it doesn't grow/reduce and you make thousands of read or write in it. Suppose also you have a huge performance need, that you prooved that ArrayList is too slow when read/writing. You could still use the ArrayList for one huge phase, and convert it to an array for the other. Note this would be effective only if your application phases are huge.

KLE
  • 23,689
  • 4
  • 56
  • 62
4

How about a linked list coupled with an array that holds only references.

The linked list can grow without having to allocate new memory, the array would ensure you have easy access. And every time the array becomes to small, you can simply trash the entire array and build it up again from the linked list.

alt text

Community
  • 1
  • 1
NomeN
  • 17,140
  • 7
  • 32
  • 33
  • Wouldn't that mean you have a permanent copy of all the data in memory? With the memory limitations he talks about, don't think this will help. – Yannick Motton Sep 15 '09 at 14:15
  • 1
    For my case, Yannick is right, but it's still an interesting idea, might be an inspiration for similar problems. – Hanno Fietz Sep 15 '09 at 14:17
  • Well if were to be used only as a backup to rebuild a fixed size array, you might aswell use an ArrayList :-) – Yannick Motton Sep 15 '09 at 14:18
  • @Yannick If you'd use an arraylist, don't you have a full copy in memory again when you increase the size of the ArrayList. – NomeN Sep 15 '09 at 14:24
  • @Yannick (again, 1st comment now) I don't believe I'd have a copy of all data in memory, if I'm not parsing your comment correctly or miss some fundamentals here please set me straight. – NomeN Sep 15 '09 at 14:38
  • @ NomeN You would have a full copy in memory since it increases the size with an ArrayCopy, however having a persistant copy in memory leaves me to believe that memory might not be such a problem after all – Yannick Motton Sep 15 '09 at 14:42
  • what do you mean by 'it' in "since it increases the size with an ArrayCopy". If you mean the linked list, I meant a linked list you'd write yourself. – NomeN Sep 15 '09 at 15:09
  • I mean having a linked list of all the data + an array of the same data would mean you have all the data in memory, twice. At this point memory usage stops being a consideration, and you might aswell just use an ArrayList instead of the linked list. – Yannick Motton Sep 15 '09 at 22:12
  • Aha, but thats the trick, the array does not contain the actual objects. It just keeps a reference to the object which resides in the linked list thus making the array lightweight. – NomeN Sep 15 '09 at 23:03
  • I think you are missing the fact that he has a very large array of primitives. Having a linkedlist of object references to primitives would give you for N elements in the array: N * 8 bytes for the primitives + N * 4 bytes for the object reference in the linked list + N * 4 bytes for the object reference of the next item in the chain + N * 8 bytes for the fixed size array of primitives. Long story short, you would use 3 times the memory of the fixed size array alone. – Yannick Motton Sep 16 '09 at 08:17
  • I have not seen any mention of primitives, but in that case you would be right. Although the array would only be N*4 as it are references, but that's just nitpicking at that point ;-). Thx for your patience with me though! – NomeN Sep 16 '09 at 09:12
  • I am unsure how primitives are handled in java. In .NET referring to ValueTypes would require boxing. Not a good idea in this case I believe. – Yannick Motton Sep 16 '09 at 09:14
  • It seems java does it in a similar way: http://java.sun.com/j2se/1.5.0/docs/guide/language/autoboxing.html – Yannick Motton Sep 16 '09 at 09:18
  • Interesting discussion guys..and interesting idea..but then i didnt get why we couldnt use a linkedList directly in the first place – hakish Aug 08 '12 at 04:07
3

"Can I grow the array without temporarily having all the values twice?"

Even if you copy the arrays, you're only going to have all the values once. Unless you call clone() on your values, they're passed by reference into the new array.

If you already have your values in memory, the only additional memory expense when copying into a new array is allocating the new Object[] array, which doesn't take much memory at all, as it's just a list of pointers to value objects.

Sam Barnum
  • 10,559
  • 3
  • 54
  • 60
  • 1
    It's a large array of primitives. No object references. Hence copying the array does need at least twice the size of the array allocated. – Yannick Motton Sep 15 '09 at 22:17
2

Is the array itself large, or are you referencing large ReferenceTypes?

There is a difference between an array of a PrimitiveType with billions of elements, and an array with thousands of elements, but they refer to large class instances.

int[] largeArrayWithSmallElements = new int[1000000000000];
myClass[] smallArrayWithLargeElements = new myClass[10000];

Edit:

If you have performance considerations using ArrayList, I can assure you it will perform more or less exactly as Array indexing.

And if the application has limited memory resources, you can try to play around with the initial size of the ArrayList (one of it's constructors).

For optimal memory efficiency, you could create a container class with an ArrayList of Arrays.

Something like:

class DynamicList
{
    public long BufferSize;
    public long CurrentIndex;

    ArrayList al = new ArrayList();

    public DynamicList(long bufferSize)
    {
        BufferSize = bufferSize;

        al.add(new long[BufferSize]);
    }

    public void add(long val)
    {
        long[] array;

        int arrayIndex = (int)(CurrentIndex / BufferSize);

        if (arrayIndex > al.size() - 1)
        {
            array = new long[BufferSize];
            al.add(array);
        }
        else
        {
            array = (long[])al.get(arrayIndex);
        }

        array[CurrentIndex % BufferSize] = val;
    }

    public void removeLast()
    {
        CurrentIndex--;
    }

    public long get(long index)
    {
        long[] array;

        int arrayIndex = (int)(index / BufferSize);

        if (arrayIndex < al.size())
        {
            array = (long[])al.get(arrayIndex);
        }
        else
        {
            // throw Exception
        }

        return array[index % BufferSize];
    }
}

(my java is rusty, so please bear with me...)

Yannick Motton
  • 34,761
  • 4
  • 39
  • 55
  • The array itself is large (millions), the element type is `long`. Also, there are several thousand such arrays. – Hanno Fietz Sep 15 '09 at 14:14
  • The difference being that copying the small array with large elements only takes up the space for the object references, not the objects? – Hanno Fietz Sep 15 '09 at 14:20
  • Exactly, hence my question, but since you have an array that takes a lot of memory, I would suggest a different approach when you have memory efficiency in mind. – Yannick Motton Sep 15 '09 at 14:45
1

Have a look at System.arraycopy.

kgiannakakis
  • 103,016
  • 27
  • 158
  • 194
1

AFAIK the only way of growing or reducing an array is doing a System.arraycopy

   /**
    * Removes the element at the specified position in this list.
    * Shifts any subsequent elements to the left (subtracts one from their
    * indices).
    *
    * @param index the index of the element to removed.
    * @return the element that was removed from the list.
    * @throws    IndexOutOfBoundsException if index out of range <tt>(index
    *     &lt; 0 || index &gt;= length)</tt>.
    */
    public static <T> T[] removeArrayIndex(T[] src, int index) {
        Object[] tmp = src.clone();

        int size = tmp.length;
        if ((index < 0) && (index >= size)) {
            throw new ArrayIndexOutOfBoundsException(index);
        }

        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(tmp, index + 1, tmp, index, numMoved);
        }
        tmp[--size] = null; // Let gc do its work

        return (T[]) Arrays.copyOf(tmp, size - 1);
    }

   /**
    * Inserts the element at the specified position in this list.
    * Shifts any subsequent elements to the rigth (adds one to their indices).
    *
    * @param index the index of the element to inserted.
    * @return the element that is inserted in the list.
    * @throws    IndexOutOfBoundsException if index out of range <tt>(index
    *     &lt; 0 || index &gt;= length)</tt>.
    */
    public static <T> T[] insertArrayIndex(T[] src, Object newData, int index) {
        Object[] tmp = null;
        if (src == null) {
            tmp = new Object[index+1];
        } else {
            tmp = new Object[src.length+1];

            int size = tmp.length;
            if ((index < 0) && (index >= size)) {
                throw new ArrayIndexOutOfBoundsException(index);
            }

            System.arraycopy(src, 0, tmp, 0, index);
            System.arraycopy(src, index, tmp, index+1, src.length-index);
        }

        tmp[index] = newData;

        return (T[]) Arrays.copyOf(tmp, tmp.length);
    }    
Carlos Tasada
  • 4,438
  • 1
  • 23
  • 26
  • I should look this up before I make myself look silly (but time is an issue ATM), but don't you need a throws clause in the method declaration for the ArrayIndexOutOfBoundsException, or is that unchecked? – Gazzonyx Sep 15 '09 at 14:28
  • I would need to double check, but this code is copy&paste from an utility class that I've in production (but I'm not sure if this code is used anymore) ;) – Carlos Tasada Sep 15 '09 at 14:34
  • ArrayIndexOutOfBoundsException is definitely unchecked. – hjhill Sep 15 '09 at 18:34
1

Obviously, the important bit here is not if you concatenate the arrays or copy them over; what's more important is your array growing strategy. It's not hard to see that a very good way to grow an array is always doubling its size when it becomes full. This way, you will turn the cost of adding an element to O(1) as the actual growing stage will happen only relatively rarely.

Tamas Czinege
  • 118,853
  • 40
  • 150
  • 176
  • Yes, that's one important bit, but I was aware of that one. My question arises because I have a really large array that may occupy a large enough portion of memory that I can not just create a temporary copy. – Hanno Fietz Sep 15 '09 at 13:45
1

One way of doing this is having a linked list of array nodes. This is somewhat complex but the premise is this:

You have a linked list and each node within the list references an array. This way, your array can grow without ever copying. To grow you only need to add additional nodes at the end. Therefore the 'expensive' grow operation only occurs every M operations where M is the size of each node. Granted, this assumes that you always append to the end and you don't remove.

Insertion and removal in this structure is quite complicated, but if you can avoid them then that's perfect.

The only loss with this structure (ignoring insertion and deletion) is with the gets. The gets will be slightly longer; accessing the correct node requires accessing the correct node within the linked list and then fetching there. If there are a lot of accesses around the middle, this can be slow however there are tricks to speeding linked lists up.

Malaxeur
  • 5,973
  • 1
  • 36
  • 34
1

Have you looked at GNU Trove for highly efficient java collections? Their collections store primatives directly for much better memory usage.

Robert
  • 8,406
  • 9
  • 38
  • 57
0

Heres a benchmark of time taken to add and remove elements from a collection/arraylist/vector

Narayan
  • 6,031
  • 3
  • 41
  • 45