0

So I'm working on project for my algorithms class that I'm currently taking. I'm doing some research online and see that some people are using an ArrayList<Integer> and some people are using an int array[]. My question is, what's better to use for a min heap and why. The project requires me to keep the top 10000 largest numbers in the list from a very large list of numbers

kevorski
  • 816
  • 1
  • 11
  • 29

2 Answers2

2

If you know the array size at compile time, using a bare int[] array is faster. Of course, the performance difference is probably negligible -- but the idea is that ArrayList is internally implemented as an Object[] array, so you're saving yourself that overhead, plus the overhead of dealing with Integer vs int.

Community
  • 1
  • 1
Patrick Collins
  • 10,306
  • 5
  • 30
  • 69
  • Thanks for that. That's what I thought, since I know the size of the array needed it would be better to use a bare array. I was noticing that the array list was used for increasing the number of elements – kevorski Jul 30 '14 at 22:21
  • Yeah, I think the primary benefit is the `add` method that gives you a dynamically-resizing array (which you could do yourself, but which is a lot of boilerplate to write). If you know your max size, and you don't need any of the other `ArrayList` convenience methods, I don't think there is a compelling reason to use one. Also, please accept this answer if it solved your problem. – Patrick Collins Jul 30 '14 at 22:24
  • 1
    I'm not sure **at all* that an ArrayList wraps an int[]. It most like wraps an Object[] in which ints are autoboxed. Which implies 1) more memory consuptions (you'll have the `int`s AND the pointer to them), and 2) an autoboxing performance penalty. Granted, both may be negligeable at small sizes. – GPI Jul 30 '14 at 22:24
  • @PatrickCollins, not to be pedantic, but it's no Integer[] either ! It could not work with Generics if that was the case... It will create an Object[] which contains Integer instances, which is not the same as Integer[] (although memory wise, it is, it is practically different) – GPI Jul 30 '14 at 22:29
  • @GPI Won't these be initialized as `Integer[]` elements, and then possibly upcast to `Object[]`? Certainly they need to be initialized as `Integer`s if they're going to be `Integer`s later on, no? – Patrick Collins Jul 30 '14 at 22:34
  • Well : generics are erased at run time in java. The only thing that ties `new ArrayList()` to an `Integer` is... a generic type. That does not exist at runtime. So the JVM has NO WAY of knowing it has to build a Integer[]. If you look at the code, array list has a `private transient Object[] elementData;` as its container, and that's it. When you call add(int), int is boxed to Integer, and you have elementData[position++] = yourInteger. (with other checks before) – GPI Jul 30 '14 at 22:38
  • Okay. I'm still not quite solid on my understanding of Java generics and so I'll avoid muddling it up any more, but if you'd like, feel free to edit my answer to more accurately reflect what's going on here. – Patrick Collins Jul 30 '14 at 22:41
1

An int[] will consume less memory than an ArrayList<Integer>. Part of that is simply the overhead which is added from an Integer which adds ~16 bytes per instance. This video goes through memory impact of various objects and collections in 32bit and 64bit jvms. At about the 9:30 mark it talks about memory associated with each object. At about the 11:15 mark it talks about how much memory various types (including Object references) take. For an int[], you have 1 Object (the int[]) and it will actually contain all of the individual int values as contiguous memory. For an ArrayList<Integer>, you have the ArrayList object, the Object[] object and all of the Integer objects. Additionally, the Object[] doesn't actually contain the Integer objects in contiguous memory, rather it contains object references in contiguous memory. The Integer objects themselves are elsewhere on the heap.

So the end result is that an ArrayList<Integer> requires ~6x the amount of memory as an int[]. The backing Object[] and the int[] take the same amount of memory (~40,000 bytes). The 10k Integer objects take ~20 bytes each for a total of 200,000 bytes. So the ArrayList will be a minimum of 240,000 bytes compared to the int[] at approximately 40,000 bytes.

Brett Okken
  • 6,210
  • 1
  • 19
  • 25