1) How important is using `SparseIntArray` instead of the `HashMap`?
It depends on how you are using it. But unless you are trying to represent many and/or large "arrays" like this, the difference is unlikely to be significant.
Note that Java SE doesn't have any sparse array classes, and it is not normally an issue.
2) Should I go through the trouble of making `SparseIntArray` serializable? (My first idea is making a wrapper class that implements `Serializable`; is this the right way to go?)
See above, and below.
Implementing a wrapper sounds reasonable ... if you need to go to this trouble. Another approach might be to declare a serializable subclass of `SparseIntArray`. It would be advisable to declare custom `readObject` and `writeObject` methods.
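For illustration, a wrapper along these lines could work. Caveat: `android.util.SparseIntArray` only exists on Android, so this sketch uses a hypothetical `MiniSparseIntArray` stand-in (same `size()` / `keyAt()` / `valueAt()` / `put()` method names) purely to keep the example self-contained; on Android you would hold the real class in the `transient` field and stream its entries the same way.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.TreeMap;

public class SerializableSparseIntArray implements Serializable {
    private static final long serialVersionUID = 1L;

    // Transient: default serialization cannot handle the wrapped map class,
    // so the writeObject/readObject methods below stream the entries by hand.
    private transient MiniSparseIntArray map = new MiniSparseIntArray();

    public MiniSparseIntArray map() {
        return map;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        int n = map.size();
        out.writeInt(n);                 // entry count, then the pairs as primitives
        for (int i = 0; i < n; i++) {
            out.writeInt(map.keyAt(i));
            out.writeInt(map.valueAt(i));
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        map = new MiniSparseIntArray();  // rebuild the map from the streamed pairs
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            map.put(in.readInt(), in.readInt());
        }
    }

    /** Hypothetical stand-in mirroring only the SparseIntArray methods used above. */
    public static class MiniSparseIntArray {
        private final TreeMap<Integer, Integer> m = new TreeMap<>();

        public void put(int key, int value) { m.put(key, value); }
        public int get(int key) { return m.getOrDefault(key, 0); }
        public int size() { return m.size(); }
        public int keyAt(int index) {
            int i = 0;
            for (int key : m.keySet()) {
                if (i++ == index) return key;
            }
            throw new IndexOutOfBoundsException();
        }
        public int valueAt(int index) { return m.get(keyAt(index)); }
    }
}
```

A serializable subclass would look much the same, except that the custom methods live on the subclass and delegate to `super` for the map operations.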
The `SparseIntArray` class (source code) uses a pair of `int` arrays to represent the keys and values in the mapping. The keys are kept in sorted order with no "holes", and lookup is performed using binary search. This means the following:
Memory usage for a `SparseIntArray` will be roughly a factor of 10 less than for an equivalent `HashMap`. This is due to a combination of things:
- the hash table array holds roughly one reference per entry (depending on how full the table is ...),
- the keys and values have to be "boxed" as `Integer` objects in a `HashMap`, and
- each entry in a `HashMap` requires a "node" object that is fairly heavyweight: 4 fields in the standard implementation.
(However, if you create the `Integer` objects the right way, the "boxing" overhead can be mitigated by the `Integer` class's instance cache.)
By contrast, the sparse version requires `2 * capacity` four-byte words.
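A back-of-envelope calculation shows where that rough factor of 10 comes from. All of the per-object sizes below are assumptions typical of a 64-bit JVM with compressed references, not measurements; the exact numbers vary by JVM and heap configuration:

```java
public class MemoryEstimate {
    public static void main(String[] args) {
        // Assumed sizes (64-bit JVM, compressed oops) -- assumptions, not measurements:
        int integerObject = 16;        // 12-byte header + 4-byte int, aligned to 16
        int hashMapNode = 32;          // header + hash, key ref, value ref, next ref
        double tableSlots = 4 / 0.75;  // ~1.33 table slots per entry at the default
                                       // load factor, 4 bytes per reference

        // HashMap<Integer, Integer>: boxed key + boxed value + node + table share.
        double perEntryHashMap = 2 * integerObject + hashMapNode + tableSlots;
        // SparseIntArray: one int in the key array + one int in the value array.
        double perEntrySparse = 2 * 4;

        System.out.printf("HashMap ~%.0f bytes/entry, SparseIntArray ~%.0f bytes/entry (~%.1fx)%n",
                perEntryHashMap, perEntrySparse, perEntryHashMap / perEntrySparse);
    }
}
```

Under these assumptions the ratio comes out a little under 10, which matches the "roughly a factor of 10" figure; cached `Integer` instances and unused array capacity pull it in either direction.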
Lookup (i.e. `get`) is `O(logN)` compared with `O(1)` for a `HashMap`.
Random insertion is `O(N)` compared with `O(1)` for a `HashMap`. (This is because an insertion has to move, on average, half of the existing entries so that the new entry can be added at the correct position in the arrays.)
Sequential insertion (i.e. in ascending key order) is `O(1)`.
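The cost model above can be illustrated with a stripped-down sketch of the same two-array idea. This is not the real implementation (the growth policy, deletion, and `get` overloads are all simplified), just the shape of it:

```java
import java.util.Arrays;

public class SparseIntSketch {
    private int[] keys = new int[8];    // sorted, no holes in [0, size)
    private int[] values = new int[8];  // values[i] belongs to keys[i]
    private int size = 0;

    /** O(log N): binary search over the sorted key array. */
    public int get(int key, int valueIfAbsent) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 ? values[i] : valueIfAbsent;
    }

    /**
     * O(N) worst case: later entries are shifted to keep the keys sorted.
     * Inserting in ascending key order always appends (zero-length shift),
     * which is why sequential insertion is O(1).
     */
    public void put(int key, int value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {                    // key already present: overwrite in place
            values[i] = value;
            return;
        }
        int insertAt = ~i;               // binarySearch encodes the insertion point
        if (size == keys.length) {       // grow both arrays together
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        System.arraycopy(keys, insertAt, keys, insertAt + 1, size - insertAt);
        System.arraycopy(values, insertAt, values, insertAt + 1, size - insertAt);
        keys[insertAt] = key;
        values[insertAt] = value;
        size++;
    }

    public int size() {
        return size;
    }
}
```

Compare this with a `HashMap.put`, which drops the entry into a hash bucket without moving any of its neighbours, and the trade-off in the bullet points above falls out directly.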
So "which is best" clearly depends on what you are optimizing for, how you use the data structure, and how big it is going to get.