6

I need a Java data structure that I can efficiently add, delete, and access a random object.

This is what doesn't work:

ArrayList has efficient add (constant time), and random access (just "get" with a random integer), but deletes can take linear time because it has to potentially search the whole list for it.

TreeSet or HashSet has efficient add and delete, but I can't figure out how to get a random object.

Any ideas?

In theory, a B Tree would work, if I could traverse the tree myself with random Lefts or Rights, but I don't think a standard Java class gives me this ability.

I'm willing to use a third party library if nothing in the standard Java classes will work.

I do not need to support duplicates or nulls, nor does it need to be thread safe.

Thanks.

Magmatic
  • 1,754
  • 3
  • 19
  • 34
  • `ArrayList.remove(int index)` runs in amortized constant time. As far as I can tell, this means individual calls are linear time, but the average time over a series of calls approaches constant time. – Aarowaim May 05 '14 at 02:06
  • Thanks for that Aarowaim, but I wouldn't know the index of the object to remove. I suppose I could store that information separately in a HashMap or something, but as soon as I removed an object from the ArrayList, the other objects in the list would change indexes, which would mean some of the values in the HashMap would be wrong, and it would take me linear time to fix them. – Magmatic May 05 '14 at 02:20
  • @Magmatic Do you need to allow duplicate elements? – Boann May 05 '14 at 02:42
  • I do not need to support duplicates or nulls, nor does it need to be thread safe. – Magmatic May 05 '14 at 02:47
  • 1
    How many elements are you talking about? Hundreds? Thousands? Millions? Are they simple numbers, strings, or more complex objects? – Hot Licks May 05 '14 at 03:16
  • The objects are home-made classes. The number of elements is irrelevant, because efficient means efficient. I wanted something at least as efficient as O(log(n)) or O(1). O(n) is not good enough. – Magmatic May 07 '14 at 19:44
  • About arraylist: you can take a look to GapList: http://stackoverflow.com/a/24140749/3315914 – rpax Jun 11 '14 at 07:31

3 Answers3

9

You can get this with an ArrayList/HashMap pair (where the map will store the index of objects in the list) if you make list removal unordered. On removal, instead of shifting all subsequent elements in the list down one position, you just move the last element to fill the space of the removed item. Then there are at most two different elements that need to be touched during removal. A quick example (which I haven't tested and hope I didn't bungle):

class SpecialSet<E> extends AbstractSet<E> implements Set<E> {
    private final List<E> list = new ArrayList<>();
    private final Map<E,Integer> indexMap = new HashMap<>();

    @Override
    public boolean add(E e) {
        if (contains(e)) return false;
        indexMap.put(e, list.size());
        list.add(e);
        return true;
    }

    @Override
    public boolean remove(Object o) {
        Integer indexBoxed = indexMap.remove(o);
        if (indexBoxed == null) return false;
        int index = indexBoxed;
        int last = list.size() - 1;
        E element = list.remove(last);
        if (index != last) {
            indexMap.put(element, index);
            list.set(index, element);
        }
        return true;
    }

    public E removeRandom() {
        E element = list.get((int)(Math.random() * size()));
        remove(element);
        return element;
    }

    @Override
    public boolean contains(Object o) {
        return indexMap.containsKey(o);
    }

    @Override
    public Iterator<E> iterator() {
        return list.iterator();
    }

    @Override
    public int size() {
        return list.size();
    }
}

Depending on what object equality testing behavior you want, you maybe could swap out the HashMap for an IdentityHashMap for a little better speed.

Boann
  • 48,794
  • 16
  • 117
  • 146
  • Wow, thanks for the code! I just threw it into my project. (I only changed the "removeRandom" to "getRandom" and took out the line that removes it). It worked perfectly! I had been using an ArrayList with the inefficient remove, but when I changed to this code, the processing time dropped from 1900 ms to 628 ms. Wow, wow, wow! Thank you, thank you, thank you! – Magmatic May 05 '14 at 03:33
  • I don't see the importance of `HashMap` here. The most important trick here is: removal is not removing the element directly, it is removing the last element, and put the last element at the position that need to be removed. In such case, just a single `ArrayList` as internal structure will help – Adrian Shum May 05 '14 at 04:47
  • @AdrianShum You have to find an element before you can remove it; that's the part that the HashMap makes fast. – Boann May 05 '14 at 05:02
  • @Boann Oops, you are right, I overlooked the part of delete (I was thinking OP was asking for remove by index) – Adrian Shum May 05 '14 at 06:09
1

I'd propose that you use a Map, with an Integer key. That way you can generate a random number for the remove.

Problem left for the reader: on subsequent removes (or any operation) there will be gaps.

user949300
  • 15,364
  • 7
  • 35
  • 66
  • But how can you remove a value if you don't know its key? You would still need to do an inefficient linear search to find it. – Magmatic May 05 '14 at 02:45
1

Note that a simple array will do the job, if you pre-allocate to the max size you'll need. Keep a "top" index value and insert by incrementing "top" and storing into that array element. When you delete, copy the "top" value down into the deleted element and decrement "top".

If you don't know your max size you can still allocate an array of arrays and use the same basic algorithm, only you need to first compute the major and minor array index values before accessing an element.

Hot Licks
  • 47,103
  • 17
  • 93
  • 151