11

I want to subtract two ArrayLists so I can have the child that are not in the other list.

I do it this way:

removeIDs=(ArrayList<Integer>) storedIDs.clone();
removeIDs.removeAll(downloadedIDs);

downloadIDs=(ArrayList<Integer>) downloadedIDs.clone();
downloadIDs.removeAll(storedIDs);

The Problem is that both lists contain 5000childs and it takes almost 4 seconds on my androidphone.

Is there a fast way to do this? Is using sets faster?(i dont have duplicate values in the lists)

I develop an android app

Michael Celey
  • 12,645
  • 6
  • 57
  • 62
user1886411
  • 235
  • 2
  • 9

5 Answers5

7

Use HashSet instead of ArrayList unless you need to keep the order.

Removing an element requires a scan of the full List for list implementations, a HashSet by comparison is just the calculation of a hash code and then identification of a target bucket.

Chris Cooper
  • 4,982
  • 1
  • 17
  • 27
1

Sets should be must faster. Right now, it's basically doing an n^2 loop. It loops over every element in removeIDs and checks to see if that id is in downloadedIDs, which requires searching the whole list. If downloadedIDs were stored in something faster for searching, like a HashSet, this would be much faster and become an O(n) instead of O(n^2). There might also be something faster in the Collections API, but I don't know it.

If you need to preserver ordering, you can use a LinkedHashSet instead of a normal HashSet but it will add some memory overheard and a bit of a performance hit for inserting/removing elements.

1

I agree with the HashSet recommendation unless the Integer IDs fit in a relatively small range. In that case, I would benchmark using each of HashSet and BitSet, and actually use whichever is faster for your data in your environment.

Patricia Shanahan
  • 25,849
  • 4
  • 38
  • 75
1

First of all I am giving my apology for the long answer. If I am wrong at any point you are always welcome to correct me. Here I am comparing some options of solving the solution

OPTION 1 < ArrayList >:

In your code you used the ArrayList.removeAll method lets look in to the code of removeAll

the source code of removeAll

public boolean removeAll(Collection<?> c) {
        return batchRemove(c, false);
    }

so need to know what is in batchRemove method. Here it is link. The key part here if you can see

for (; r < size; r++)
         if (c.contains(elementData[r]) == complement)
                 elementData[w++] = elementData[r];

now lets look into the contains method which is just a wrapper of indexOf method. link. In the indexOf method there is a O(n) operation. (noting just a part here)

 for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                    return i;

So over all it is a

O(n^2)

operations in the removeAll

OPTION 2 < HashSet >: previously I wrote something in here but it seems I was wrong at some point so removing this. Better take suggestion from expert about Hashset. I am not sure in your case whether hashmap will be a better solution. So I am proposing another solution

OPTION 3 < My Suggestion You can try>:

step 1: if your data is sorted then no need of this step else sort the list which you will subtract(second list)

step 2: for every element of unsorted list run a binary search in the second list

step 3: if no match found then store in another result list but if match found then dont add

step 4: result list is your final answer

Cost of option 3:

step 1: if not sorted O(nlogn) time

step 2: O(nlogn) time

step 3: O(n) space

**

so overall O(nlogn) time and O(n) space

**

stinepike
  • 54,068
  • 14
  • 92
  • 112
0

If a list is required, you can choose a LinkedList. In your case, as @Chris said, the ArrayList implementation will move all the elements in each removal.

With the LinkedList you would get a much better performance for random adding/removing. See this post.

Community
  • 1
  • 1
Igor Rodriguez
  • 1,196
  • 11
  • 16