8

I'm looking to write code that partitions a given set into disjoint subsets. For example, a set of football players and we partition them depending on the team they belong to. I want in the end a list of representatives, i.e. one player from each team.

All football players know all other player on their team -- this is very relevant for the complexity. So, my current idea on how to do this is as follows (where set is currently a LinkedHashSet<T>):

while (!set.isEmpty()) {
    E e = set.iterator().next();
    makeRepresentative(e);
    set.remove(AllPlayersOnSameTeamAs(e));
}

However, it feels weird to build a new iterator in every step of the while-loop. A LinkedHashSet is supposed to have some kind of firstElement() function internally (for its LinkedList behaviour), but for some reason I can't find how to do this. I also tried a foreach loop instead, but that resulted in a java.util.ConcurrentModificationException.

How am I supposed to do this correctly?

user1111929
  • 6,050
  • 9
  • 43
  • 73

2 Answers2

1
while (!set.isEmpty()) {    
    Collection<E> toBeRemoved = new ArrayList<E>();
    E first = set.iterator().next();
    doSomethingWith(e);
    for (E e : set) {
        if (similar(first, e)) toBeRemoved.add(e);
    }
    set.removeAll(toBeRemoved);
}

After reading your edit and understanding better, here's a solution you might like:

Collection<E> processed = new ArrayList<E>();
for (E e1 : set) {
    boolean similar = false;
    for (E e2 : processed) {
        if (similar(e1, e2)) similar = true;
    }
    if (!similar) {
        doSomethingWith(e1);
        processed.add(e1);
    }
}
set.clear();

Note that, without knowing anything more about the definition of "similar", this problem is inherently quadratic. The only way it could be made linear, or sub-quadratic, is if there was a way to hash similar elements to the same key. In that case, you could use the second strategy above, but modify the processed structure and the part that checks for previous similar elements to be more efficient (currently that step is linear in the number of similar groups, which could be linear in total elements).

Additionally, anything that's sub-quadratic will surely use more than constant memory. If you want constant memory, this is about the best you can do (it's definitely quadratic time):

while (!set.isEmpty()) {
    Iterator<E> iter = set.iterator();
    E first = iter.next();
    doSomethingWith(first);
    while (iter.hasNext()) {
        if (similar(first, iter.next())) iter.remove();
    }
}

Note that using iter.remove() fixes the concurrent modification problem you had previously.

Joe K
  • 18,204
  • 2
  • 36
  • 58
  • And that in a while loop? Unless I'm missing something obvious, that has quadratic complexity instead of linear... :/ – user1111929 Sep 12 '12 at 00:21
  • 1
    Why would it be in a while loop? Actually, now that I think about it, I'm not sure I'm totally clear on the original question. Why are you looping until the set is empty? If you're doing that, why not just do set.clear()? – Joe K Sep 12 '12 at 00:25
  • 1
    I edited my post, hopefully it is clear now? Apparently I forgot to mention that I also need to *use* the element e. In fact, my set is partitioned into subsets, and similar is in the same subset. So, I want to select one representative from each subset. – user1111929 Sep 12 '12 at 00:32
  • Yes, thanks, I understand now. I edited with a better answer. – Joe K Sep 12 '12 at 00:44
0

I'd do it in one pass, keeping track of the teams i'd seen.

Set<Team> processedTeams = new HashSet<>();
Set<Players> representatives = new HashSet<>();
for(e:players) {
  Team t = e.getTeam();
  if(processedTeams.contains(t))
    continue;
  processedTeams.add(t);
  representatives.add(e)
}
Michael Anderson
  • 70,661
  • 7
  • 134
  • 187