Removing duplicate interaction pairs in python sets

Question

suppose you have a list of tuples in a python set:

>>> pairs = set( [(0,1),(0,1),(1,0)] )
>>> print pairs
set([(0, 1), (1, 0)])

Obviously, the first two elements are duplicates and according to the definition of a set, "pairs" only holds unique elements.

However, in my particular case the tuple (i,j) defines an interaction pair. Thus, (i,j) and (j,i) are the same. I need an effective way to reduce all duplicate elements. Computation time is crucial for me since the total set could easily contain a number of element as large as 10**6. I expect the following result:

>>> pairs = set( [(0,1),(0,1),(1,0)] )
>>> pairs = remove_duplicate_interactions(pairs)
>>> print pairs
set([0,1]) or set([1,0])

I am thankful for any hint.

Edit:

Someone asked about the context. This is supposed to be used for particle simulations. Due to symmetry conditions, the force of particle i acting on j is the same as the force of j acting on i. The reduction of computation time is thus 50 %.

No ordering in any dimension. set([(i,j),(k,l),..]) = set([ (k,l),(i,j),..]) = set([(j,i),(l,k),...]) ... — Rakulan S., Feb 29 '12 at 16:15

NPE · Answer 1 · 2012-02-29T16:13:26.047

10

How about:

In [4]: pairs = set( [(0,1),(0,1),(1,0),(1,2),(1,0),(2,1)] )

In [5]: set((a,b) if a<=b else (b,a) for a,b in pairs)
Out[5]: set([(0, 1), (1, 2)])

edited Feb 29 '12 at 16:13

answered Feb 29 '12 at 16:05

NPE

486,780
108
951
1,012

3

How is it even possible you people always find single-line solutions ? – Rakulan S. Feb 29 '12 at 16:24
6

@RakulanS.: devise harder problems :) – Fred Foo Feb 29 '12 at 16:31

score 4 · Answer 2 · answered Feb 29 '12 at 16:06

4

>>> pairs = [(0,1),(0,1),(1,0)]
>>> set(tuple(sorted(p)) for p in pairs)
set([(0, 1)])

answered Feb 29 '12 at 16:06

Fred Foo

355,277
75
744
836

Extremely helpful – kta Oct 27 '21 at 09:04

score 3 · Answer 3 · edited May 23 '17 at 12:08

3

I would subtype tuple and always make the pair ordered:

class InteractionPair(tuple):
    def __new__(cls, a, b):
        if a <= b:
            return tuple.__new__(cls, (a, b))
        return tuple.__new__(cls, (b, a))

Example:

>>> set([InteractionPair(0, 1), InteractionPair(1, 0)])
set([(0, 1)])

This can also be applied to a list of standard tuples:

from itertools import starmap
pairs = [(0, 1), (0, 1), (1, 0)]
print set(starmap(InteractionPair, pairs))

Edit: Her are some timings, using a list of 1000000 random pairs:

In [14]: timeit set(starmap(InteractionPair, pairs))
1 loops, best of 3: 742 ms per loop

In [15]: timeit set((a,b) if a<=b else (b,a) for a,b in pairs)
1 loops, best of 3: 258 ms per loop

In [16]: timeit set(tuple(sorted(p)) for p in pairs)
1 loops, best of 3: 1.02 s per loop

As far as performance is concerned, the solution by aix wins.

edited May 23 '17 at 12:08

Community

1
1

answered Feb 29 '12 at 16:07

Sven Marnach

574,206
118
941
841

@larsmans: Not sure if it really is. This approach might sometimes be useful, but you would need to know more about the context to judge if this is really worth it. – Sven Marnach Feb 29 '12 at 16:18
I like object-oriented approaches, so this seems quite elegant to me too. But how about efficiency ? Intuitively, this seems quite computation heavy ?! – Rakulan S. Feb 29 '12 at 16:20
@RakulanS.: Added a few simple timings. It's not that bad. – Sven Marnach Feb 29 '12 at 16:25

alexis · Answer 4 · 2013-04-17T08:09:32.567

2

If you can use frozenset instead of tuples, you can store each interaction pair as a set. An alternative is to store all your tuples in some canonical form (e.g., so that i < j). No idea if this has a speed advantage over frozenset, but the simpler the form, the faster python will be able to hash it and compute the set of unique elements.

edited Apr 17 '13 at 08:09

answered Feb 29 '12 at 16:33

alexis

48,685
16
101
161

score 0 · Answer 5 · edited May 23 '17 at 10:28

0

pairs = set(frozenset(p) for p in ((0,1),(0,1),(1,0)) )

Will achieve what you need. See my answer here for further information: How do I perform deep equality comparisons of two lists of tuples?

If necessary, you can reconstruct these as pairs afterwards.

edited May 23 '17 at 10:28

Community

1
1

answered Feb 29 '12 at 16:06

Marcin

48,559
18
128
201

Removing duplicate interaction pairs in python sets

5 Answers5

Linked