Why does my code take different values when i switch the order in a set (knowing that order doesn't matter with sets)

Question

I'm trying to implement a method that returns the edges of a graph, represented by an adjacency list/dictionary.

So to iterate through the dictionary, first I iterated through the keys, then through every value stored in the corresponding key. Inside the nested for-loop, I had a condition where, if a particular edge, say (a,b) is not in the set of edges, then add it to the set -- pass otherwise. On my first run, the method took in edges that are the same -- that is, in the set of edges, there are (a,b) and (b,a).

class Graph():
    def __init__(self, grph={}):
        self.graph = grph

    def get_vertices(self):
        for keys in self.graph:
            yield keys

    def get_edges(self):
        edges = set()
        for key in self.graph:
            for adj_node in self.graph[key]:
                if (key, adj_node) not in edges:
                    edge = (key, adj_node)
                    edges.add(edge)
                else:
                    pass
        return edges


def main():
    graph1 = {
        'A': ['B','C','D'],
        'B': ['A','E'],
        'C': ['A', 'D'],
        'D': ['A', 'C'],
        'E': ['B'],
    }
    graph_one = Graph(graph1)
    print(list(graph_one.get_vertices()))
    print(graph_one.get_edges())

if __name__ =='__main__':
    main()

the output is:

{('A','B'),('D','A'),('B','A'),('B','E'),('A','D'),('D','C'),('E','B'),('C','D'),('A','C'),('C','A')}

So what I did was that, I just changed the if statement:

"if (adj_node, key) not in edges:"

def get_edges(self):
        edges = set()
        for key in self.graph:
            for adj_node in self.graph[key]:
                if (adj_node, key) not in edges:
                    edge = (key, adj_node)
                    edges.add(edge)
                else:
                    pass
        return edges

Now the output was:

{('C','D'),('A','B'),('E','B'),('A','C'),('A','D')}

Im very curious as to why it is so, and I'd be so thankful if you guys could explain it to me. Thanks in advance!

you mean to `sort` the data or use an helper set with sorted data to avoid insertion. What's the type of key and adj_node? — Jean-François Fabre, Jan 13 '19 at 13:10
The check is irrelevant anyway because the set can only contain one of each value, so you can add the same value multiple times and you'll still only have one instance at the end in your set. — roganjosh, Jan 13 '19 at 13:11
sorry, i forgot to say (or emphasize, I suppose) that I'm not asking how to remove the duplicates! I was just wondering why it ignored the duplicates when I changed the condition on the if-statement — Francis Magnusson, Jan 13 '19 at 13:19
That's fine, my comment was somewhat tangential to what you're asking, but I still think it's valid. It's very possible I'm not seeing the full picture but I can't see the purpose of the check at all. — roganjosh, Jan 13 '19 at 13:20
So simple explaination is that `set` will only holds one instance of each value, and you used `if` to check opposite direction's tuple, so in effect both directions' duplicates are avoided. — Til, Jan 13 '19 at 13:47

score 10 · Accepted Answer · answered Jan 13 '19 at 13:21

10

When we say that sets have no order or that order doesn't matter, it means that {x, y} == {y, x}. But (a, b) and (b, a) are tuples, order matters for them, so (a, b) != (b, a) and therefore {(a, b), (b, a)} is a set with two distinct elements, although it's equal to {(b, a), (a, b)}.

When your code looks like this:

        if (adj_node, key) not in edges:
            edge = (key, adj_node)
            edges.add(edge)

then when the edge a <-> b is first encountered, it's as (key, adj_node) == (a, b) and is added to the set. When it's encountered the second (and only other) time, it's as (key, adj_node) == (b, a), meaning (adj_node, key) == (a, b) which is already in the set so (adj_node, key) not in edges is false and (b, a) doesn't get added to the set.

answered Jan 13 '19 at 13:21

Alex Hall

34,833
5
57
89

1

Looks like i got tuples and sets mixed up! thanks for pointing that out! It took me quite a while to understand your explanation on the if statement, but makes sense to me now; when it checks at "if (adj_node, key) not in edges" it still takes into account the tuple stored previously by the "edge = (key, adj_node)" statement, which is basically the same. Thanks! – Francis Magnusson Jan 13 '19 at 13:52
Equality testing and ordering are not quite the same- for example in Python 3.7 dicts retain insertion order but `{1:1,2:2} == {2:2,1:1}` remains `True` (not the case for `OrderedDict` though https://stackoverflow.com/questions/50872498/will-ordereddict-become-redundant-in-python-3-7) – Chris_Rands Jan 25 '19 at 16:25

Til · Answer 2 · 2019-01-14T03:53:33.053

3

I think it just needs a little change, try this:

def get_edges(self):
    edges = set()
    for key in self.graph:
        for adj_node in self.graph[key]:
            if ((key, adj_node) not in edges) and ((adj_node, key) not in edges):
                edge = (key, adj_node)
                edges.add(edge)
            else:
                pass
    return edges

Update:
So it is an Undigraph.
And it's me overcomplicated this.
And your way is actually better than my checking both ways.

The reason your code succeed, is that set will only contain one instance of any value.
So each time do the add, if there's already same tuple exists, it simply won't change the set.
And you already used the if to check the tuple of opposite direction, so it won't create duplicate edges.

For example, when (a, b) hits the if checking, it will check (b,a) exists in the set or not, if exists, then pass. If not, add (a, b) in the set, if (a, b) exists, the set won't change since only one instace will be in the set.
And later when looped to (b, a), since (a, b) already in the set, the if will be false and passed.
So by this way, the set is safe, free of duplicate edges.

edited Jan 14 '19 at 03:53

answered Jan 13 '19 at 13:14

Til

5,150
13
26
34

1

I still don't get the point of the check because the set will only contain one instance of any value – roganjosh Jan 13 '19 at 13:17
So just add them to the set. What is the `if` check preventing? "If it's not in the set, add it" is no different, from what I can see, to "add it to the set and let the set mechanics mean I only have one instance" – roganjosh Jan 13 '19 at 13:21
Or, just add both? The question has 3 upvotes so I'm guessing it's me that's missing something here, but I genuinely don't see it :) – roganjosh Jan 13 '19 at 13:26
@roganjosh That's exactly what OP wants to avoid, what's this question about, I think. – Til Jan 13 '19 at 13:27
@FrancisMagnusson It's me misunderstood your question, my apologies too :) – Til Jan 13 '19 at 13:55
1

@FrancisMagnusson that's not something to be sorry about. A big, and interesting, part of SO is helping people align their thinking :) – roganjosh Jan 13 '19 at 13:55
I'd like to say how surprised I am on how fast people here on stackoverflow responded to my question! didn't quite expected this, thank u very much – Francis Magnusson Jan 13 '19 at 13:57
looks like ill be in stackoverflow for quite a while :))) – Francis Magnusson Jan 13 '19 at 13:57
@FrancisMagnusson I'll still keep removing that fluffy opening statement on your question, though :P As long as the question is sufficiently presented, there's no need to state that you're new to the site. That has no value if someone comes across the question in a year's time – roganjosh Jan 13 '19 at 14:01
@FrancisMagnusson Folks here love to see a well formatted, own-efforts showing, non-obvious-duplicate question :) – Til Jan 13 '19 at 14:02

Why does my code take different values when i switch the order in a set (knowing that order doesn't matter with sets)

2 Answers2