8

How to add two sets and delete duplicates

>>> a = set(['a', 'b', 'c'])
>>> b = set(['c', 'd', 'e'])
>>> c = a + b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'set' and 'set'
>>>

Expected output:
c = set(['a','b','c', 'd', 'e']) 
abc

4 Answers

7

Try this:

>>> a = set(['a', 'b', 'c'])
>>> b = set(['c', 'd', 'e'])
>>> c = a.union(b)

Result:

set(['a','b','c', 'd', 'e'])

Delowar Hosain
3

Use union method

You want to use the union method of a set:

c = a.union(b)

https://docs.python.org/2/library/stdtypes.html#frozenset.union
https://docs.python.org/3/library/stdtypes.html?highlight=sets#frozenset.union

The union method is the same as the | operator, so the line of code above is equivalent to

c = a | b
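One practical difference between the two spellings is worth a minimal sketch: the `union` method accepts any iterable, while the `|` operator requires both operands to be sets. The variable names below follow the question; `from_list` and `operator_accepts_list` are made up for illustration.

```python
a = {'a', 'b', 'c'}
b = {'c', 'd', 'e'}

# Both spellings build a new set and leave a and b unchanged.
c_method = a.union(b)
c_operator = a | b

# union() accepts any iterable, such as a list...
from_list = a.union(['c', 'd', 'e'])

# ...whereas | raises TypeError unless both operands are sets.
try:
    a | ['c', 'd', 'e']
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
```

So `a.union(some_list)` saves a conversion step when the second operand is not already a set.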

Use in-place operator

If you have no need to retain a or b, it would be better to use the update method, which will add the new members in place. That is,

a.update(b)

will produce the union in the existing data structure a. This is also performed by the equivalent code

a |= b
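As a small sketch of what "in place" means here: `a |= b` mutates the existing set object, so any other name bound to that object sees the change, whereas `x = x | y` builds a new set and rebinds the name. The names `alias`, `x`, and `alias_x` are hypothetical.

```python
a = {'a', 'b', 'c'}
b = {'c', 'd', 'e'}
alias = a            # second name for the same set object

a |= b               # same effect as a.update(b)
in_place_same_object = alias is a   # still the same object, now enlarged

x = {'a'}
alias_x = x
x = x | {'z'}        # builds a new set and rebinds x; alias_x is untouched
```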

Sidenote: use set literal

In the code you provide, it would be better to use the set literal notation, {element, element, ...}:

a = {'a', 'b', 'c'}

because it typically executes about twice as fast and does not build a throwaway list object.
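The speed claim can be checked on your own interpreter with the standard `timeit` module. This is only a rough sketch: the repetition count is arbitrary, and the measured ratio will vary by Python version and machine.

```python
import timeit

# Time building the same three-element set both ways.
literal_time = timeit.timeit("{'a', 'b', 'c'}", number=100_000)
from_list_time = timeit.timeit("set(['a', 'b', 'c'])", number=100_000)

print(f"set literal: {literal_time:.4f}s")
print(f"set([...]):  {from_list_time:.4f}s")

# Either way, the resulting sets compare equal.
same_result = {'a', 'b', 'c'} == set(['a', 'b', 'c'])
```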

Bennett Brown
1

You can get the union of both sets using the logical or operator |

a = set(['a', 'b', 'c'])
b = set(['c', 'd', 'e'])
c = a | b

print(c)

{'e', 'a', 'b', 'd', 'c'}

If you want the result ordered and as a list:

c = sorted(c)
print(c)

['a', 'b', 'c', 'd', 'e']

JahKnows
  • The `|` operator is not the logical OR operator in this case. The operands would have to be Boolean for that operator to be applied. It is the same character, of course. – Bennett Brown Mar 12 '18 at 04:09
  • `set.union` is a distinct Python function from `bool.__or__`. They call different code, accept different arguments, and have different behavior. Only the latter, for example, uses shortcut evaluation. – Bennett Brown Mar 12 '18 at 04:31
  • You're right. Are you compiling everyone's answers into yours? – JahKnows Mar 12 '18 at 04:34
  • No, though I did add a section for in-place and a section advising use of the literal. I don't think anyone else had mentioned those. Is there something from someone's answer I should put in the answer I gave? And is compiling other's work frowned upon or encouraged in SO? – Bennett Brown Mar 12 '18 at 04:36
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/166651/discussion-between-bennett-brown-and-jahknows). – Bennett Brown Mar 12 '18 at 04:48
  • @BennettBrown `bool.__or__` does not short-circuit. The `or` operator is not overloadable, and the `|` operator (which is overloaded by `__or__`) is not short-circuiting. – abarnert Mar 12 '18 at 04:49
  • @abarnert, Thanks, you are right! Testing it, I see that the logical operator `or` short-circuits but that the bitwise operator `|` and the equivalent `__or__` do not short-circuit. Is there a way to refer to the code called by `or`? – Bennett Brown Mar 12 '18 at 05:06
  • @BennettBrown: [Boolean operations](https://docs.python.org/3/reference/expressions.html#boolean-operations) defines `or`. It doesn't explain that you can't overload it, but [Emulating numeric types](https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types) does explain that `__or__` is for `|`. See [PEP 335](https://www.python.org/dev/peps/pep-0335/) and the linked discussions for suggestions to change this up to 2011, when it was finally (after many years) rejected, and [PEP 532](https://www.python.org/dev/peps/pep-0532/) for a newer deferred-but-still-live proposal. – abarnert Mar 12 '18 at 05:17
  • @BennettBrown If you want more detail, this could make for a good question, where an answer could go into the reasons why it isn't trivial, the history, the design reasoning involved, the idiomatic way to simulate `or` in the C API, etc. By the way, you may have guessed this, but if you need a function that does `or`, you can't use `operator.or_`, because that's `|` again, so you need something like `lambda x, y: x or y`. – abarnert Mar 12 '18 at 05:20
  • @abarnert, thanks, useful links! Based on https://stackoverflow.com/questions/8608587/finding-the-source-code-for-built-in-python-functions I found this for `or`: https://github.com/python/cpython/blob/master/Objects/boolobject.c – Bennett Brown Mar 12 '18 at 05:20
  • @BennettBrown: That's not `or`, that's `bool.__or__`, aka `|`. Notice that it converts to the integer 0 or 1 and does a C bitwise or `|`, not a C boolean or `||` (or just calls the `nb_or` slot, which is the C API way of calling `__or__`). – abarnert Mar 12 '18 at 05:22
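The distinction drawn in the comments above can be checked with a short sketch. The helpers `record` and `logical_or` are hypothetical names used only for illustration; `operator.or_` is the standard-library function corresponding to `|`.

```python
import operator

calls = []

def record(value):
    """Remember that this operand was evaluated, then return it."""
    calls.append(value)
    return value

# `or` short-circuits: the right operand is never evaluated.
result_or = record(True) or record(False)
calls_or = list(calls)

# `|` (that is, __or__) always evaluates both operands.
calls.clear()
result_pipe = record(True) | record(False)
calls_pipe = list(calls)

# operator.or_ is `|`, so on sets it computes the union.
union_via_operator = operator.or_({'a'}, {'b'})

# A function with the value semantics of `or` has to be written by hand.
def logical_or(x, y):
    return x or y

first_truthy = logical_or({'a'}, {'b'})
```

Note that `logical_or` returns the first truthy argument as `or` does, but both arguments are already evaluated by the time the function is called.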
-2
c = set(set(list(a) + list(b)))

set() returns <str set>, so you can't add two sets together

https://docs.python.org/2/library/sets.html

Binh ED
  • What is `<str set>` supposed to mean, and why should it mean you can't add two sets together? Also, why would you convert the result to a set and then convert it to a set again? – abarnert Mar 12 '18 at 03:57
  • You can try `print type(set(['a', 'b', 'c']))` in Python 2.7. – Binh ED Mar 12 '18 at 03:59
  • Have you actually tried what you suggested? Because it gives `<type 'set'>`, not `<str set>`. And of course `set()` doesn't return the same thing as `type(set())`; the set type is a type, not a set. And meanwhile, why would the name of the type control which operators it supports? It's the dunder methods (or type slots, for builtins) that control that. – abarnert Mar 12 '18 at 04:02