2

Given the following data structures ,what is the most efficient way to find out the intersection - keys which are common to both the data structures.

dict1 = {'2A':'....','3A':'....','4B':.....}  
list1 = [......,'2A','4B'.....]

Expected output = ['2A','4B']

I am fine to organize the list(not dict1) into any other data-structure if that yields a faster output too. Since this lookup has 2 be done for a large number of dicts - speed is vital.

Eric O. Lebigot
  • 91,433
  • 48
  • 218
  • 260
IUnknown
  • 9,301
  • 15
  • 50
  • 76

2 Answers2

8

As suggested by @Blckknght

>>> dict1.viewkeys() & list1
set(['4B', '2A'])

This has to be the fastest and most efficient way. Note that dict.viewkeys() is dict.keys in Python 3 (don't confuse this with Python 2 where dict.keys() returns a list instead)

jamylak
  • 128,818
  • 30
  • 231
  • 230
  • 2
    It's worth noting that in Python 3, `viewkeys` has been renamed `keys` (and the original `keys` method doesn't exist any more). Furthermore, you can get even more concise using the `&` operator: `dict1.viewkeys() & list1` – Blckknght May 16 '13 at 01:47
  • @Blckknght I assumed that wouldn't work because it doesn't work with normal `set`s, any reason why that behaviour is enabled in this case (eg. it works for non-sets)? – jamylak May 16 '13 at 01:53
  • 1
    The dictionary key view type has an `__and__` (and `__rand__`) operator that accepts any sequence as its second argument. I think sets used to accept arbitrary sequences for those operations too, but apparently they don't any more. I'm not sure if it's a bug or not that the two types don't have the same limitations. – Blckknght May 16 '13 at 02:12
  • Note that `intersection(dict1.viewkeys())` has the same effect as the simpler `intersection(dict1)`, since `intersection()` accepts an iterable, and iterating over a dictionary returns its keys. – Eric O. Lebigot May 16 '13 at 04:09
  • @Blckknght: This is not a bug, this is done to "preclude error-prone construction". Reference: http://docs.python.org/2/library/stdtypes.html#set. – Eric O. Lebigot May 16 '13 at 04:13
  • 1
    @EOL: What I think is somewhat buggy is that the dictionary key view type doesn't precisely replicate the `set` (or `frozenset`) interface. Its API is very similar, but not the same. It would make a lot more sense for it to match `set` by having an `intersection` method that accepts arbitrary sequences, and for its `__and__` operator to reject non-set sequences. – Blckknght May 16 '13 at 04:24
  • @EOL I was under the impression that a `dict_keys` would be an exception since it is *set-like* but upon viewing the source code (http://www6.uniovi.es/python/dev/src/c/html/setobject_8c.html#a82 , http://www6.uniovi.es/python/dev/src/c/html/setobject_8h.html#a3) I see that `.intersection` only has that special optimization for `set` and `frozenset` subtypes and not the set-like `dict_keys` – jamylak May 16 '13 at 05:18
  • @Blckknght: Agreed. I feel the same. :) On the other hand, I like `viewkeys()`'s more flexible behavior… – Eric O. Lebigot May 17 '13 at 08:09
  • Since I am not on Python 3(as yet) ,I had to go with a set(dict1.keys()) & set(list1).Is this optimal? – IUnknown May 18 '13 at 01:48
  • @IUnknown No, that is far from optimal, this is a working Python 2 solution, have you even tried it? – jamylak May 18 '13 at 06:30
  • >>> dict1={'A':'x','B':'y'} >>> dict1.viewkeys() AttributeError: 'dict' object has no attribute 'viewkeys'(Python 2.6.5) – IUnknown May 18 '13 at 16:20
2

Use sets.

>>> set(dict1.keys()) & set(list1)
BenDundee
  • 4,389
  • 3
  • 28
  • 34