0

I need help trying to write a function for detecting if there is a dictionary which contains multiple keys. So im stuck since when using name_to_age the dictionary will already be unique so I am not sure how I can apply this check.

for e.g

# the test function should raise an error since there is 2 duplicate keys
name_to_age = {'Sam': 2, 'Sam': 4, 'Alex: 14}
user1741339
  • 507
  • 2
  • 6
  • 11
  • 4
    The title concerns duplicate keys (which are not possible in Python dictionaries), but the question mentions multiple keys. Which do you mean? – Jim Witschey May 13 '15 at 23:48
  • You already know the answer. – Ignacio Vazquez-Abrams May 13 '15 at 23:48
  • 1
    Try to `print name_to_age` and see what happens when you override a key... – Nir Alfasi May 13 '15 at 23:51
  • 1
    At least in Python 3.4.3, in the interactive console, I was able to issue the command `A = { 'a': 1, 'a': 2 }` without raising an error. The result was the dictionary with the second combination only, which I believe replaced the first one. – Fernando Karpinski May 13 '15 at 23:52
  • 3
    @FernandoKarpinski it shouldn't raise an error because overriding an existing key is perfectly valid. As I wrote in the comment above - try to print your dictionary and see how it looks like. – Nir Alfasi May 13 '15 at 23:52
  • The usual way to have a "multidict" in python is just using a normal dict using lists as the values. Some web frameworks implement a custom class for this because it's a common structure for query strings / GET params - e.g. Django's `QueryDict` – wim May 14 '15 at 00:26
  • Updated my answer to allow for the possibility of duplicate keys and provided a function to detect them. – Brent Washburne May 15 '15 at 04:46

1 Answers1

1

The original question is about writing a function that can detect duplicate keys in a dictionary. Strictly speaking, a dictionary cannot contain duplicate keys but it can contain objects with identical values: Multiple identical keys in a Python dict - yes, you can!

The trick is to use objects instead of strings as keys, and to override the hash comparison (see How hash collisions are resolved in Python dictionaries):

class person(object):
    def __init__(self,name):
        self.name = name
    def __eq__(self, other):
        return False


alternate = {person("Andrew") : "Cambridge", person("Barbara") : "Bloomsbury", person("Andrew"): "Corsica"}
print alternate

The comments below debate the technical aspects of whether these are duplicate keys or not, but this is a side issue. Back to the original question, if a dictionary could actually hold duplicate keys, what function would detect them?

To find duplicate keys, you could do this:

def list_duplicates(d):
    seen = set()
    duplicates = set( x.name for x in d if x.name in seen or seen.add(x.name) )
    return list( duplicates )

(adapted from another question: Find and list duplicates in Python list)

Yes, multidict and querydict structures create lists of values for each key, but those don't address the original question's supposition of duplicate keys. This answer creates a scenario where duplicate "keys" are possible and provides a function to detect them.

Community
  • 1
  • 1
Brent Washburne
  • 12,904
  • 4
  • 60
  • 82
  • 1
    Disagree. The keys are the memory location of each object, which is unique. >>> [id(a) for a in alternate] Out[168]: [4489414544, 4489414352, 4489414480] – Alexander May 14 '15 at 00:16
  • If the keys were truly identical, then doing a check with `is` would yield True (e.g. person("Andrew") is person("Andrew") yields False because two separate objects are created, each residing at their own memory location). – Alexander May 14 '15 at 00:29
  • Also you need to maintain a reference to the exact instance used for the key, because `person.__eq__` isn't defined. That makes this approach more or less useless. – wim May 14 '15 at 00:31
  • How about now, with the `__eq__` method? – Brent Washburne May 14 '15 at 06:23
  • You've now violated the contract of `__eq__`, ie that `x.__eq__(y)` must imply `x.__hash__() == y.__hash__()` – Eric May 14 '15 at 12:49
  • OK, now it's back in the contract. – Brent Washburne May 14 '15 at 13:57
  • Updated to use any dictionary key. Don't know about this contract, @Eric, can you show a reference to it? – Brent Washburne May 14 '15 at 14:15
  • See [`__hash__`](https://docs.python.org/2/reference/datamodel.html#object.__hash__): _"The only required property is that objects which compare equal have the same hash value"_. You're inheriting `object.__hash__`, which conflicts with your `__eq__`. I'd argue that you do not want an `__eq__` here at all – Eric May 14 '15 at 16:25
  • That's how it started, in fact. @wim thought otherwise. – Brent Washburne May 14 '15 at 17:20
  • 1
    I don't think otherwise - my point is that to have a useful dict, you need a useful `__eq__` method (i.e. something better than identity). But if you implement `__eq__` you have to take care to implement `__hash__` correctly. There is no way to win with the approach proposed in this answer, if you need a multidict you should use a dict with lists as the values. – wim May 15 '15 at 00:19