I have a dictionary where the keys are unique IDs and the values are lists of one or more URLs. Some URLs have a scheme and some don't, so I want to "enrich" the scheme-less URLs with http://.

I'm doing this by (1) creating a new dict (hdct) with empty lists and the same keys; (2) iterating through the k,v pairs in the original dict (dct); (3) checking for a scheme; and (4) either appending the schemed url to the new dict or adding the scheme and appending afterward:

    hdct = dict.fromkeys(dct, [])
    for id, lst in dct.items():
        for url in lst:
            if 'http' in url is True:
                hdct[id].append(url)
            else:
                new = 'http://' + url
                hdct[id].append(new)

When I do this, my keys and values are somehow getting cross-pollinated:

dct - {'31212': ['websitea', 'websiteb'], '17759': ['websitec']}
hdct - {'31212': ['websitea', 'websiteb', 'websitec'], '17759': ['websitea', 'websiteb', 'websitec']}

I don't code for a living but have done much more complicated stuff than this, and it's driving me nuts. Any help would be appreciated. I have a feeling I'm going to feel very dumb when someone posts the answer. Thanks!

    `dict.fromkeys()` does not work the way you expect: the `[]` is actually the same list object for all the keys. See: https://stackoverflow.com/questions/11509721/how-do-i-initialize-a-dictionary-of-empty-lists-in-python – ocean moist Jun 10 '23 at 22:54

1 Answer

Only a single list is created here:

dict.fromkeys(dct, [])

And then that one list is made the value for all keys. When you append to one, you append to all, because they're all the same list.
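The sharing can be checked directly. The following is a minimal sketch using the sample data from the question; the names `shared`, `same_object`, and `leaked` are only illustrative:

```python
# Sample data taken from the question.
dct = {'31212': ['websitea', 'websiteb'], '17759': ['websitec']}

shared = dict.fromkeys(dct, [])

# Every key refers to the very same list object.
same_object = shared['31212'] is shared['17759']

# Appending through one key...
shared['31212'].append('x')

# ...is visible through the other key as well.
leaked = shared['17759']

print(same_object, leaked)
```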

An easy way to get around this is just a comprehension:

hdct = {key: [] for key in dct}

This works because the value expression in the comprehension is evaluated once per key, so a new list is created each time. With fromkeys, though, the argument is evaluated only once, before the call, so only one list is created.
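Putting it together, here is a rough sketch of how the whole loop might look with the comprehension in place. The function name `add_scheme` and the mixed sample data are hypothetical, chosen only for illustration. It also swaps the scheme check: `'http' in url is True` is a chained comparison, equivalent to `('http' in url) and (url is True)`, which is always false for a string. The sketch assumes a URL "has a scheme" when it starts with `http://` or `https://`, which may not match every case you care about.

```python
def add_scheme(dct, scheme='http://'):
    """Return a new dict whose scheme-less URLs are prefixed with `scheme`."""
    # The comprehension builds a fresh, independent list for each key.
    hdct = {key: [] for key in dct}
    for key, urls in dct.items():
        for url in urls:
            # Assumption: only http:// and https:// count as "having a scheme".
            if url.startswith(('http://', 'https://')):
                hdct[key].append(url)
            else:
                hdct[key].append(scheme + url)
    return hdct


# Illustrative input: one URL already carries a scheme.
dct = {'31212': ['websitea', 'http://websiteb'], '17759': ['websitec']}
result = add_scheme(dct)
print(result)
```

Each key now keeps only its own URLs, and the input dict is left unmodified.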

Carcigenicate