0

Given the two following lists, one containing strings, one integers, how can I merge these two lists into a dictionary while ADDING the values for duplicate keys?

stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]

integerlist = [1, 2, 12, 4, 5]

So in the final dictionary I'd like EL1 to be 13, because it is paired with both 1 and 12.

resultdictionary = {}
for key in appfinal:
    for value in amountfinal:
        resultdictionary[key] = value
        amountfinal.remove(value)
        break

In this case, the result dictionary drops the duplicate keys but keeps only the last value that matches each key. So, EL1 would be 12.

Any ideas? Thank you.

nsnro
  • 1
    Test if the dictionary already contains the key. If it does, add to the value instead of replacing it. – Barmar Jul 05 '22 at 15:57
  • 1
    Or use `defaultdict(int)` – Barmar Jul 05 '22 at 15:58
  • 1
    Don't use nested loops. Use `zip()` to iterate over both lists in parallel. – Barmar Jul 05 '22 at 15:58
  • FYI generally Python programmers, in contrast to Microsoft Win32 programmers, do not decorate their variable names with the variable's type. So, for example `resultdictionary` would typically be `result`. – jarmod Jul 05 '22 at 16:01
  • And don't remove stuff from `amountfinal` as you go; [it causes problems](https://stackoverflow.com/q/6260089/364696) if you change the size of a `list` while iterating it (not to mention being a lot slower; every `.remove` call is `O(n)`, and you do it `n` times, making the total work `O(n²)`, when bulk-clearing at the end with `amountfinal.clear()` would only require `O(n)` work total). – ShadowRanger Jul 05 '22 at 16:03
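The comments above can be combined into a short sketch: iterate over both lists in parallel with `zip()` and test whether the key is already present before adding. The name `totals` is illustrative and not from the question's code, and nothing is removed from either list while looping.

```python
stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]
integerlist = [1, 2, 12, 4, 5]

totals = {}  # illustrative name for the result dictionary
for key, value in zip(stringlist, integerlist):
    if key in totals:
        # Key seen before: add to the running total instead of replacing it.
        totals[key] += value
    else:
        # First time this key appears: start its total.
        totals[key] = value

print(totals)  # {'EL1': 13, 'EL2': 2, 'EL3': 4, 'El4': 5}
```

Note that "El4" stays a separate key from any "EL4", because dictionary keys are case-sensitive.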

4 Answers

3

Use defaultdict() to create a dictionary that automatically creates keys as needed.

Use zip() to loop over the two lists together.

from collections import defaultdict

resultdictionary = defaultdict(int)
for key, val in zip(stringlist, integerlist):
    resultdictionary[key] += val
Barmar
  • For the specific case of a `dict` with numeric values, `collections.Counter` *might* be better (it depends on the use case). From what the OP gave us, `defaultdict(int)` is more performant (it's entirely implemented in C on CPython, `Counter` is a Python-level subclass of `dict` with a few C accelerators tossed in), but `Counter` might be more self-documenting or be more useful for other reasons (counting iterables, not creating keys by simple access, etc.). This is still the right answer, just noting `collections.Counter` so folks who see it know there are two reasonable options here. – ShadowRanger Jul 05 '22 at 16:07
  • @ShadowRanger This isn't counting, it's summing. Can `collections.Counter` do that? – Barmar Jul 05 '22 at 16:07
  • It has no special support for it, but it works the same as `defaultdict(int)` when you are doing `+=` like this (the `__missing__` implementation for `Counter` is literally just `return 0`, without inserting the key, so `resultdictionary[key] += val` will invoke `__missing__`, get `0`, leaving the `Counter` unchanged, add `val` to that `0`, *then* create the key-value mapping; `defaultdict(int)` would differ in that it would map `key` to `0` immediately, return the `0`, then add `val` and remap `key` to `0+val`). Net effect is identical either way. – ShadowRanger Jul 05 '22 at 16:12
  • In that case I don't see the point. The benefit of `collections.Counter()` is when you can just write `Counter(sequence)` and you don't have to write your own loop. – Barmar Jul 05 '22 at 16:14
  • Agreed, I mention it only because many times you need to do both summing and counting, or you don't want merely doing `mydict[x]` to create a mapping from `x` to `0` (you want it to report `0` but not actually bloat the dictionary), or you want to figure out the most common elements after summation with `.most_common`, combine or subtract counts, etc. As I said, for the OP's case, `defaultdict(int)` covers all their described needs, but `Counter` is *often* useful in cases like this. – ShadowRanger Jul 05 '22 at 16:16
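To make the comment thread above concrete, here is a minimal sketch comparing the two containers on the question's data. The variable names `by_counter` and `by_default` are illustrative only. It shows that the summed totals match, and that the difference discussed in the comments only appears when a missing key is read.

```python
from collections import Counter, defaultdict

stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]
integerlist = [1, 2, 12, 4, 5]

by_counter = Counter()         # illustrative names, not from the answer
by_default = defaultdict(int)
for key, val in zip(stringlist, integerlist):
    by_counter[key] += val
    by_default[key] += val

# Same totals either way.
print(dict(by_counter) == dict(by_default))  # True

# Reading a missing key reports 0 in both, but only defaultdict stores it.
_ = by_counter["missing"]
_ = by_default["missing"]
print("missing" in by_counter)  # False
print("missing" in by_default)  # True

# Counter extras such as most_common() work on the summed values.
print(by_counter.most_common(1))  # [('EL1', 13)]
```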
1

One possible solution is to use dict.get with a default value of 0. For example:

stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]
integerlist = [1, 2, 12, 4, 5]

resultdictionary = {}
for s, i in zip(stringlist, integerlist):
    resultdictionary[s] = resultdictionary.get(s, 0) + i

print(resultdictionary)

Prints:

{'EL1': 13, 'EL2': 2, 'EL3': 4, 'El4': 5}
Andrej Kesely
0

Zany solution for funsies:

stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]

integerlist = [1, 2, 12, 4, 5]

result = {}
result.update((k, result.get(k, 0) + v) for k, v in zip(stringlist, integerlist))
print(result)

It's almost a one-liner, but still needs result to be defined as an empty dict first, so it can use a genexpr that lazily checks the value summed so far as it goes.

I don't actually recommend this. For one thing, I'm not sure the language spec strictly requires that dict.update evaluate the argument provided lazily; if it tried to optimize by eagerly converting it to a dict first, then merging, this would fail. For another, it's easily broken (a maintainer might blindly convert the genexpr to a dictcomp or listcomp and now it's eager and broken). And the genexpr, while it technically has no side-effects itself, is relying on the side-effects of it being consumed by result.update, which is a distinctly non-functional design at odds with the functional style of genexprs.
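The fragility described above can be seen in a small sketch. The names `lazy` and `eager` are illustrative. The first result relies on how current CPython consumes the generator one pair at a time, which, as noted, may not be guaranteed by the language. The second shows what happens once the generator expression is turned into a list comprehension.

```python
stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]
integerlist = [1, 2, 12, 4, 5]

# Generator expression: each pair is computed only when update() asks for it,
# so lazy.get() sees the totals stored so far (observed CPython behavior).
lazy = {}
lazy.update((k, lazy.get(k, 0) + v) for k, v in zip(stringlist, integerlist))
print(lazy)

# List comprehension: every pair is built before update() runs, while the
# dict is still empty, so get() always returns 0 and nothing is summed.
eager = {}
eager.update([(k, eager.get(k, 0) + v) for k, v in zip(stringlist, integerlist)])
print(eager)  # {'EL1': 12, 'EL2': 2, 'EL3': 4, 'El4': 5}
```

In the eager version the later ("EL1", 12) pair simply overwrites ("EL1", 1), which is the same failure the question started with.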

E_net4
ShadowRanger
-1

The quickest way w/o imports would be like this:

stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]

integerlist = [1, 2, 12, 4, 5]

#Convert two Lists into Dictionary using zip()
mergeLists = dict(zip(stringlist, integerlist))
user3667054
  • That discards the first pairing, because the `"EL1": 1` mapping is *replaced* by, not summed with, the `"EL1": 12` mapping seen later. The OP requires summation, which can't be done like this. – ShadowRanger Jul 05 '22 at 17:45
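A quick check of the point made in the comment above, using the question's data. The names `merged` and `summed` are illustrative. `dict(zip(...))` keeps only the last value for a repeated key, while a loop with `dict.get` sums them.

```python
stringlist = ["EL1", "EL2", "EL1", "EL3", "El4"]
integerlist = [1, 2, 12, 4, 5]

# Later pairs overwrite earlier ones that share a key.
merged = dict(zip(stringlist, integerlist))
print(merged)  # {'EL1': 12, 'EL2': 2, 'EL3': 4, 'El4': 5}

# Summing version, also without imports.
summed = {}
for key, value in zip(stringlist, integerlist):
    summed[key] = summed.get(key, 0) + value
print(summed)  # {'EL1': 13, 'EL2': 2, 'EL3': 4, 'El4': 5}
```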