python 3 dictionary comprehension: Advanced

Question

Is it possible to convert this to a dict comprehension? How would I know what what can be converted and what cannot be?

for cpPtr in cpdomains:  
            domainKeys = self.domainSplitter(cpPtr.split(" ")[1])  
                for domainKey in domainKeys:  
                    if domainKey in domainAggregator.keys():  
                        domainAggregator[domainKey] = 
                         int(domainAggregator[domainKey]) + int(cpPtr.split(" ")[0])  
                    else:  
                        domainAggregator[domainKey] = int(cpPtr.split(" ")[0])

it would be so much easier to help you if you have included sample of the data. Or at least what does the final result look like. — GSazheniuk, Apr 04 '18 at 01:21
All of this can be done in a dict comprehension. But why would you want to do that? — Aran-Fey, Apr 04 '18 at 01:22
No, you can't really do this with a dict comprehension. In a dictionary comprehension, each key corresponds to one value. Here, you want to collect the sum of the values for a particular key, and the code you have doesn't care whether these values are anywhere near each other. (If they were, there might be a way to gather them up and sum them in one step, and this might make a dict comprehension possible.) — kindall, Apr 04 '18 at 01:26
If you are just looking to simplify this code, turn `domainAggregator` into a `collections.Counter` object and use `Counter.update` with a dictionary of counts. — Blender, Apr 04 '18 at 01:31

score 2 · Answer 1 · answered Apr 06 '18 at 15:52

Indeed, you can use a dict comprehension, but you shouldn't. If you are ready, let's go.

I will start with a little cleanup of your code.

First, I replace cpPtr.split(" ")[0] and cpPtr.split(" ")[1] with u and v, where u,v = cpPtr.split(" "). It's a destructuring assignement.

Then I inline the following lines:

if domainKey in domainAggregator.keys():
    domainAggregator[domainKey] = int(domainAggregator[domainKey]) + int(u)
else:
    domainAggregator[domainKey] = int(u)

into:

domainAggregator[domainKey] = int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)

I get now the following code:

for cpPtr in cpdomains:
    u,v = cpPtr.split(" ")
    for domainKey in self.domainSplitter(v):
        domainAggregator[domainKey] =  int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)

and I want to turn it in a dict comprehension.

Now, some little hacks.

You can't use assignements inside a list or dict comprehension. In order to put that in a list comprehension (and then a dict comprehension, see later), I need to get rid of =s:

u, v = cpPtr.split(" ")
...
    domainAggregator[domainKey] =  int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)

Becomes:

for u, v in [cpPtr.split(" ")]: # this is a singleton
    ...
    domainAggregator.update({domainKey:int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)}) # I update the dict

The code is now:

for cpPtr in cpdomains:
    for u,v in [cpPtr.split(" ")]:
        for domainKey in self.domainSplitter(v):
            domainAggregator.update({domainKey:int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)})

Now, I can use a list comprehension to create a side-effect. You just have to insert everything inside square brackets:

[domainAggregator.update({domainKey:int(domainAggregator[domainKey])+int(u) if domainKey in domainAggregator.keys() else int(u)}) for cpPtr in cpdomains for u,v in [cpPtr.split(" ")] for domainKey in self.domainSplitter(v)]

Okay, after the evaluation of this expression, domainAggregator has the expected value, but the list is only a list of Nones, since dict.update returns None.

The following codes are totally insane. Do not try to do this at home!!!

Is it possible to create a dict comprehension that returns the right value? Yes! You have to create the side effect inside the dict comprehension. Let's try this:

# does not work
{k:v for k, v in domainAggregator if len( [domainAggregator.update({domainKey:(int(domainAggregator[domainKey]) if domainKey in domainAggregator.keys() else 0) + int(u)}) for cpPtr in cpdomains for u,v in [cpPtr.split(" ")] for domainKey in self.domainSplitter(v)])>=0}

The expression:

len([domainAggregator.update({domainKey:(int(domainAggregator[domainKey]) if domainKey in domainAggregator.keys() else 0) + int(u)}) for cpPtr in cpdomains for u,v in [cpPtr.split(" ")] for domainKey in self.domainSplitter(v))>=0

is always True and you have the desired side-effect. But this side-effect is performed each time you iterate over the items, and you get the infamous RuntimeError: dictionary changed size during iteration.

To avoid this, you have to trigger the side effect only once.

{k:v for da in [domainAggregator for _ in range(1) if len ([domainAggregator.update({domainKey:(int(domainAggregator[domainKey]) if domainKey in domainAggregator.keys() else 0) + int(u)}) for cpPtr in cpdomains for u,v in [cpPtr.split(" ")] for domainKey in self.domainSplitter(v)])>=0] for k,v in da.items()}

The inner list tiggers the side effect and returns the dict, the outer dict comprehension iterates over the items and creates the dict.

Do not do this!!!

A MVCE: Try it online!

score 1 · Answer 2 · answered Apr 04 '18 at 01:30

We can't really turn this into a dictionary comprehension, for the reason I explained in my comment, but we can certainly simplify what you have. Using get() to access the domainAggregator element, we can provide a default of 0 if the key doesn't exist, eliminating that if statement.

for cpPtr in cpdomains:  
    domainKeys = self.domainSplitter(cpPtr.split(" ")[1])  
    for domainKey in domainKeys:
         domainAggregator[domainKey] = (
             domainAggregator.get(domainKey, 0) + int(cpPtr.split(" ")[0]))

python 3 dictionary comprehension: Advanced

2 Answers2