Python Dictionary combining by current expression

Question

I have two lists of dictionaries:

x = [{'policy': 'a-b-windows2007',  'starttime': '4', 'duration': '5'}, 
     {'policy': 'ab-server2012', 'starttime': '4', 'duration': '5'}, 
     {'policy': 'Aa-windows', 'starttime': '4', 'duration': '5'}]

y = [{'policy': 'Windws a-b-windows2007', 'total_hosts': '160'},
     {'policy': 'Windows ab-server2012', 'total_hosts': '170'},
     {'policy': 'Windows Aa-windows', 'total_hosts': '180'}]

I want to build one list by combining x and y wherever the policy in x matches the policy in y. I have written a regex, but I am struggling with how to merge the dictionaries.

x and y are not the same length.

My attempt so far:

 for key in x:
     for keys in y:
         if key['policy'] == re.match('[0-9]+|\b[a-z-]+(\d)',keys['policy']):
             z.update(y)

Wanted output:

z=[{policy: 'a-b-windows2007',starttime: '4', duration: '5',total_hosts:'160'}, 
   {policy: 'ab-server2012',starttime: '4', duration: '5',total_hosts:'170'}, 
   {policy: 'Aa-windows',starttime: '4', duration: '5',total_hosts:'180'}]

What do you do if there is a policy in one that is not in the other? — Patrick Haugh, Nov 15 '16 at 14:33
@MooingRawr check it out now. The thing is that policy: 'Windows Aa' == policy: 'Aa' — Maltesse, Nov 15 '16 at 15:04
a has no relation to Aa... "Aa" is equal to "Windows Aa", "a" is equal to "Windows a" @MooingRawr — Maltesse, Nov 15 '16 at 15:08
@Maltesse Then your output doesn't match what you want: `{policy: 'a',starttime: '4', duration: '5',total_hosts:'160'}` why did this get an `'a' total_host` value? — MooingRawr, Nov 15 '16 at 15:10
What you want is called a join operation. Also, please describe better how policies should be compared. They're not equal and probably contain typos. Have a look at pandas: the code will be faster and more concise. — George Sovetov, Nov 15 '16 at 15:14
@GeorgeSovetov I have edited the code, now it should be clearer — Maltesse, Nov 15 '16 at 15:22

score 2 · Accepted Answer · answered Nov 15 '16 at 15:33

Your regex wasn't working for me, so here's a nested for loop solution. It assumes that your policy follows the format <windows> <version_number>: we split the policy value on the space and take the version_number part to compare. You can easily convert it to a comprehension if you wish.

x = [{'policy': 'a-b-windows2007',  'starttime': '4', 'duration': '5'}, 
     {'policy': 'ab-server2012', 'starttime': '4', 'duration': '5'}, 
     {'policy': 'Aa-windows', 'starttime': '4', 'duration': '5'}]

y = [{'policy': 'Windows a-b-windows2007', 'total_hosts': '160'},
     {'policy': 'Windows ab-server2012', 'total_hosts': '170'},
     {'policy': 'Windows Aa-windows', 'total_hosts': '180'}]

for x_dict in x:
    for y_dict in y:
        if x_dict['policy'] == y_dict['policy'].split(' ')[1]:
            if "total_hosts" in x_dict:
                x_dict["total_hosts"].append(y_dict["total_hosts"])
            else:
                x_dict["total_hosts"] = y_dict["total_hosts"]

print(x)

Gives:

[{'starttime': '4', 'duration': '5', 'policy': 'a-b-windows2007', 'total_hosts': '160'}, 
{'starttime': '4', 'duration': '5', 'policy': 'ab-server2012', 'total_hosts': '170'}, 
{'starttime': '4', 'duration': '5', 'policy': 'Aa-windows', 'total_hosts': '180'}]

This solution updates the x list in place. If you want a new list without changing x, make a deep copy of x called z (a shallow copy would still share the inner dictionaries) and replace x with z in the for loops.
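A minimal sketch of that non-mutating variant, assuming the same "<prefix> <name>" policy format as above. The name z follows the question, and the sample data is trimmed for brevity:

```python
import copy

x = [{'policy': 'a-b-windows2007', 'starttime': '4', 'duration': '5'},
     {'policy': 'ab-server2012', 'starttime': '4', 'duration': '5'}]
y = [{'policy': 'Windows a-b-windows2007', 'total_hosts': '160'},
     {'policy': 'Windows ab-server2012', 'total_hosts': '170'}]

# Deep-copy x so the original dictionaries are left untouched.
z = copy.deepcopy(x)

for z_dict in z:
    for y_dict in y:
        # Compare against the last word of y's policy ("Windows <name>").
        if z_dict['policy'] == y_dict['policy'].split(' ')[-1]:
            z_dict['total_hosts'] = y_dict['total_hosts']

print(z)
```

After this runs, z carries the total_hosts values while the dictionaries in x stay as they were.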

Umm, no it's not? x is a list of dictionary objects, and it matches your output. I really don't understand what you are asking now... @Maltesse — MooingRawr, Nov 15 '16 at 15:45
No, you are right, that was just my typo. Thank you! — Maltesse, Nov 15 '16 at 15:49

score 1 · Answer 2 · edited May 23 '17 at 10:30

Try this.

merge_dicts is a necessary evil from here.

I assume that Windws is a typo. Otherwise, you have to specify the join condition more clearly.

Turning y into an indexed dictionary yields a good performance gain over nested for loops.

def merge_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

y_indexed = {e['policy']: e for e in y}
joined = [
    merge_dicts(y_indexed['Windows ' + e['policy']], e)
    for e in x]

Consider using pandas if you have lots of such dicts.

score 1 · Answer 3 · answered Nov 15 '16 at 15:39

You don't really need a regular expression in this particular case; but it's not hard to modify the code to include one.

You can do something similar to this:

l=[]
for xItem in x:
  for yItem in y:
    if yItem['policy'].endswith(xItem['policy']):
      tmpItem=xItem
      tmpItem['total_hosts'] = yItem['total_hosts']
      l.append(tmpItem)

This is a bit inefficient. Sorting the lists beforehand will help, but only if the lists are large enough for the sorting time to be amortized.
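If a regular expression is preferred, one possible sketch is below. The pattern \S+$ (the last run of non-whitespace characters) is an assumption about the policy format, not the asker's original regex, and last_token is a hypothetical helper name. Copying each x item also leaves x unchanged:

```python
import re

x = [{'policy': 'a-b-windows2007', 'starttime': '4', 'duration': '5'},
     {'policy': 'Aa-windows', 'starttime': '4', 'duration': '5'}]
y = [{'policy': 'Windws a-b-windows2007', 'total_hosts': '160'},
     {'policy': 'Windows Aa-windows', 'total_hosts': '180'}]

# Assumed format: the policy name is the last whitespace-separated word.
LAST_TOKEN = re.compile(r'\S+$')

def last_token(policy):
    """Return the last whitespace-separated word of policy, or None."""
    match = LAST_TOKEN.search(policy)
    return match.group(0) if match else None

merged = []
for x_item in x:
    for y_item in y:
        if last_token(y_item['policy']) == x_item['policy']:
            item = dict(x_item)  # copy, so x is not modified
            item['total_hosts'] = y_item['total_hosts']
            merged.append(item)

print(merged)
```

Because only the last word is compared, the misspelled 'Windws' prefix in the question's data does not prevent a match.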

score 0 · Answer 4 · Patrick Haugh · Nov 15 '16 at 15:38

This assumes that the lists are the same length and that every element in x has a corresponding element in y.

Sort the lists so the matching dictionaries share an index, then zip the two together. Then use itertools.chain to feed them to the dict constructor.

import itertools
x.sort(key=lambda x: x['policy'])
y.sort(key=lambda x: x['policy'])
z = [dict(itertools.chain(a.items(), b.items())) for a, b in zip(x, y)]

I think on more recent versions of Python (3.5 and later) you can do dict(**a, **b) or {**a, **b}, but I'm using 3.3 on this computer so I can't check.
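For reference, a small sketch of the dictionary-unpacking merge, which is available from Python 3.5 onward (PEP 448). When both dictionaries share a key, the rightmost one wins, so the order decides which policy string survives. The values are taken from the question:

```python
a = {'policy': 'Aa-windows', 'starttime': '4', 'duration': '5'}
b = {'policy': 'Windows Aa-windows', 'total_hosts': '180'}

# b is unpacked first, so a's shorter 'policy' value overrides it.
merged = {**b, **a}
print(merged['policy'])      # Aa-windows

# Reversed order keeps the prefixed policy from b instead.
print({**a, **b}['policy'])  # Windows Aa-windows
```

The itertools.chain version above behaves like the second form, because the items of b come last.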

Another way of doing it would be to convert y, the list that has no duplicate policies, into a dictionary.

y_dict = {d['policy'].split()[-1]: d for d in y}

.split()[-1] will give us the last word of the policy entry. Then we can go through x to build our new list.

z = []
for d in x:
    new_dict = {k:v for k,v in d.items()}
    # .items() is needed here: iterating over a dict directly yields only its keys
    new_dict.update({k:v for k, v in y_dict[d['policy']].items() if k != 'policy'})
    z.append(new_dict)
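Since the comments mention entries in x with no counterpart in y, a hedged variant that avoids a KeyError might look like this. Using dict.get with an empty default is an assumption about the desired behaviour (unmatched entries are kept without total_hosts), and the 'no-match-policy' entry is made up for illustration:

```python
x = [{'policy': 'a-b-windows2007', 'starttime': '4', 'duration': '5'},
     {'policy': 'no-match-policy', 'starttime': '4', 'duration': '5'}]
y = [{'policy': 'Windows a-b-windows2007', 'total_hosts': '160'}]

# Index y by the last word of its policy for constant-time lookups.
y_dict = {d['policy'].split()[-1]: d for d in y}

z = []
for d in x:
    new_dict = dict(d)
    # Assumption: a missing match contributes nothing instead of raising.
    match = y_dict.get(d['policy'], {})
    new_dict.update({k: v for k, v in match.items() if k != 'policy'})
    z.append(new_dict)

print(z)
```

If unmatched entries should be dropped instead, skipping the append when the lookup returns nothing would be one way to do it.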

edited Nov 15 '16 at 15:38

answered Nov 15 '16 at 14:41

Patrick Haugh

They are not the same size unfortunately. Each key has corresponding value. – Maltesse Nov 15 '16 at 14:50
Does every `x` have exactly one `y`? Does every `y` have exactly one `x`? Are there some policy values that have duplicates in both lists? – Patrick Haugh Nov 15 '16 at 14:53
policy in dict x: compliance-smth policy in dict y: Windows compliance-smth They are the same if I am using expression result = re.match('[0-9]+|\b[a-z-]+(\d)', y['policy']) – Maltesse Nov 15 '16 at 14:56
So which policy do you want the end result to have? – Patrick Haugh Nov 15 '16 at 15:20
Check the edit, I just explained better what it should be – Maltesse Nov 15 '16 at 15:23
Are there entries in `x` without corresponding entries in `y`, or entries in `y` without corresponding entries in `x`? – Patrick Haugh Nov 15 '16 at 15:24
entries in x without corresponding entries in y. But the policies are the same so it should just copy the number of hosts – Maltesse Nov 15 '16 at 15:28

score 0 · Answer 5 · answered Nov 15 '16 at 16:27

In your example, the endswith method is much simpler (and probably more robust) than a regex.

z = []  # must be a list: a dict has no append method
for key in x:
    print(key['policy'])
    for keys in y:
        print(keys['policy'])
        if keys['policy'].endswith(key['policy']):
            kz = key.copy()  # copy to avoid any change in x
            kz['total_hosts'] = keys['total_hosts']
            z.append(kz)
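One caveat worth checking against your real data: endswith also matches when one policy name is merely a suffix of another, which is the 'a' versus 'Aa' case raised in the question's comments. A small sketch of the difference, assuming the policy name is the last whitespace-separated word (the sample strings are illustrative):

```python
y_policy = 'Windows Aa'

# Suffix test: 'a' is a suffix of 'Windows Aa', so this is a false match.
suffix_match = y_policy.endswith('a')
print(suffix_match)   # True

# Comparing the last word exactly avoids that.
exact_wrong = y_policy.split()[-1] == 'a'
exact_right = y_policy.split()[-1] == 'Aa'
print(exact_wrong)    # False
print(exact_right)    # True
```

With the question's sample data no policy is a suffix of another, so both checks give the same result there.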
