using dict.update is overwriting the string of the key itself AND the key's value

Question

I am working on a CLI for AWS, and I'm trying to collect every EC2 instance, across all regions, into a single dict.

    def get_all_ec2_instances_in_all_regions(self):
        ec2_instance_list = {}
        region_list = list_ec2_instance_regions() #this returns a list of regions to iterate on
        print('Finding Regions')
        for region in region_list:
            region_name = region['RegionName']
            ec2 = boto3.client('ec2', region_name=region_name)
            regional_instance_list_return = ec2.describe_instances()['Reservations']
            
            if len(regional_instance_list_return) == 0:
                ec2_instance_list.update({
                    region_name: regional_instance_list_return
                })

            for reservation in regional_instance_list_return:
                instance_id= reservation['Instances'][0]['InstanceId']
                ec2_instance_list.update({
                    region_name: {
                        instance_id: reservation['Instances'][0]
                    }
                })

        print('Region search complete')
        print(prettyPrintDict(ec2_instance_list)) #prettyPrintDict just console logs the dict in a nicer format for human readability

The resulting object has only a single entry per region, keyed by instance_id, even though regional_instance_list_return is actually a list with multiple objects in it.

I expected this code to add an entry for each InstanceId, like this:

{ region1_name: 
  { instance_id1: {instance1 data},
    instance_id2: {instance2 data},
    instance_id3: {instance3 data}
  }
  region2_name:
  { instance_id1: {instance1 data},
    instance_id2: {instance2 data},
    instance_id3: {instance3 data},
    instance_id4: {instance4 data}
  }
... and so on
}

but the resulting dict actually looks like this when it's finished:

{ region1_name: 
  { 
    instance_id3: {instance3 data}
  }
  region2_name:
  { 
    instance_id4: {instance4 data}
  }
... and so on
}

It doesn't actually add each instance; it just overwrites the instance_id key (which is unique for each instance) and the key's value.

I was under the impression that if a key is unique, dict.update() would just add it without overwriting anything. What am I doing wrong?

https://stackoverflow.com/questions/7204805/how-to-merge-dictionaries-of-dictionaries — jellycsc, Jun 25 '21 at 15:40

score 0 · Answer 1 · answered Jun 25 '21 at 15:49

dict.update takes keyword assignment arguments and converts the keyword into the dict key. You are telling it that the key is 'instance_id', not the value assigned to the variable instance_id. Instead of dict.update, try the syntax dict[key] = value, which appends a new key/value pair if key does not already exist.
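For reference, here is a small sketch of how the two call forms of dict.update pick their keys. The variable names and values are made up for illustration. Passing a dict, as the code in the question does, uses the value of the variable as the key; only the keyword form uses the literal name.

```python
# Hypothetical names and values, used only to illustrate the two call forms.
instance_id = "i-0abc"

by_keyword = {}
# Keyword form: the key is the literal string "instance_id".
by_keyword.update(instance_id="data")

by_mapping = {}
# Mapping form: the key is the value held by the variable, "i-0abc".
by_mapping.update({instance_id: "data"})

print(by_keyword)  # {'instance_id': 'data'}
print(by_mapping)  # {'i-0abc': 'data'}
```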

smp55

When I look at it in the debugger, it's assigning the value of `instance_id` as the key, as expected. But on the next loop iteration it replaces whatever the previous key/value pair was with a new one, and only leaves me with a result shaped like `{dict: 1}` even though it should be like `{dict: 36}` in some cases. I switched it over to what you suggested here and it was still overwriting the key/value pair. – glitchwizard Jun 25 '21 at 16:05

score 0 · Accepted Answer · answered Jun 25 '21 at 16:21

OK, so in the original code, dict.update was overwriting the region's value on every iteration, so I had to make a few small modifications to make sure the update happens in the right leaf of the object:

    def get_all_ec2_instances_in_all_regions(self):
        ec2_instance_list = {}
        region_list = self.list_ec2_instance_regions()
        print('Finding Regions')
        for region in region_list:
            region_name = region['RegionName']
            self.ec2 = boto3.client('ec2', region_name=region_name)
            regional_instance_list_return = self.ec2.describe_instances()['Reservations']
            # Create the region's entry whether or not it has instances.
            ec2_instance_list.update({
                region_name: {}
            })

            for reservation in regional_instance_list_return:
                instance_id = reservation['Instances'][0]['InstanceId']
                # Update the dict stored under region_name rather than
                # replacing it at the top level, as I had before.
                ec2_instance_list[region_name].update({
                    instance_id: reservation['Instances'][0]
                })
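As a variation on the same idea, here is a minimal sketch that runs without boto3. The function name `group_instances_by_region` and the `fake_reservations` data are hypothetical stand-ins for what `describe_instances()['Reservations']` returns per region. It uses `dict.setdefault` to create the inner dict once per region, and it loops over every instance in a reservation, since a reservation can hold more than one.

```python
def group_instances_by_region(reservations_by_region):
    """Build {region_name: {instance_id: instance_data}} from stand-in data."""
    instances_by_region = {}
    for region_name, reservations in reservations_by_region.items():
        # Create the inner dict once per region and reuse it afterwards.
        regional = instances_by_region.setdefault(region_name, {})
        for reservation in reservations:
            # A reservation can hold more than one instance, so visit them all.
            for instance in reservation["Instances"]:
                regional[instance["InstanceId"]] = instance
    return instances_by_region


# Hypothetical stand-in data; the region names and instance IDs are made up.
fake_reservations = {
    "us-east-1": [
        {"Instances": [{"InstanceId": "i-1"}, {"InstanceId": "i-2"}]},
        {"Instances": [{"InstanceId": "i-3"}]},
    ],
    "eu-west-1": [],
}

result = group_instances_by_region(fake_reservations)
print(result)
```

Assigning with `regional[key] = value` adds a key to the existing inner dict, so earlier instances in the same region are kept, and a region with no reservations still gets an empty dict.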

Can you give a minimum working example that displays the same behavior, but which doesn't rely on external function calls (i.e. list_ec2_instance_regions)? Maybe just define some variables to stand in for what those functions return? That makes it easier for others to replicate what you observe and try solutions. — smp55, Jun 29 '21 at 18:11
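A minimal reproduction along those lines might look like the sketch below. The region name and instance IDs are made-up stand-ins for the AWS responses. It shows that dict.update is shallow: each call replaces the whole value stored under `region_name`, so only the last instance survives.

```python
# Stand-in data; the region name and instance IDs are made up.
region_name = "region1"
instance_ids = ["id1", "id2", "id3"]

overwritten = {}
for instance_id in instance_ids:
    # Replaces the entire value stored under region_name on every pass,
    # because update() does not merge nested dicts.
    overwritten.update({region_name: {instance_id: {"InstanceId": instance_id}}})

print(overwritten)  # {'region1': {'id3': {'InstanceId': 'id3'}}}
```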
