
I have a dictionary with 100 Cluster objects. Each cluster has several Member objects, and I need to add each member to the cluster it belongs to. My problem is that every Member is being added to every Cluster, and I can't figure out why. Here's the code:

self.clusters = {}

with open('/tmp/numpy_dumps/kmeansInput.txt.cluster_centres') as f:
    for line in f:
        cluster = Cluster(line)
        self.clusters[cluster.id] = cluster

with open('/tmp/numpy_dumps/kmeansInput.txt.membership') as f:
    for line in f:
        member = Member(line, self.reps)
        self.clusters[member.clusterId].members[member.imageId] = member

for id, cluster in self.clusters.items():
    print(cluster)
    print(cluster.members)
    print('cluster {} has {} members'.format(id, len(cluster.members)))

The output tells me that every cluster contains all the members.

SkarXa

1 Answer


The problem is almost certainly in the Cluster class, which you didn't post in your snippet. It's a bit of a wild guess, but this behaviour is typical of shared attributes: either class attributes or mutable default arguments. If your Cluster class looks like one of the snippets below, then look no further:

# class attributes:

class Cluster(object):
    members = {} # this will be shared by all instances

# solution:

class Cluster(object):
    def __init__(self):
        self.members = {} # this will be per instance
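
A quick way to confirm this diagnosis is to check whether two instances hold the very same dict object. The sketch below uses the hypothetical class names `SharedCluster` and `FixedCluster` (stand-ins, not the asker's actual `Cluster`) to show the symptom and the fix side by side:

```python
class SharedCluster(object):
    members = {}  # class attribute: one dict shared by every instance


class FixedCluster(object):
    def __init__(self):
        self.members = {}  # instance attribute: a fresh dict per instance


a, b = SharedCluster(), SharedCluster()
a.members['img1'] = 'member'  # mutates the single class-level dict
shared_same_object = a.members is b.members
shared_len_b = len(b.members)

c, d = FixedCluster(), FixedCluster()
c.members['img1'] = 'member'  # only touches c's own dict
fixed_same_object = c.members is d.members
fixed_len_d = len(d.members)

print(shared_same_object, shared_len_b)  # True 1
print(fixed_same_object, fixed_len_d)    # False 0
```

If `cluster_a.members is cluster_b.members` evaluates to `True` for two of your own clusters, they are sharing one dict.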



# default mutable argument:

class Cluster(object):
    def __init__(self, members={}):
        # this one is a well-known gotcha:
        # the default for the `members` arg is eval'd only once
        # so all instances created without an explicit
        # `members` arg will share the same `members` dict
        self.members = members

# solution:

class Cluster(object):
    def __init__(self, members=None):
        if members is None:
            members = {}
        self.members = members
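
To see the mutable-default variant in action, here is a minimal sketch. The class names `DefaultDictCluster` and `NoneDefaultCluster` are hypothetical and only illustrate the two signatures above. It also inspects `__defaults__`, where Python stores the default object that was evaluated once:

```python
class DefaultDictCluster(object):
    def __init__(self, members={}):  # default dict is created once, at def time
        self.members = members


class NoneDefaultCluster(object):
    def __init__(self, members=None):
        if members is None:
            members = {}  # a new dict on every call
        self.members = members


a, b = DefaultDictCluster(), DefaultDictCluster()
a.members['img1'] = 'member'
leaked = dict(b.members)  # b sees the entry added through a
stored_default = DefaultDictCluster.__init__.__defaults__[0]
default_is_shared = stored_default is a.members

c, d = NoneDefaultCluster(), NoneDefaultCluster()
c.members['img1'] = 'member'
isolated = dict(d.members)  # d is unaffected

print(leaked)             # {'img1': 'member'}
print(default_is_shared)  # True
print(isolated)           # {}
```

Passing an explicit dict to either constructor avoids the sharing, which is why the bug only shows up for instances created without the `members` argument.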
bruno desthuilliers