
Let's say I have a dataset ('test.csv') like so:

Name,Fruit,Price
John,Apple,1.00
Steve,Apple,1.00
John,Mango,2.00
Adam,Apple,1.00
Steve,Banana,1.00

Although there are several easier ways to do this, I would like to organize this information as a class in Python. Ideally, an instance of the class would look like:

{'name': 'John', 'Fruits': ['Apple','Mango'], 'Price':[1.00, 2.00]}

My approach to loading the dataset into a class is to store each instance in a list.

class org(object):
    def __init__(self,name,fruit,price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

classes = []
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:
            linesp=line.rstrip().split(',')
            name = linesp[0]
            fruit = linesp[1]
            price = linesp[2]
            inst = org(name,fruit,price)
            classes.append(inst)
for c in classes:
    print (c.__dict__)

  1. In this case, how do I know if 'John' already exists as an instance?

  2. If so, how do I update 'John'? With a classmethod?

@classmethod
def update(cls, value):
    cls.fruit.append(fruit)
Mark Rotteveel
Rob
  • It's not exactly what you're asking but it can be modified to your needs: [Creating a singleton](https://stackoverflow.com/questions/6760685/creating-a-singleton-in-python). In your case you don't exactly want "only one instance" (singleton) but you do want "only one instance of each type", so see if those ideas help – Ofer Sadan Jun 15 '18 at 06:07
  • Are you open to using a dictionary instead of a list? – Mad Physicist Jun 15 '18 at 06:32
  • You really need to fix your indentation. The code you posted is extremely difficult to read, and not valid Python. – Mad Physicist Jun 15 '18 at 06:50

1 Answer


There's no need for anything special to update your instances. Your class's attributes are public, so you can access them directly on the instance to update them.
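As a minimal sketch of that point, reusing the `org` class from the question (the values passed in are just the sample rows for John): appending to the instance's lists is all the "update" amounts to. A classmethod would receive the class rather than the instance, and `org` itself has no `fruit` attribute, so `cls.fruit` would raise an `AttributeError`.

```python
class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

john = org('John', 'Apple', '1.00')

# Plain attribute access on the instance is enough; no classmethod needed.
john.fruit.append('Mango')
john.price.append('2.00')

print(john.__dict__)
```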

If you insist on using a list as your instance container, you could do something like this:

classes = []
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:
            name,fruit,price=line.rstrip().split(',')
            exists = [inst for inst in classes if inst.name == name]
            if exists:
                exists[0].fruit.append(fruit)
                exists[0].price.append(price)
            else:
                classes.append(org(name,fruit,price))
for c in classes:
    print (c.__dict__)

However, I recommend using a dict instead, because it makes looking up and accessing the instances easier:

classes = {}
with open('test.csv') as f:
    for line in f:
        if not 'Name' in line:
            name,fruit,price=line.rstrip().split(',')
            if name in classes:
                classes.get(name).fruit.append(fruit)
                classes.get(name).price.append(price)
            else:
                classes.update({name: org(name,fruit,price)})

for c in classes.values():
    print (c.__dict__)

Both solutions will give you the same thing:

{'name': 'John', 'fruit': ['Apple', 'Mango'], 'price': ['1.00', '2.00']}
{'name': 'Steve', 'fruit': ['Apple', 'Banana'], 'price': ['1.00', '1.00']}
{'name': 'Adam', 'fruit': ['Apple'], 'price': ['1.00']}

For the sake of completeness: what @MadPhysicist probably means in the comments below by a clunky way to update the dict is that I use the dict's methods (`get` and `update`) instead of accessing the items by subscription.

# update existing instance in the dict
classes[name].fruit.append(fruit)

# add new instance to the dict
classes[name] = org(name, fruit, price)

I personally just find that somewhat ugly, hence I tend to use the methods :)
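The outputs above hold the prices as strings, while the question's desired instance has numbers. A possible variation, sketched here under a few assumptions, parses the file with the standard `csv` module and converts the price with `float()`. The `io.StringIO` object is only a stand-in for `open('test.csv')` so that the example is self-contained; the dict loop is otherwise the same as above.

```python
import csv
import io

class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

# Stand-in for open('test.csv'); the rows are the sample data from the question.
sample = io.StringIO(
    "Name,Fruit,Price\n"
    "John,Apple,1.00\n"
    "Steve,Apple,1.00\n"
    "John,Mango,2.00\n"
    "Adam,Apple,1.00\n"
    "Steve,Banana,1.00\n"
)

classes = {}
# DictReader consumes the header row and uses it as the keys of each row,
# so no "'Name' in line" check is needed.
for row in csv.DictReader(sample):
    name = row['Name']
    fruit = row['Fruit']
    price = float(row['Price'])  # store the price as a number, not a string
    if name in classes:
        classes.get(name).fruit.append(fruit)
        classes.get(name).price.append(price)
    else:
        classes.update({name: org(name, fruit, price)})

for c in classes.values():
    print(c.__dict__)
```

With this variation the `price` lists contain floats, so they can be summed or compared numerically.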

shmee
  • The idea to use a dict is correct. +1 for that. The way you update the dict is very clunky. – Mad Physicist Jun 15 '18 at 06:53
  • @MadPhysicist Care to elaborate why you consider it clunky? :) – shmee Jun 15 '18 at 06:56
  • If you're going to check for containment, use direct indexing instead of get to get the item. Also, it probably won't matter here, but it's much more efficient to retrieve the reference once and do two things to it than to look it up every time you want to do something to it. You may want to take a look at `dict.setdefault`, but you'll need a constructor to make an empty instance. – Mad Physicist Jun 15 '18 at 07:44
  • If you're going for efficiency, using a defaultdict in the first place, instead of dict.setdefault is supposed to be faster. But, as you said, it comes with the downside that the constructor would need changes to accept instantiation without arguments. If you wanted to keep the class the way it is, I'd argue that doing `exists = classes.get(name)`, update exists if it is True and otherwise update the dict with a new instance would be better because you keep the maximum number of lookups at 1. – shmee Jun 15 '18 at 08:36
  • Also, why `classes.update({name: org(name,fruit,price)})` instead of just `classes[name] = org(...)`? – Mad Physicist Jun 15 '18 at 13:11
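The alternatives raised in the comments above could look roughly like the sketch below. It is only an illustration: `rows` is a hard-coded stand-in for the parsed file, and `Org` with its `add` method is a hypothetical variant of `org` whose constructor works without a fruit and price, which is the change `dict.setdefault` and `defaultdict` would require.

```python
from collections import defaultdict

class org(object):
    def __init__(self, name, fruit, price):
        self.name = name
        self.fruit = [fruit]
        self.price = [price]

class Org(object):
    """Hypothetical variant of org that can be created empty."""
    def __init__(self, name=None):
        self.name = name
        self.fruit = []
        self.price = []

    def add(self, fruit, price):
        self.fruit.append(fruit)
        self.price.append(price)

# Stand-in for the rows read from test.csv.
rows = [('John', 'Apple', '1.00'), ('Steve', 'Apple', '1.00'),
        ('John', 'Mango', '2.00'), ('Adam', 'Apple', '1.00'),
        ('Steve', 'Banana', '1.00')]

# 1) Original class, one lookup per row via dict.get.
classes = {}
for name, fruit, price in rows:
    inst = classes.get(name)
    if inst is None:
        classes[name] = org(name, fruit, price)
    else:
        inst.fruit.append(fruit)
        inst.price.append(price)

# 2) dict.setdefault: returns the existing instance or stores the new one.
#    Note that Org(name) is constructed on every row, even when unused.
by_name = {}
for name, fruit, price in rows:
    by_name.setdefault(name, Org(name)).add(fruit, price)

# 3) defaultdict: the factory is called without arguments on a missing key,
#    so the name has to be set afterwards.
records = defaultdict(Org)
for name, fruit, price in rows:
    rec = records[name]
    rec.name = name
    rec.add(fruit, price)

for c in classes.values():
    print(c.__dict__)
```

All three loops build the same per-name data; which one reads best is largely a matter of taste, as the comment thread suggests.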