
I have a class whose members are lists of numbers built by accumulating values from experimental data, like

    class MyClass:
        def __init__(self):
            self.container1 = []
            self.container2 = []
            ...

        def accumulate_from_dataset(self, dataset):
            for entry in dataset:
                self.container1.append(foo(entry))
                self.container2.append(bar(entry))
                ...

        def process_accumulated_data(self):
            '''Called when all the data is gathered.'''
            process1(self.container1)
            process2(self.container2)
            ...

Issue: it would be beneficial if I could convert all the lists into numpy arrays.


What I tried: the simple conversion

    self.container1 = np.array(self.container1)

works. However, if I try to convert several fields in one shot, like

    lists_to_convert = [self.container1, self.container2, ...]

    def converter(lists_to_convert):
        for lst in lists_to_convert:
            lst = np.array(lst)

nothing actually changes: the loop only rebinds its local variable to a new array, and the class members keep referring to the original lists.


I am thus wondering if there is a smart approach/workaround to handle the whole conversion process.

Any help appreciated

Acorbe

2 Answers


From The Pragmatic Programmer:

Ask yourself: "Does it have to be done this way? Does it have to be done at all?"

Maybe you should rethink your data structure? Maybe some dictionary or a simple list of lists would be easier to handle?
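
As a rough sketch of the dictionary idea (the class name `DictBasedAccumulator`, the `transforms` argument and the lambdas are made up for illustration and are not from the question), keeping all containers in one dict makes the bulk conversion a single comprehension:

```python
import numpy as np


class DictBasedAccumulator:
    """Hypothetical variant of MyClass that keeps its containers in a dict."""

    def __init__(self, transforms):
        # transforms maps a container name to the function applied to each entry
        self.transforms = transforms
        self.containers = {name: [] for name in transforms}

    def accumulate_from_dataset(self, dataset):
        for entry in dataset:
            for name, func in self.transforms.items():
                self.containers[name].append(func(entry))

    def to_arrays(self):
        # Replace every list with a numpy array in one pass
        self.containers = {
            name: np.array(values) for name, values in self.containers.items()
        }


acc = DictBasedAccumulator({"doubled": lambda x: 2 * x, "squared": lambda x: x * x})
acc.accumulate_from_dataset([1, 2, 3])
acc.to_arrays()
```

The trade-off is that processing code has to look containers up by key (`acc.containers["doubled"]`) instead of by attribute.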

Note that in the example presented, container1 and container2 are just transformations of the initial dataset. That looks like a good place for a list comprehension:

foo_data = [foo(d) for d in dataset]
# or even (in Python 3, map returns a lazy iterator, hence the list() call)
foo_data = list(map(foo, dataset))
# or a generator version
foo_data_iter = (foo(d) for d in dataset)

If you really want to operate on the instance variables as in the example, have a look at the built-in functions getattr, setattr, and hasattr.
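
A minimal sketch of that approach, assuming the attributes are addressed by name; the method `convert_to_arrays` and the sample values are hypothetical and only serve the illustration:

```python
import numpy as np


class MyClass:
    def __init__(self):
        # Sample values for the demonstration only
        self.container1 = [1.0, 2.0]
        self.container2 = [3.0, 4.0]

    def convert_to_arrays(self, names):
        """Hypothetical helper: convert the named list attributes to arrays."""
        for name in names:
            # Skip names that are not attributes of this instance
            if hasattr(self, name):
                setattr(self, name, np.array(getattr(self, name)))


obj = MyClass()
obj.convert_to_arrays(["container1", "container2", "missing"])
```

Because `setattr` rebinds the attribute on the instance itself, the change is visible everywhere the object is used, unlike rebinding a loop variable.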

Jakub M.
  • You are definitely right. It has been a design error. Now the class and the processing methods are unpleasantly large to change direction, though. – Acorbe Sep 02 '13 at 15:41
  • Abandon the ship! The code direction cannot be changed and it is heading for a sure maintenance disaster! :) – Jakub M. Sep 02 '13 at 15:44

There isn't an easy way to do this because, as you say, Python passes references by value: rebinding a name inside a function or a loop does not change what the original attribute refers to.
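
A small demonstration of that behaviour, using a hypothetical `Holder` class as a stand-in for `MyClass`:

```python
import numpy as np


class Holder:
    """Hypothetical stand-in for MyClass with two list attributes."""

    def __init__(self):
        self.container1 = [1, 2, 3]
        self.container2 = [4, 5, 6]


holder = Holder()

# Rebinding the loop variable only changes what the local name points to;
# the attributes on the instance still refer to the original lists.
for values in [holder.container1, holder.container2]:
    values = np.array(values)

still_a_list = isinstance(holder.container1, list)

# Assigning through the attribute rebinds the attribute itself.
holder.container1 = np.array(holder.container1)
now_an_array = isinstance(holder.container1, np.ndarray)
```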

You could add a to_numpy method in your class:

import numpy as np

class MyClass:
    def __init__(self):
        self.container1 = []
        self.container2 = []
        ...

    def to_numpy(self, container):
        # container is the attribute name as a string, e.g. "container1"
        values = getattr(self, container)
        setattr(self, container, np.array(values))
        ...

And then do something like:

obj = MyClass()

lists_to_convert = ["container1", "container2", ...]

def converter(lists_to_convert):
    for name in lists_to_convert:
        obj.to_numpy(name)

But it's not very pretty and this sort of code would normally make me take a step back and think about my design.

Mike Vella
  • I guess some sort of `getattr` is missing, though. `np.array(container)` can't work when `container` is a string, can it? – Acorbe Sep 02 '13 at 15:15
  • Don't call `__those_hidden__` functions, use `getattr` and `setattr` builtins – Jakub M. Sep 02 '13 at 15:32
  • @JakubM thanks for the comment, I think I agree with you but in your answer could you expand on why the builtins are particularly preferable to the hidden functions? – Mike Vella Sep 02 '13 at 15:37
  • That's what those builtins are there for :) This post will give you a good answer: http://stackoverflow.com/questions/1944625/what-is-the-relationship-between-getattr-and-getattr – Jakub M. Sep 02 '13 at 15:42