I'm trying to create a class that holds a list of structured data items, along with some methods for populating the list from files and IO devices.

I'm having a problem with my method that fills out a new data structure and appends it to a list. It's set up as a coroutine that fills a temporary structure with data received through `(yield)` expressions. When it's done, it appends the data to the list (e.g. `self.list.append(newdata)`). My problem is that this append happens by reference and I can't figure out how to initialize newdata to new memoryspace. What winds up happening is that every entry in the list points to the same data structure (e.g. `myclass.list[n] is myclass.list[m]` always yields `True`). Can anyone tell me how to make this work?

If I were writing in C++, I would just need to do `newdata = new mydatastructure;` after each loop iteration. I just can't figure out how to do this in Python. Am I way off course here?

kjgregory
  • `new_data = MyDataStructure()` which calls `__init__` internally – Benjamin Gruenbaum Aug 05 '13 at 18:04
  • Or, for a shorter explanation, `new` is a kind of implementation detail, and in Python classes can be used like object factories which hide the part about `new`. So `new` in Python is hidden. – Dietrich Epp Aug 05 '13 at 19:01
  • Could you post your code (or preferably a [short, self-contained, runnable, stripped-down version of your code that demonstrates the problem when you run it](http://sscce.org/))? – user2357112 Aug 05 '13 at 20:01

2 Answers

In C++, `new` is roughly shorthand for `mydatastructure* p = (mydatastructure*)malloc(sizeof(mydatastructure));` followed by a constructor call (or something like that; it's been a while). It allocates the appropriate amount of memory on the heap for your object and then runs the constructor to initialize that memory.

Python takes care of this for you. Technically, there is a similar hook in Python, called `__new__`, which controls how an instance is created, but you rarely need to override it on your own classes.

The initializer for Python objects is called `__init__`. When you call a class, as in `foo = Foo()`, `__new__` runs first to create the new instance, and `__init__` is then called on that instance to initialize it. So, when you construct objects in Python, memory for each one is allocated automatically, and each one is a distinct object. As Benjamin pointed out, the call syntax (`foo = Foo()`) is how you trigger `__init__` without ever typing `__init__()` yourself.
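A minimal sketch of that call order, assuming Python 3. The `Record` class and its `calls` log are hypothetical names used only for illustration; they are not from the question.

```python
class Record:
    """Hypothetical class used only to illustrate the call order."""
    calls = []  # class-level log of which special methods ran

    def __new__(cls, *args, **kwargs):
        Record.calls.append("__new__")
        return super().__new__(cls)   # create a fresh, empty instance

    def __init__(self, value):
        Record.calls.append("__init__")
        self.value = value            # initialize the instance __new__ returned


a = Record(1)
b = Record(1)

print(Record.calls)   # ['__new__', '__init__', '__new__', '__init__']
print(a is b)         # False: each call to Record() produced a distinct object
```

Every call to `Record(...)` goes through both steps, so two calls can never hand back the same object unless `__new__` is deliberately written to do so.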

Your problem lies elsewhere in your code, unfortunately.
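Without the original code this is only a guess, but a common way to end up with a list of identical references in a coroutine is to create the temporary structure once, before the loop, and then reuse it. The sketch below uses hypothetical names (`collect_shared`, `collect_fresh`, `feed`) and a plain dict as a stand-in for the real data structure.

```python
def collect_shared(out):
    """Buggy variant: one dict is created before the loop and reused."""
    record = {}
    while True:
        record["value"] = yield
        out.append(record)        # appends the same object every time


def collect_fresh(out):
    """Fixed variant: a new dict is created on every iteration."""
    while True:
        record = {}               # fresh object per item
        record["value"] = yield
        out.append(record)


def feed(coroutine_func, values):
    """Drive a coroutine with the given values and return what it collected."""
    out = []
    co = coroutine_func(out)
    next(co)                      # prime the coroutine up to its first yield
    for v in values:
        co.send(v)
    co.close()
    return out


shared = feed(collect_shared, [1, 2, 3])
fresh = feed(collect_fresh, [1, 2, 3])

print(shared[0] is shared[1])          # True: every entry is the same dict
print([r["value"] for r in shared])    # [3, 3, 3]
print(fresh[0] is fresh[1])            # False
print([r["value"] for r in fresh])     # [1, 2, 3]
```

If the real code looks like the first variant, moving the construction of the temporary object inside the loop is the Python counterpart of calling `new` on each iteration in C++.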

By the way, if you want to check whether two variables reference the same object, you can compare the values returned by the `id()` function, which are unique among objects that are alive at the same time. The `is` operator compares object identity directly, in contrast to the `==` operator, which uses the objects' `__eq__` method to compare their values.
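A short sketch of the difference between identity and equality. The `Point` class is a hypothetical example with a hand-written `__eq__`.

```python
class Point:
    """Hypothetical value class with a custom __eq__."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)   # equal contents, but a separate object
r = p             # a second name for the same object

print(p == q, p is q)    # True False
print(p == r, p is r)    # True True
print(id(p) == id(r))    # True
print(id(p) == id(q))    # False: both objects are alive, so their ids differ
```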

Community
2rs2ts
  • Just to clarify: `__new__` has already been called... it's not "called as well" – Jon Clements Aug 05 '13 at 18:49
  • @JonClements Maybe I'm misunderstanding, but when you use a constructor, `__new__` gets called and then `__init__` gets called, right? That's what I meant. – 2rs2ts Aug 05 '13 at 19:41
  • `SomeObject()`'s `__new__` gets called to return the type (which needn't be the type actually called), whose `__init__` constructs the instance... It's one of those difficult to describe ones ;) I was just saying that `__new__` isn't a side effect of `__init__`... – Jon Clements Aug 05 '13 at 19:44
  • I think this summarises it nicely: http://www.python.org/download/releases/2.2/descrintro/#__new__ – Jon Clements Aug 05 '13 at 19:50
  • @JonClements Thanks for the link. I just worded it poorly. I changed my language to be a little more indicative of reality. – 2rs2ts Aug 05 '13 at 22:17
  • Wasn't sure how deeply they go into the object model (I noticed 4th year computing studies or something), so thought it might be handy for personal reference (I'm sure there's later articles/blogs about it), but anyway... – Jon Clements Aug 05 '13 at 22:21
> My problem is that this append happens by reference and I can't figure out how to initialize newdata to new memoryspace.

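A Python list stores references to objects, so appending never copies anything. If you want the list to hold an independent snapshot of an object (that is, append "by value"), use `copy.copy` (shallow) or `copy.deepcopy` (recursive) to make sure what is being appended is a copy.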

>>> # The problem
>>> class ComplexObject:
...     def __init__(self, herp, derp):
...         self.herp = herp
...         self.derp = derp
...
>>> obj = ComplexObject(1, 2)
>>> items = []  # named "items" to avoid shadowing the built-in list type
>>> items.append(obj)
>>> obj.derp = 5
>>> items[0].derp
5
>>> # obj and items[0] are the same thing in memory
>>> obj
<__main__.ComplexObject instance at 0x0000000002243D48>
>>> items[0]
<__main__.ComplexObject instance at 0x0000000002243D48>
>>> # The solution
>>> from copy import deepcopy
>>> items = []
>>> obj = ComplexObject(1, 2)
>>> items.append(deepcopy(obj))  # store an independent copy
>>> obj.derp = 5
>>> items[0].derp
2
>>> obj
<__main__.ComplexObject instance at 0x0000000002243D48>
>>> items[0]
<__main__.ComplexObject instance at 0x000000000224ED88>

This is my attempt at actually solving your problem from your description without seeing any code. If you're more interested in allocation/constructors in Python, refer to another answer.
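The choice between `copy.copy` and `copy.deepcopy` matters once the object holds other mutable objects. A small sketch, using hypothetical `Inner` and `Outer` classes as stand-ins for a data structure that contains another class:

```python
import copy


class Inner:
    def __init__(self, values):
        self.values = values


class Outer:
    def __init__(self, name, inner):
        self.name = name
        self.inner = inner


original = Outer("a", Inner([1, 2]))
shallow = copy.copy(original)       # new Outer, but it shares the same Inner
deep = copy.deepcopy(original)      # new Outer with its own Inner and list

original.inner.values.append(3)     # mutate the nested data after copying

print(shallow is original)              # False
print(shallow.inner is original.inner)  # True
print(shallow.inner.values)             # [1, 2, 3]
print(deep.inner.values)                # [1, 2]
```

Only the deep copy is unaffected by later changes to the nested data, which is usually what you want when storing snapshots in a list.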

Jimmy Zelinskie
  • On the contrary, I believe this actually demonstrates a solution to the problem in his code rather than simply answering the title question of the post. – Jimmy Zelinskie Aug 05 '13 at 19:03
  • I think this could be a viable solution. I tried to do this earlier, problem is my data structure is another class. I think I need to implement my own copy functionality within that data structure, but I don't know how to do that either. – kjgregory Aug 05 '13 at 19:17
  • To my understanding, most Pythonistas will tell you to just use `copy.deepcopy` instead of writing a custom copy-constructor like you would in C++ or Java. You could always add a method to the class called `copy(self)` that just does `return copy.deepcopy(self)` – Jimmy Zelinskie Aug 05 '13 at 19:31
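The `copy` method suggested in the last comment could look roughly like this. The `Sample` class and its fields are assumptions made for illustration; the asker's actual data structure was never posted.

```python
import copy


class Sample:
    """Hypothetical data structure; the field names are assumptions."""
    def __init__(self, label, readings):
        self.label = label
        self.readings = readings

    def copy(self):
        # Delegate to deepcopy so nested mutable fields are duplicated too.
        return copy.deepcopy(self)


current = Sample("ch0", [0.1, 0.2])
results = []
results.append(current.copy())   # store a snapshot, not a reference
current.readings.append(0.3)     # keep mutating the working object

print(results[0] is current)     # False
print(results[0].readings)       # [0.1, 0.2]
```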