Is there duplication of data among Qt model items?

Question

There is a standard way of representing data for use in QAbstractItemModels, in which there is a separate data item class that stores data about an item, and its position in the model.

The simpletreemodel example (in Python) illustrates a pretty standard way of doing this, where the data is stored in a TreeItem that is initialized with:

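class TreeItem(object):
    def __init__(self, data, parent=None):
        self.parentItem = parent   # reference to the parent TreeItem (None for the root)
        self.itemData = data       # this item's own data
        self.childItems = []       # references to the child TreeItems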

Note that each TreeItem instance contains its relevant item-specific data (self.itemData), but also has a list of its child items (which are TreeItems themselves), as well as its parent item (also a TreeItem) as attributes.

Why is this not a really inefficient use of memory? Let's focus on item B in the following tree diagram:

A
|
|--B
|  |--C
|  |
|  |--D
|
|--E

Naively, it seems B's data will be stored four times once the full data tree is constructed. It will be returned by A.childItems[0], C.parentItem, D.parentItem, and of course B.itemData.

Is there some Python magic under the hood that makes this not be an inefficient use of memory?

Note: even though these types of data containers are used a lot when QAbstract*Model is subclassed, these containers themselves actually contain no Qt (no Qt classes are being subclassed in the code for TreeItem). So this is really an elementary Python question, not a Qt question.

Possibly Relevant Post

How do I determine the size of an object in Python?

I don't know about Python, but in C++ `parentItem` and `childItems` will be pointers to the relevant items. They do not contain any user data, so there is no 'data duplication'. – hank, Dec 23 '14 at 04:40

score 3 · Accepted Answer · edited May 23 '17 at 10:33


In Python, variable names are references to objects, so all these different expressions actually refer to the same object, i.e. the same data in memory. There is no duplication of objects, although there is duplication of references. So in your example, A.childItems[0], C.parentItem, and D.parentItem are all references to the same object, B. Hence A.childItems[0].itemData, C.parentItem.itemData, and D.parentItem.itemData all refer to the same B.itemData object.
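A minimal sketch of that claim, reusing the TreeItem class from the question. The appendChild helper and the sample data lists are assumptions added for illustration. The `is` operator tests object identity, so it can show that the different expressions all reach one and the same object:

```python
class TreeItem(object):
    def __init__(self, data, parent=None):
        self.parentItem = parent
        self.itemData = data
        self.childItems = []

    def appendChild(self, item):
        # Assumed helper: stores a reference to item, not a copy of it.
        self.childItems.append(item)


# Build the tree from the question: A -> (B -> (C, D), E)
A = TreeItem(["A data"])
B = TreeItem(["B data"], A)
C = TreeItem(["C data"], B)
D = TreeItem(["D data"], B)
E = TreeItem(["E data"], A)
A.appendChild(B)
A.appendChild(E)
B.appendChild(C)
B.appendChild(D)

# Every path to B yields the same object, so B's data exists once in memory.
same_item = A.childItems[0] is B and C.parentItem is B and D.parentItem is B
same_data = A.childItems[0].itemData is B.itemData
print(same_item, same_data)  # True True
```

What does get stored several times is the reference itself, which is roughly pointer-sized in CPython regardless of how large B's data is.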

Side note related to the LEGB article mentioned in neuronet's comment: if any of A, B, C or D modifies the object that itemData references (by calling a method or setting a property or data member on that object), then all of the others will see the change. However, if any of A.childItems[0], C.parentItem, or D.parentItem is re-assigned to a different object, only that one reference changes: A.childItems[0] = something only affects the first element in A.childItems; C.parentItem and D.parentItem, and any other references to B, still refer to B.
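The difference between mutating a shared object and re-assigning a reference can be sketched as follows. This again uses the TreeItem class from the question; the sample data and the name `replacement` are made up for illustration:

```python
class TreeItem(object):
    def __init__(self, data, parent=None):
        self.parentItem = parent
        self.itemData = data
        self.childItems = []


A = TreeItem(["A data"])
B = TreeItem(["B data"], A)
C = TreeItem(["C data"], B)
D = TreeItem(["D data"], B)
A.childItems.append(B)
B.childItems.extend([C, D])

# Mutation: modify the list object that B.itemData refers to, reached via C.
C.parentItem.itemData.append("extra")
seen_via_A = A.childItems[0].itemData  # ['B data', 'extra']
seen_via_D = D.parentItem.itemData     # ['B data', 'extra']

# Re-assignment: point one reference at a different object.
replacement = TreeItem(["replacement"], A)
A.childItems[0] = replacement

# Only A.childItems[0] changed; C and D still refer to the original B.
still_B = C.parentItem is B and D.parentItem is B  # True
```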

You might find the answers on "Python variable reference assignment" useful, but I don't find them all that clear. I prefer "Is Python pass-by-reference or pass-by-value"; see what you think.

answered Dec 23 '14 at 04:48 by Oliver · edited May 23 '17 at 10:33 by Community

What you are saying makes intuitive sense, but then I remember reading things like "Anybody using the terms variable, reference or call-by-value is most likely explaining Python the wrong way" (http://learnpython.pbworks.com/w/page/15956522/Assignment) and then I start to get confused again. Is there a standard resource to consult to walk through the ins and outs of this? E.g., is the following good? http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/tutorials/scope_resolution_legb_rule.ipynb – eric Dec 23 '14 at 05:34

@neuronet Given that the Python documentation talks about reference counts for objects, I think it's fair to ignore the semantics the pbworks article is trying to enforce. – three_pineapples Dec 23 '14 at 08:10

@neuronet I read the two articles in your comment and the first one is too confusing, not worth reading; the second is a very good article but the referencing aspect is, although related, implicit rather than explicit. I have extended my answer and provided what I think are better "references" to articles ;) – Oliver Dec 23 '14 at 12:15
@Scholli very helpful stuff! That first article is funny and really good. Gets to the essence. I now appreciate that my question really is a basic Python question, as there is actually no Qt in there (the data class does not inherit from any of Qt's classes). – eric Dec 23 '14 at 14:29
@three_pineapples that is a really good point... I should start using 'reference' again in Python :) – eric Dec 29 '14 at 02:45

kartikg3 · Answer 2 · score 1 · 2014-12-23T11:51:52.400

As @schollii and @hank pointed out, PyQt and PySide are Python wrappers around the C++-based Qt. So if Qt uses objects/pointers for something (like the parent and children in the tree you described), then the corresponding Python variables also store references or pointer objects. Therefore there is no duplication of those objects in memory, just duplicate references or pointers to the same thing in memory.

These links are great reads on this subject:

https://www.commandprompt.com/community/pyqt/c2341

http://www.swig.org/Doc1.3/Python.html#Python_nn18

answered Dec 23 '14 at 11:34 by kartikg3 · edited Dec 23 '14 at 11:51

Those are really great references that I didn't appreciate before, thanks. As I mentioned previously, I now better appreciate that my question really comes down to basic Python, as the data container classes (like TreeItem) actually contain no Qt at all, even though such containers are fairly ubiquitous when QAbstract*Model is subclassed. – eric Dec 23 '14 at 14:31

But even if the underlying C++ data were not pointers but whole objects, the Python layer would still be using reference semantics for variable names, so you can't use the underlying C++ implementation as evidence of one or the other. – Oliver Dec 23 '14 at 14:48
@schollii that is a great piece of info. Thank you. – kartikg3 Dec 23 '14 at 14:56
