I've recently hit a wall in a project I'm working on which uses PyQt. I have a QTreeView hooked up to a QAbstractItemModel which typically has thousands of nodes in it. So far, it works alright, but I realized today that selecting a lot of nodes is very slow. After some digging, it turns out that QAbstractItemModel.parent() is called way too often. I created minimal code to reproduce the problem:

#!/usr/bin/env python
import sys
import cProfile
import pstats

from PyQt4.QtCore import Qt, QAbstractItemModel, QVariant, QModelIndex
from PyQt4.QtGui import QApplication, QTreeView

# 200 root nodes with 10 subnodes each

class TreeNode(object):
    def __init__(self, parent, row, text):
        self.parent = parent
        self.row = row
        self.text = text
        if parent is None: # root node, create subnodes
            self.children = [TreeNode(self, i, unicode(i)) for i in range(10)]
        else:
            self.children = []

class TreeModel(QAbstractItemModel):
    def __init__(self):
        QAbstractItemModel.__init__(self)
        self.nodes = [TreeNode(None, i, unicode(i)) for i in range(200)]

    def index(self, row, column, parent):
        if not self.nodes:
            return QModelIndex()
        if not parent.isValid():
            return self.createIndex(row, column, self.nodes[row])
        node = parent.internalPointer()
        return self.createIndex(row, column, node.children[row])

    def parent(self, index):
        if not index.isValid():
            return QModelIndex()
        node = index.internalPointer()
        if node.parent is None:
            return QModelIndex()
        else:
            return self.createIndex(node.parent.row, 0, node.parent)

    def columnCount(self, parent):
        return 1

    def rowCount(self, parent):
        if not parent.isValid():
            return len(self.nodes)
        node = parent.internalPointer()
        return len(node.children)

    def data(self, index, role):
        if not index.isValid():
            return QVariant()
        node = index.internalPointer()
        if role == Qt.DisplayRole:
            return QVariant(node.text)
        return QVariant()


app = QApplication(sys.argv)
treemodel = TreeModel()
treeview = QTreeView()
treeview.setSelectionMode(QTreeView.ExtendedSelection)
treeview.setSelectionBehavior(QTreeView.SelectRows)
treeview.setModel(treemodel)
treeview.expandAll()
treeview.show()
cProfile.run('app.exec_()', 'profdata')
p = pstats.Stats('profdata')
p.sort_stats('time').print_stats()

To reproduce the problem, just run the code (which does profiling) and select all nodes in the tree widget (either through shift selection or Cmd-A). When you quit the app, the profiling stats will show something like:

Fri May  8 20:04:26 2009    profdata

         628377 function calls in 6.210 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    4.788    4.788    6.210    6.210 {built-in method exec_}
   136585    0.861    0.000    1.182    0.000 /Users/hsoft/Desktop/slow_selection.py:34(parent)
   142123    0.217    0.000    0.217    0.000 {built-in method createIndex}
    17519    0.148    0.000    0.164    0.000 /Users/hsoft/Desktop/slow_selection.py:52(data)
   162198    0.094    0.000    0.094    0.000 {built-in method isValid}
     8000    0.055    0.000    0.076    0.000 /Users/hsoft/Desktop/slow_selection.py:26(index)
   161357    0.047    0.000    0.047    0.000 {built-in method internalPointer}
       94    0.000    0.000    0.000    0.000 /Users/hsoft/Desktop/slow_selection.py:46(rowCount)
      404    0.000    0.000    0.000    0.000 /Users/hsoft/Desktop/slow_selection.py:43(columnCount)
       94    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    6.210    6.210 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The weird part in this data is how often parent() is called: 136k times for roughly 2k nodes (200 root nodes plus 2,000 subnodes)! Does anyone have a clue why?
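One way to dig further might be to tally the parent() calls per node, to see whether a few nodes are queried over and over or whether every node is hit about equally often. The following is only a minimal sketch in plain Python 3: `count_calls` and `FakeModel` are hypothetical names invented for illustration, and the nodes are plain integers standing in for the real model indexes.

```python
import functools
from collections import Counter


def count_calls(key):
    """Wrap a method so each call is tallied under key(*args) in `.calls`."""
    def decorator(func):
        calls = Counter()

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Record the call before delegating to the real method.
            calls[key(*args, **kwargs)] += 1
            return func(*args, **kwargs)

        wrapper.calls = calls
        return wrapper
    return decorator


class FakeModel(object):
    """Hypothetical stand-in for TreeModel; nodes are plain ints here."""

    # With the real model, the key would have to be derived from the
    # QModelIndex argument, e.g. from index.internalPointer() (assumption).
    @count_calls(key=lambda self, node: node)
    def parent(self, node):
        return node // 10


model = FakeModel()
for node in [3, 3, 3, 7]:
    model.parent(node)

tally = FakeModel.parent.calls
print(tally.most_common(1))  # the most frequently queried node and its count
```

If the tally on the real model turned out roughly flat, that would mean about 136,585 / 2,200 ≈ 62 calls per node, which would point at the view walking the whole selection repeatedly rather than at a few hot nodes.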

Virgil Dupras

2 Answers

Try calling setUniformRowHeights(True) on your tree view:

https://doc.qt.io/qt-4.8/qtreeview.html#uniformRowHeights-prop

Also, there's a C++ tool called ModelTest from Qt Labs. I'm not sure whether there is an equivalent for Python, though:

https://wiki.qt.io/Model_Test

ekhumoro
Mark Beckwith
  • Thanks for the hint, but unfortunately, it didn't help. It did reduce the number of parent calls, but only to 134k calls. As for Modeltest, it seems interesting, but I don't know how to import 3rd party C++ components in PyQt (I'll have to google it up). But in any case, it seems to me that this model is correct, isn't it? – Virgil Dupras May 10 '09 at 14:25

I converted your very nice example code to PyQt5, ran it under Qt 5.2, and can confirm that the numbers are still similar, i.e. inexplicably huge numbers of calls. Here, for example, is the top part of the report for: start, Cmd-A to select all, scroll one page, quit.

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   14.880   14.880   15.669   15.669 {built-in method exec_}
   196712    0.542    0.000    0.703    0.000 /Users/dcortes1/Desktop/scratch/treeview.py:36(parent)
   185296    0.104    0.000    0.104    0.000 {built-in method createIndex}
    20910    0.050    0.000    0.056    0.000 /Users/dcortes1/Desktop/scratch/treeview.py:54(data)
   225252    0.036    0.000    0.036    0.000 {built-in method isValid}
   224110    0.034    0.000    0.034    0.000 {built-in method internalPointer}
     7110    0.020    0.000    0.027    0.000 /Users/dcortes1/Desktop/scratch/treeview.py:28(index)

While the counts are really excessive (and I have no explanation), notice that the cumtime values aren't so big. Those functions could also be recoded to run faster; for example, in index(), is "if not self.nodes" ever true? Similarly, the counts for parent() and createIndex() are almost the same, so in parent() the index is almost always valid and its node almost always has a parent (reasonable, as end nodes are much more numerous than parent nodes). Recoding to handle that case first would cut the parent() cumtime further. Edit: on second thought, such optimizations are "rearranging the deck chairs on the Titanic".
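To compare cumtime values without wading through the whole report, pstats can sort by cumulative time and restrict the output to matching functions. This is just a sketch in Python 3 using only the standard library; the `parent` and `select_all` functions below are hypothetical stand-ins for the model code, not the real TreeModel.

```python
import cProfile
import io
import pstats


def parent(node):
    # Hypothetical stand-in for TreeModel.parent().
    return node // 10


def select_all(count):
    # Hypothetical workload: one parent() lookup per node.
    return [parent(n) for n in range(count)]


profiler = cProfile.Profile()
profiler.enable()
select_all(2000)
profiler.disable()

# Send the report to a string buffer instead of stdout.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)

# A string restriction is a regular expression matched against the
# "filename:lineno(function)" column, so this keeps only parent().
stats.sort_stats('cumulative').print_stats(r'\(parent\)')

report = buf.getvalue()
print(report)
```

With the script from the question, the same restriction could presumably be applied to the existing `p` object, e.g. `p.sort_stats('cumulative').print_stats(r'\(parent\)')`.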
user405