The pandas documentation for df.items() says:

Iterate over (column name, Series) pairs.

The exact same definition can be found for df.iteritems() as well. Both seem to be doing the same thing.

However, I was curious whether there is any difference between these two, as there is between dict.items() and dict.iteritems() according to this SO question. Apparently, in Python 2, dict.items() creates a real list of tuples, potentially taking a lot of memory, while dict.iteritems() returns a lazy iterator.

Is this the case with df.items() and df.iteritems()? Is df.iteritems() faster for DataFrames with a large number of columns?

Achintha Ihalage

2 Answers

They are exactly the same; there is no difference. You can see the source code of both here. The iteritems method is really this (except for the type hints and doc decorator):

def iteritems(self):
    yield from self.items()
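
If you want to check this yourself, here is a minimal sketch. The toy DataFrame and the variable names are made up for illustration. It assumes pandas is installed, and it guards the iteritems call with hasattr because, as far as I know, iteritems was deprecated in pandas 1.5 and removed in 2.0, so it may not exist in your version.

```python
import inspect

import pandas as pd

# Toy DataFrame, purely illustrative.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Calling items() returns a generator object; no list of tuples is built.
items_iter = df.items()
items_is_generator = inspect.isgenerator(items_iter)

# Consuming the generator yields (column name, Series) pairs in column order.
column_names = [name for name, _ in items_iter]

# iteritems may be missing in newer pandas, so only call it if it exists.
if hasattr(df, "iteritems"):
    iteritems_is_generator = inspect.isgenerator(df.iteritems())
else:
    iteritems_is_generator = None

print(items_is_generator, column_names, iteritems_is_generator)
```

Both calls give you a generator where iteritems is available, so there should be no memory or speed advantage in picking one over the other.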
Mustafa Aydın

The list-versus-generator difference applies to dict.items() and dict.iteritems() in Python 2, not to pandas. Both df.items() and df.iteritems() return a generator that yields (column name, Series) pairs one at a time, so neither method stores a full list of tuples in memory and neither is more efficient than the other for large DataFrames.

Understanding generators in Python

Tzane
bhakti