8

Apparently deleting entries in a dictionary doesn't trigger any resizes.

This can be seen from the following:

# Drastic example, nobody does such 
# things with dicts FWIK
from sys import getsizeof

d = {i:i for i in range(100)}
print(getsizeof(d))  # 4704
for i in range(100):
    del d[i]  # similarly with pop
print(getsizeof(d))  # 4704

as well as from a question on SO (from what I've found). sets behave in a similar fashion, which is to be expected to fall in line with what dicts do.

lists, on the other hand, resize when the the new size becomes half of that already allocated; this is stated in a list_resize comment:

/* Bypass realloc() when a previous overallocation is large enough
   to accommodate the newsize.  If the newsize falls lower than half
   the allocated size, then proceed with the realloc() to shrink the list.
*/

Why is it that dictionaries (and, indirectly, sets) don't employ a similar trick and instead wait for a new entry to be inserted? The behaviour described applies for Python 2.7 and 3.x.

wim
  • 338,267
  • 99
  • 616
  • 750
Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
  • `d.clear()` does resize, though – Jean-François Fabre Aug 02 '17 at 20:24
  • Yeah... the answer to the "why" would involve a bit of speculation - obviously the authors felt the benefits would not justify the complexity required to implement it and the performance impacts as a result of it. – cs95 Aug 02 '17 at 20:27
  • This is because of Amortisation. Changing the size only when there is a deletion of upto half elements and adding twice memory when previous memory allocation becomes full. It helps in reducing the overall cost of operation in allocating memory again and again. Since in list we have consecutive memory. So if we resize list on each insertion it will become too costly operation so if present allocated memory become full size of existing list is just doubled. Same stratergy is used in deletion. – Rajan Chauhan Aug 02 '17 at 20:44
  • 4
    @RajanChauhan: I think you've missed something. Both lists and dicts use a resize strategy that amortizes the cost of resizes, but dicts *don't* resize on deletion. The question is about why dicts don't resize on deletion, not about the amortization strategy. – user2357112 Aug 02 '17 at 20:50

1 Answers1

12

This is somewhat explained in Objects/dictnotes.txt, a companion file containing various notes on the dict implementation:

Dictionary operations involving only a single key can be O(1) unless resizing is possible. By checking for a resize only when the dictionary can grow (and may require resizing), other operations remain O(1), and the odds of resize thrashing or memory fragmentation are reduced. In particular, an algorithm that empties a dictionary by repeatedly invoking .pop will see no resizing, which might not be necessary at all because the dictionary is eventually discarded entirely.

One important consideration is that shrinking a list's buffer is really easy, while shrinking a dict's internal hash table is a much more complex operation.

user2357112
  • 260,549
  • 28
  • 431
  • 505
  • To add on to this, it seems evident that the authors hedged their bets on the fact that resizing on deletion would not have any tangible benefits unless a large amount of items were deleted in which case, it was assumed the dictionary would eventually be discarded. – cs95 Aug 02 '17 at 20:24
  • I glossed over the dictnotes and completely missed the *resize thrashing or memory fragmentation*, I need sleep. Thanks. As an aside: any idea what resize thrashing is exactly? – Dimitris Fasarakis Hilliard Aug 02 '17 at 20:29
  • 2
    @JimFasarakisHilliard: Resizing up and down over and over. Also an issue for lists, but they seem to have decided that the tradeoff went one way for dicts and the other way for lists, possibly due to the comparative expense of a dict resize or the ways they thought dicts and lists were used in practice. – user2357112 Aug 02 '17 at 20:31
  • Personally, I've always wondered how much practical relevance the empty-by-pop scenario actually has. – user2357112 Aug 02 '17 at 20:32
  • @user2357112 wouldn't it be about the same as empty a list by pop scenario? But it's unlikely that dict sizes would reach the same size as lists in the wild. – cowbert Aug 02 '17 at 20:38
  • Amortization is the word for it. – Rajan Chauhan Aug 02 '17 at 20:45