from typing import NamedTuple

class EmbeddingInfoStruct(NamedTuple):
    emb_names: list[str] = []
    idx_in_data: list[int] = []
    emb_dim: list[int] = []

info1 = EmbeddingInfoStruct()
info1.emb_names.append("name1")

info2 = EmbeddingInfoStruct()

print("info1 address = ", id(info1), ", info2 address = ", id(info2))
print(info1)
print(info2)

Output:

info1 address =  2547212397920 , info2 address =  2547211152576
EmbeddingInfoStruct(emb_names=['name1'], idx_in_data=[], emb_dim=[])
EmbeddingInfoStruct(emb_names=['name1'], idx_in_data=[], emb_dim=[])

Surprisingly, info1 and info2 share the same value. I'd expect info2.emb_names to be empty. Why does NamedTuple behave like a "static class"?

imachabeli
  • Does this answer your question? ["Least Astonishment" and the Mutable Default Argument](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument) – drum Nov 17 '22 at 16:07
  • This is a well-known gotcha in Python. – Jared Smith Nov 17 '22 at 16:08
  • @RandomDavis I wouldn't say it's unrelated. The problem is the same in both cases: the thing you think gets created on every call is actually just a reference to the same mutable value. – Jared Smith Nov 17 '22 at 16:09
  • Specifically, the metaclass used by `NamedTuple` (which is a *function*, despite it being used like a class) uses the values assigned to what otherwise look like class attributes in the same way a default argument value is used. – chepner Nov 17 '22 at 16:22
  • I would strive to use named tuples for data that really is immutable, even if the immutable fields of a tuple are allowed to be mutable values like lists. If something is supposed to be mutable, make it *look* mutable. – chepner Nov 17 '22 at 16:24
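
The point made in the comments can be checked directly. The following is a minimal sketch, assuming Python 3.9+ and a trimmed-down version of the class from the question (one field only): the default list is evaluated once, when the class body runs, and every instance created without arguments receives that same object.

```python
from typing import NamedTuple


class EmbeddingInfoStruct(NamedTuple):
    emb_names: list[str] = []  # evaluated once, when the class body runs


info1 = EmbeddingInfoStruct()
info2 = EmbeddingInfoStruct()

# Both instances and the default stored on the class refer to one list object.
shared_default = EmbeddingInfoStruct._field_defaults["emb_names"]
same_object = info1.emb_names is info2.emb_names is shared_default

info1.emb_names.append("name1")
print(same_object, info2.emb_names)  # True ['name1']
```

The two instances themselves are distinct objects (hence the different `id()` values in the question); only the list they point to is shared.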

2 Answers


I think you confused NamedTuple from the typing module, the typed variant of a named tuple, with namedtuple() from the collections module (see the collections documentation).

Here, you are actually mutating a default value that is stored on your EmbeddingInfoStruct class and shared by all of its instances, hence the "static class" behavior.


Using namedtuple(), your class declaration would instead look like this:

from collections import namedtuple

# Note: the default lists are created once, when namedtuple() is called.
EmbeddingInfoStruct = namedtuple("EmbeddingInfoStruct", ["emb_names", "idx_in_data", "emb_dim"], defaults=[list(), list(), list()])

info1 = EmbeddingInfoStruct()

You will, however, still fall into the mutable-default pitfall: the three lists passed as defaults are created only once and are shared by every instance that relies on them.
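
One way around that pitfall while keeping a named tuple is to build the lists at call time. This is only a sketch: `make_embedding_info` is a hypothetical helper name, not part of `collections`, and it uses the usual `None` sentinel in place of a mutable default.

```python
from collections import namedtuple

EmbeddingInfoStruct = namedtuple(
    "EmbeddingInfoStruct", ["emb_names", "idx_in_data", "emb_dim"]
)


def make_embedding_info(emb_names=None, idx_in_data=None, emb_dim=None):
    """Hypothetical helper: create fresh lists on every call."""
    return EmbeddingInfoStruct(
        emb_names=[] if emb_names is None else emb_names,
        idx_in_data=[] if idx_in_data is None else idx_in_data,
        emb_dim=[] if emb_dim is None else emb_dim,
    )


info1 = make_embedding_info()
info1.emb_names.append("name1")
info2 = make_embedding_info()

print(info1.emb_names, info2.emb_names)  # ['name1'] []
```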

J Faucher

As others have said, the problem is the mutable default. You could use a dataclass instead, with each field declared through field(default_factory=list) so that every instance gets its own list. See:

https://docs.python.org/3/library/dataclasses.html#dataclasses.field
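
A minimal sketch of that suggestion, assuming Python 3.9+ and reusing the field names from the question:

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddingInfoStruct:
    # default_factory is called once per instance, so no list is shared.
    emb_names: list[str] = field(default_factory=list)
    idx_in_data: list[int] = field(default_factory=list)
    emb_dim: list[int] = field(default_factory=list)


info1 = EmbeddingInfoStruct()
info1.emb_names.append("name1")
info2 = EmbeddingInfoStruct()

print(info1)  # EmbeddingInfoStruct(emb_names=['name1'], idx_in_data=[], emb_dim=[])
print(info2)  # EmbeddingInfoStruct(emb_names=[], idx_in_data=[], emb_dim=[])
```

Unlike NamedTuple, a dataclass refuses a plain `= []` default and raises a ValueError at class creation, which makes this mistake harder to make.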

Carlos Horn