Does PyTorch Dataset.__getitem__ have to return a dict?

Question

EDIT: This is not about the general __getitem__ method but about how __getitem__ is used in a PyTorch Dataset subclass, as @dataista correctly states.

I'm trying to use PyTorch's Dataset class. The guide e.g. here is really good, but I struggle to figure out PyTorch's requirements for the return value of __getitem__. In the PyTorch documentation I cannot find anything about what it should return: is it any iterable of size 2, e.g. [sample, target] or (sample, target)? Some guides return a dict, but they do not say whether it has to be a dict.

The return value can be anything (not necessarily dict or tuple). — akshayk07, May 06 '21 at 10:46
You can return whatever you want. In most cases you just return the data and the targets, for example like this: `return images, targets` — Theodor Peifer, May 06 '21 at 11:56
@CutePoison `__getitem__` in PyTorch DataSets _is_ the general `__getitem__` - there is nothing special about it. — iacob, May 06 '21 at 13:37
This is not a duplicate question, since Pytorch's dataset `__getitem__` has some particularities. The duplicate flag should be removed. — dataista, Oct 12 '21 at 16:46
This is indeed a question that should be answered on its own. The pytorch framework requires one to implement the `__getitem__` method as part of the abstract `Dataset` base class. As such, it is a valid question _what exactly is the contract of `__getitem__` within the pytorch framework_? The linked answer doesn't address that at all. It is just a coincidence that pytorch doesn't seem to have any / much requirements on the return type. — bluenote10, Dec 28 '22 at 15:54

score 2 · Answer 1 · answered May 06 '21 at 13:46

PyTorch has no requirements on the return value of a Dataset's __getitem__ method. It can be anything, but you will commonly encounter a tensor, a tuple of tensors, a dictionary (e.g. {'features':..., 'label':...}), etc.
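As a minimal sketch of the point above, here are two datasets that return the same sample once as a tuple and once as a dict. The class names, shapes, and toy values are made up for illustration; the only library pieces used are `torch.utils.data.Dataset` and `DataLoader` with its default batching.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TupleDataset(Dataset):
    """Hypothetical dataset whose __getitem__ returns (features, label)."""

    def __init__(self, n=8):
        # Toy data: n samples with 3 features each, plus a 0/1 label.
        self.x = torch.arange(n * 3, dtype=torch.float32).reshape(n, 3)
        self.y = torch.arange(n) % 2

    def __len__(self):
        return self.x.shape[0]

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


class DictDataset(TupleDataset):
    """Same data, but __getitem__ returns a dict instead of a tuple."""

    def __getitem__(self, idx):
        return {"features": self.x[idx], "label": self.y[idx]}


# The default DataLoader batching handles both return styles.
tuple_batch = next(iter(DataLoader(TupleDataset(), batch_size=4)))
dict_batch = next(iter(DataLoader(DictDataset(), batch_size=4)))

features, labels = tuple_batch
print(features.shape, labels.shape)
print(dict_batch["features"].shape, dict_batch["label"].shape)
```

Whichever structure you pick, the batch coming out of the `DataLoader` mirrors it, so the choice mostly affects how readable your training loop is.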

With 2D (tabular) data it is common to return a single tensor whose final column holds the target values, but equally you may see tuples/dicts with the features and targets explicitly separated.
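A rough sketch of the single-tensor convention, assuming a toy table with three feature columns and one trailing target column (the `RowDataset` name and the values are hypothetical): each item is one row, and the split into features and targets happens on the batch.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RowDataset(Dataset):
    """Hypothetical dataset returning one row; the last column is the target."""

    def __init__(self):
        # 6 rows: 3 feature columns followed by 1 target column.
        self.table = torch.arange(24, dtype=torch.float32).reshape(6, 4)

    def __len__(self):
        return self.table.shape[0]

    def __getitem__(self, idx):
        return self.table[idx]


batch = next(iter(DataLoader(RowDataset(), batch_size=2)))

# Split the batched rows: everything but the last column vs. the last column.
features, targets = batch[:, :-1], batch[:, -1]
print(features.shape, targets.shape)
```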

Note there is no requirement that you return two values - in many unsupervised contexts (e.g. autoencoders) there is only a set of features, with no distinct target.
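For instance, a features-only dataset for an autoencoder might look like the sketch below. The dataset, the tiny two-layer model, and the dimensions are all assumptions chosen for illustration; the point is only that `__getitem__` returns a single tensor and the batch serves as both input and reconstruction target.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class FeaturesOnlyDataset(Dataset):
    """Hypothetical dataset with no target: __getitem__ returns one tensor."""

    def __init__(self, n=8, dim=4):
        self.x = torch.randn(n, dim)

    def __len__(self):
        return self.x.shape[0]

    def __getitem__(self, idx):
        return self.x[idx]


# Toy autoencoder: compress 4 features to 2 and expand back to 4.
autoencoder = nn.Sequential(nn.Linear(4, 2), nn.Linear(2, 4))

batch = next(iter(DataLoader(FeaturesOnlyDataset(), batch_size=4)))

# The input doubles as the target of the reconstruction loss.
loss = nn.functional.mse_loss(autoencoder(batch), batch)
print(batch.shape, loss.item())
```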
