EDIT: This question is not about the general `__getitem__` method but about its use in a PyTorch `Dataset` subclass, as @dataista correctly states.

I'm trying to implement a subclass of PyTorch's `Dataset` class. The guide here, for example, is really good, but I am struggling to work out what PyTorch requires of the return value of `__getitem__`. I cannot find anything in the PyTorch documentation about what it should return. Can it be any iterable of size 2, e.g. `[sample, target]` or `(sample, target)`? Some guides return a dict, but they do not say whether the return value has to be a dict.

CutePoison
  • The return value can be anything (not necessarily dict or tuple). – akshayk07 May 06 '21 at 10:46
  • I have made an edit to the question to clarify – CutePoison May 06 '21 at 10:50
  • You can return whatever you want. In most cases you just return the data and the targets, for example like this: `return images, targets` – Theodor Peifer May 06 '21 at 11:56
  • @CutePoison `__getitem__` in PyTorch DataSets _is_ the general `__getitem__` - there is nothing special about it. – iacob May 06 '21 at 13:37
  • This is not a duplicate question, since Pytorch's dataset `__getitem__` has some particularities. The duplicate flag should be removed. – dataista Oct 12 '21 at 16:46
  • This is indeed a question that should be answered on its own. The pytorch framework requires one to implement the `__getitem__` method as part of the abstract `Dataset` base class. As such, it is a valid question _what exactly is the contract of `__getitem__` within the pytorch framework_? The linked answer doesn't address that at all. It is just a coincidence that pytorch doesn't seem to have any / much requirements on the return type. – bluenote10 Dec 28 '22 at 15:54

1 Answer

PyTorch places no requirements on the return value of a `Dataset`'s `__getitem__` method. It can be anything, but you will commonly encounter a tensor, a tuple of tensors, or a dictionary (e.g. `{'features': ..., 'label': ...}`).
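As a rough illustration of the return types listed above, here is a minimal sketch with two toy datasets, one returning a tuple and one returning a dict. The class names, sizes, and random data are made up for the example. Note that the batching is done by `DataLoader`'s default collation, which handles tensors, tuples, and dicts field by field; a more exotic return type would likely need a custom `collate_fn`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TupleDataset(Dataset):
    """Hypothetical toy dataset returning a (features, label) tuple."""

    def __init__(self, n=8, dim=3):
        self.x = torch.randn(n, dim)  # random features, for illustration only
        self.y = torch.arange(n)      # dummy integer labels

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


class DictDataset(TupleDataset):
    """Same data, but each sample is returned as a dictionary."""

    def __getitem__(self, idx):
        return {"features": self.x[idx], "label": self.y[idx]}


# The default collation batches each field of the tuple or dict separately.
tuple_batch = next(iter(DataLoader(TupleDataset(), batch_size=4)))
dict_batch = next(iter(DataLoader(DictDataset(), batch_size=4)))

features, labels = tuple_batch
print(features.shape, labels.shape)
print(dict_batch["features"].shape, dict_batch["label"].shape)
```

Whichever form you choose, the training loop simply has to unpack the batch in the matching way.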

With 2D (tabular) data it is common to return a single tensor whose final column holds the target values, but you will equally see tuples or dicts in which the features and targets are explicitly separated.
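The single-tensor layout might look something like the sketch below, where `__getitem__` returns a whole row and the features and target are split only after batching. `RowDataset` and the toy table are hypothetical names and data chosen for the example.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RowDataset(Dataset):
    """Hypothetical dataset: each sample is one row, final column is the target."""

    def __init__(self, table):
        self.table = table

    def __len__(self):
        return self.table.shape[0]

    def __getitem__(self, idx):
        return self.table[idx]


# Toy table: 6 rows, 3 feature columns plus 1 target column.
table = torch.arange(24, dtype=torch.float32).reshape(6, 4)
batch = next(iter(DataLoader(RowDataset(table), batch_size=2)))

# Split features and targets after batching.
inputs, targets = batch[:, :-1], batch[:, -1]
print(inputs.shape, targets)
```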

Note that there is no requirement to return two values: in many unsupervised contexts (e.g. autoencoders) there is only a set of features, with no distinct target.
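For the features-only case, a minimal sketch could look like this. The dataset, the tiny linear "autoencoder", and all sizes are assumptions made purely for illustration; the point is only that `__getitem__` returns a single tensor and the batch itself serves as the reconstruction target.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class FeaturesOnlyDataset(Dataset):
    """Hypothetical dataset with no target: each sample is just a feature vector."""

    def __init__(self, n=10, dim=5):
        self.x = torch.randn(n, dim)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx]


# Toy autoencoder: compress 5 features to 2 and expand back to 5.
model = torch.nn.Sequential(torch.nn.Linear(5, 2), torch.nn.Linear(2, 5))
loss_fn = torch.nn.MSELoss()

losses = []
for batch in DataLoader(FeaturesOnlyDataset(), batch_size=5):
    reconstruction = model(batch)
    # The input doubles as the target, so no label is needed.
    losses.append(loss_fn(reconstruction, batch).item())
```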

iacob