When using `F.softmax()` (or `nn.Softmax`), we pass `dim=0` or `dim=1`. Intuitively, `dim=0` should mean "along a row", but it seems to operate along the columns instead. Is this true?
>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
>>> F.softmax(x, dim=0)
tensor([[0.1192, 0.1192],
        [0.8808, 0.8808]])
>>> F.softmax(x, dim=1)
tensor([[0.2689, 0.7311],
        [0.2689, 0.7311]])
Here, when `dim=0`, the probabilities along each column sum to 1; similarly, when `dim=1`, the probabilities along each row sum to 1. Can someone explain how `dim` is used in PyTorch?
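One way to see it: `dim` is the dimension whose index *varies* during normalization, i.e. the dimension that is reduced over. A minimal pure-Python sketch (no torch needed, `softmax_2d` is just an illustrative helper, not a PyTorch function) reproduces the numbers above:

```python
import math

def softmax_2d(x, dim):
    """Softmax over a 2-D list of lists along the given dim.

    dim=0: normalize down each column (the row index varies);
    dim=1: normalize across each row (the column index varies).
    """
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    if dim == 0:
        for j in range(cols):
            col = [math.exp(x[i][j]) for i in range(rows)]
            s = sum(col)
            for i in range(rows):
                out[i][j] = col[i] / s
    else:
        for i in range(rows):
            row = [math.exp(v) for v in x[i]]
            s = sum(row)
            out[i] = [v / s for v in row]
    return out

x = [[1.0, 2.0], [3.0, 4.0]]
print(softmax_2d(x, dim=0))  # each column sums to 1, matching F.softmax(x, dim=0)
print(softmax_2d(x, dim=1))  # each row sums to 1, matching F.softmax(x, dim=1)
```

So `dim=0` means "walk along dimension 0 (down the rows) while normalizing", which is why the columns end up summing to 1.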