TensorFlow calls each of the inputs to a softmax a "logit", and its documentation defines the softmax's inputs/logits as "unscaled log probabilities."
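For reference, this is roughly how the term shows up in the API (a minimal sketch using tf.nn.softmax and tf.nn.softmax_cross_entropy_with_logits; exact signatures may differ across TF versions):

```python
import tensorflow as tf

# A batch of two examples, three classes; these raw scores are what
# TensorFlow calls "logits".
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 0.5, 3.0]])

# softmax rescales each row of logits into probabilities summing to 1.
probs = tf.nn.softmax(logits)

# The loss functions take the same raw scores under the name "logits".
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])
loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
```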
Wikipedia and other sources say that a logit is the log of the odds, and the inverse of the sigmoid/logistic function. That is, if sigmoid(x) = p, then logit(p) = log( p / (1 - p) ) = x.
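A quick numeric sanity check of that identity (my own numpy sketch, nothing TensorFlow-specific):

```python
import numpy as np

def sigmoid(x):
    # Logistic function: maps the real line to (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    # Log-odds: the inverse of the logistic function on (0, 1).
    return np.log(p / (1.0 - p))

x = np.linspace(-5.0, 5.0, 11)
# logit(sigmoid(x)) recovers x, confirming logit is sigmoid's inverse.
assert np.allclose(logit(sigmoid(x)), x)
```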
Is there a mathematical or conventional reason for TensorFlow to call a softmax's inputs "logits"? Shouldn't they just be called "unscaled log probabilities"?
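To make "unscaled" concrete: softmax is invariant to adding the same constant to all of its inputs, so those inputs determine log probabilities only up to an additive shift. A quick check of that invariance (again my own numpy sketch):

```python
import numpy as np

def softmax(z):
    # Subtracting the max is the usual numerical-stability trick; it also
    # illustrates the point, since shifting z doesn't change the output.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
c = 7.3  # any constant shift

# softmax(z + c) == softmax(z): the inputs are "log probabilities"
# only up to an undetermined additive constant.
assert np.allclose(softmax(z + c), softmax(z))

# Conversely, z differs from log(softmax(z)) by a constant:
print(z - np.log(softmax(z)))  # every entry equals the log normalizer
```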
Perhaps TensorFlow just wanted to keep the same argument name across binary logistic regression (where the term "logit" genuinely applies) and multinomial/softmax regression...
This question was touched on briefly here, but no one seemed bothered by the use of the word "logit" to mean "unscaled log probability".