
Folks,

Are there any examples of a simple softmax function implementation for N values? I've seen things like "softmax-based detectors" and so forth, but I just want to see a pure, straightforward C++ softmax implementation.

Any examples you know of?

Thanks,

Pototo
  • You could *at least* have told us what you think a "softmax function" *is*. – Jesper Juhl Oct 19 '18 at 17:11
  • or google... https://codereview.stackexchange.com/questions/177973/softmax-function-implementation https://stackoverflow.com/questions/9906136/implementation-of-a-softmax-activation-function-for-neural-networks – OznOg Oct 19 '18 at 17:19
  • @OznOg - A question should be self-contained. Google should *not* be required. – Jesper Juhl Oct 19 '18 at 18:16
  • Sure it can be implemented in a number of ways. The implementation will depend heavily on how you're representing your data, which could be `vector`, `array`, some pointer array, or even some library-specific thing like TensorFlow. It would help you get a good answer if you showed how you're representing your problem, what you've already tried, and where _exactly_ you got stuck. – alter_igel Oct 19 '18 at 18:36
  • @JesperJuhl It was not a comment for you, rather for the asker who has responses to his question in SO already – OznOg Oct 19 '18 at 18:47

1 Answer


I haven't seen a library implementation of softmax, although that's not proof that it doesn't exist. It's simple enough that people just write their own when they need it.

For the record, the softmax function on u1, u2, u3 ... is just the tuple (exp(u1)/Z, exp(u2)/Z, exp(u3)/Z, ...) where the normalizing constant Z is just the sum of the exponentials, Z = exp(u1) + exp(u2) + exp(u3) + ....
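As a rough illustration, the definition above can be transcribed almost literally into C++. This is only a sketch: the function name `softmax_naive` and the choice of `std::vector<double>` as the container are assumptions, not anything the question specified, and this direct form can overflow when any input is large.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: direct transcription of exp(u_i) / Z.
// Assumes the inputs are small enough that exp() does not overflow.
std::vector<double> softmax_naive(const std::vector<double>& u) {
    std::vector<double> out(u.size());
    double z = 0.0;  // normalizing constant Z = sum of the exponentials
    for (std::size_t i = 0; i < u.size(); ++i) {
        out[i] = std::exp(u[i]);
        z += out[i];
    }
    for (double& v : out) {
        v /= z;  // each entry becomes exp(u_i) / Z
    }
    return out;
}
```

Calling `softmax_naive({1.0, 2.0, 3.0})` should give three positive values that sum to 1, with the largest weight on the last entry.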

Note that adding or subtracting a constant from each u leaves the result unchanged, since it's equivalent to multiplying above and below by the same factor. So you could make the calculation a little more numerically well-behaved by subtracting the greatest value among the u's; then the largest term exp(u) will be 1 and all the others something smaller than that.
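Here is a minimal sketch of that max-subtraction trick in C++. The name `softmax_stable` and the use of `std::vector<double>` are assumptions made for illustration; adapt them to however your data is actually stored.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: softmax with the largest input subtracted first,
// so the largest exponent is exp(0) = 1 and nothing overflows.
std::vector<double> softmax_stable(const std::vector<double>& u) {
    std::vector<double> out(u.size());
    if (u.empty()) {
        return out;  // nothing to normalize
    }
    const double m = *std::max_element(u.begin(), u.end());
    double z = 0.0;  // sum of the shifted exponentials
    for (std::size_t i = 0; i < u.size(); ++i) {
        out[i] = std::exp(u[i] - m);
        z += out[i];
    }
    for (double& v : out) {
        v /= z;  // z >= 1 because the maximum contributes exactly 1
    }
    return out;
}
```

With inputs such as `{1000.0, 1001.0, 1002.0}`, a direct `exp(u)` would overflow a `double`, while this version should return the same result as for `{1.0, 2.0, 3.0}`, since shifting every input by a constant does not change the output.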

Robert Dodier