
I'm trying to find out the exact formula used in H2O for the Mean Residual Deviance loss function for a Tweedie distribution.

Or even, in general, what would be the mean residual deviance for a Tweedie distributed dependent variable?

So far, I've found this page (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/glm.html#tweedie-models) where the deviance formula for a Tweedie distribution is given as:

[Image: Tweedie deviance in H2O documentation]

However, in the H2O source code on GitHub, at line 103 of Distribution.java (https://github.com/h2oai/h2o-3/blob/master/h2o-core/src/main/java/hex/Distribution.java#L103), the formula is specified differently (ignoring the omega, which is just the weight, and the lack of summation):

2 * w * (Math.pow(y, 2 - tweediePower) / ((1 - tweediePower) * (2 - tweediePower)) - y * exp(f * (1 - tweediePower)) / (1 - tweediePower) + exp(f * (2 - tweediePower)) / (2 - tweediePower))

which in equation form is:

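$$D = 2w\left(\frac{y^{2-p}}{(1-p)(2-p)} - \frac{y\,e^{f(1-p)}}{1-p} + \frac{e^{f(2-p)}}{2-p}\right)$$

where p stands for tweediePower.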
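For anyone who wants to experiment with the expression from the code, here is a minimal Python sketch that transcribes it term by term. The function names `tweedie_deviance` and `mean_residual_deviance` are hypothetical, `f` is taken to be the prediction on the log scale (which is what the `exp(f * ...)` terms suggest), and averaging by the total weight is an assumption, not something confirmed from the H2O source.

```python
import math

def tweedie_deviance(w, y, f, p):
    """Per-row deviance, transcribed from the Java expression quoted above.

    w: row weight, y: observed value, f: prediction on the log scale,
    p: tweediePower (assumed to satisfy 1 < p < 2).
    """
    return 2 * w * (
        math.pow(y, 2 - p) / ((1 - p) * (2 - p))
        - y * math.exp(f * (1 - p)) / (1 - p)
        + math.exp(f * (2 - p)) / (2 - p)
    )

def mean_residual_deviance(ys, fs, p, ws=None):
    """Weighted average of per-row deviances.

    Assumption: the mean is taken over the total weight; this is
    not confirmed from the H2O source.
    """
    if ws is None:
        ws = [1.0] * len(ys)
    total = sum(tweedie_deviance(w, y, f, p) for w, y, f in zip(ws, ys, fs))
    return total / sum(ws)

# A perfect prediction (exp(f) == y) gives a deviance of zero,
# up to floating-point error.
print(tweedie_deviance(1.0, 3.0, math.log(3.0), 1.5))
```

With exp(f) equal to y, the three terms cancel algebraically, so the expression behaves like a deviance in that respect: it is zero for a perfect fit and positive otherwise.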

So, is the documentation wrong or the implementation? I would appreciate any help!

Thank you!

vosirus
  • quick note, always look at the master branch: https://github.com/h2oai/h2o-3/blob/master/h2o-core/src/main/java/hex/Distribution.java if you are comparing the latest stable documentation to the source code. Please also post the exact equation you are referring to in the documentation since the link you have is to an entire section. Also please specify why you linked to the particular tweedie case and not others in the source code. – Lauren Nov 27 '18 at 00:54
  • Thank you, Lauren. I've updated the question. I'm speaking about Tweedie, because I am specifically interested in the Tweedie case, as I'm working with a model where Tweedie distribution is used and wanted to get a clearer understanding. Thank you! – vosirus Nov 28 '18 at 16:10
  • Similar discrepancies are actually present for the Gamma and Poisson deviance equations, too. The Poisson Deviance is generally defined as: 2 * w * (y * log(y / y_hat) - (y - y_hat) ). However, the code in the link above on line 96 is: -2 * w * (y * log(y_hat) - y_hat) – vosirus Nov 28 '18 at 17:18
  • I've found the correct equations on a different source page: https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glm/GLMModel.java#L381 This is for the GLM algorithm. I'm not sure, though, whether the same is used for GBM, and I'm still confused about why the equations on the source page linked in the question are different :/ – vosirus Nov 28 '18 at 18:11
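A note on the two Poisson expressions quoted in the comments above: subtracting one from the other leaves 2 * w * (y * log(y) - y), which depends only on the observed value and not on the prediction. So the two forms report different numbers but are minimized by the same y_hat. The sketch below checks this numerically; the function names are hypothetical, and both formulas are copied from the comment rather than from the H2O source.

```python
import math

def poisson_deviance_textbook(w, y, y_hat):
    # Form quoted in the comment as the general definition.
    return 2 * w * (y * math.log(y / y_hat) - (y - y_hat))

def poisson_deviance_quoted_code(w, y, y_hat):
    # Form quoted in the comment as appearing in Distribution.java.
    return -2 * w * (y * math.log(y_hat) - y_hat)

def difference(w, y, y_hat):
    # Gap between the two forms for one row.
    return (poisson_deviance_textbook(w, y, y_hat)
            - poisson_deviance_quoted_code(w, y, y_hat))

w, y = 1.0, 4.0
# The gap stays the same for every prediction tried here.
for y_hat in (1.0, 4.0, 7.5):
    print(y_hat, difference(w, y, y_hat))
```

Because the gap is constant in y_hat, dropping it does not change which prediction minimizes the loss, only the value that gets reported.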

1 Answer

Thank you for pointing this out. The backend equation located here is correct (so the implementation is correct), but the equation in the documentation appears to be incorrect. I have created this Jira ticket to update the equation in the documentation. The ticket contains the correct equation along with helpful information to derive it.

Lauren