When attempting to implement a Python function for calculating the coefficient of determination, R², I noticed I got wildly different results depending on which calculation procedure I followed.
The Wikipedia page on R² gives a seemingly clear explanation of how R² should be calculated. My NumPy interpretation of what the wiki page says is the following:
import numpy as np

def calcR2_wikipedia(y, yhat):
    # Mean value of the observed data y.
    y_mean = np.mean(y)
    # Total sum of squares.
    SS_tot = np.sum((y - y_mean)**2)
    # Residual sum of squares.
    SS_res = np.sum((y - yhat)**2)
    # Coefficient of determination.
    R2 = 1.0 - (SS_res / SS_tot)
    return R2
When I try this method with a target vector y and a vector of modeled estimates yhat, this function produces an R² value of -0.00301.
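As a quick sanity check (on made-up vectors for illustration, not my actual data), this implementation at least behaves as expected at the extremes:

import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A perfect prediction leaves no residuals, so R² = 1.
print(calcR2_wikipedia(y, y))                          # 1.0

# Always predicting the mean makes SS_res equal SS_tot, so R² = 0.
print(calcR2_wikipedia(y, np.full_like(y, y.mean())))  # 0.0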
However, the accepted answer from this Stack Overflow post discussing how to calculate R² gives the following definition:
def calcR2_stackOverflow(y, yhat):
    # Total sum of squares.
    SST = np.sum((y - np.mean(y))**2)
    # Regression (explained) sum of squares.
    SSReg = np.sum((yhat - np.mean(y))**2)
    R2 = SSReg / SST
    return R2
Using that method with the same y and yhat vectors as before, I now get an R² of 0.319.
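The disagreement is easy to reproduce. On the same toy vectors as above (again, made up for illustration), the two functions already return different numbers for identical inputs:

import numpy as np

y    = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
yhat = np.array([1.2, 1.9, 3.3, 4.2, 4.8])

print(calcR2_wikipedia(y, yhat))      # 0.978
print(calcR2_stackOverflow(y, yhat))  # 0.922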
Moreover, in the same Stack Overflow post, a lot of people seem to be in favor of calculating R² with the scipy module, like this:
from scipy import stats  # plain `import scipy` does not expose scipy.stats

slope, intercept, r_value, p_value, std_err = stats.linregress(yhat, y)
R2 = r_value**2
In my case, this produces 0.261.
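And on the toy vectors from the snippet above, this route yields yet a third value:

from scipy import stats

# y and yhat as defined in the toy example above.
slope, intercept, r_value, p_value, std_err = stats.linregress(yhat, y)
print(r_value**2)  # ~0.982, matching neither 0.978 nor 0.922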
So my question is: why do R² values computed from seemingly well-accepted sources differ so radically from each other? And what is the correct way to calculate R² between two vectors?