I need to implement Vuong's test for non-nested models. Specifically, I have logistic-regression models that I would like to compare.
I have found implementations in R and STATA online, but unfortunately I work in Python and am not familiar with those frameworks/languages. Also unfortunate is that I have not been unable to find a Python implementation of the test (if someone knows of one that would be fantastic!). I asked on Cross Validated and was redirected to here.
After scouring the original paper and a few other pages that I found (references below) I think I've managed to port the R implementation from the pscl
. Example code is below.
Is there someone who is fluent in both Python and R who could help confirm that the below code reproduces the R implementation? And, if not, help determine where I erred? In addition to helping with my immediate problem, perhaps this can help someone in the future.
References consulted:
- Original Vuong Paper. See page 318 in particular
- Paper on Misuse
- R
pscl
Implementation - Wikipedia
- Matlab User's Implementation
- Stata User's Implementation
Example code with current implementation and use for logistic regression fits of dummy data. Note that this is a stock dataset and the predictor variables were arbitrarily selected.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm
from sklearn.datasets import load_breast_cancer
def vuong_test(mod1, mod2, correction=True):
'''
mod1, mod2 - non-nested logitstic regression fit results from statsmodels
'''
# number of observations and check of models
N = mod1.nobs
N2 = mod2.nobs
if N != N2:
raise ValueError('Models do not have the same number of observations')
# extract the log-likelihood for individual points with the models
m1 = mod1.model.loglikeobs(mod1.params)
m2 = mod2.model.loglikeobs(mod2.params)
# point-wise log likelihood ratio
m = m1 - m2
# calculate the LR statistic
LR = np.sum(m)
# calculate the AIC and BIC correction factors -> these go to zero when df is same between models
AICcor = mod1.df_model - mod2.df_model
BICcor = np.log(N)*AICcor/2
# calculate the omega^2 term
omega2 = np.var(m)
# calculate the Z statistic with and without corrections
Zs = np.array([LR,LR-AICcor,LR-BICcor])
Zs /= np.sqrt(N*omega2)
# calculate the p-value
ps = []
msgs = []
for Z in Zs:
if Z>0:
ps.append(1 - norm.cdf(Z))
msgs.append('model 1 preferred over model 2')
else:
ps.append(norm.cdf(Z))
msgs.append('model 2 preferred over model 1')
# share information
print('=== Vuong Test Results ===')
labs = ['Uncorrected']
if AICcor!=0:
labs += ['AIC Corrected','BIC Corrected']
for lab,msg,p,Z in zip(labs,msgs,ps,Zs):
print(' -> '+lab)
print(' -> '+msg)
print(' -> Z: '+str(Z))
print(' -> p: '+str(p))
# load sample data
X,y = load_breast_cancer( return_X_y=True, as_frame=True)
# create data for modeling
X1 = sm.add_constant( X.loc[:,('mean radius','perimeter error','worst symmetry')])
X2 = sm.add_constant( X.loc[:,('mean area','worst smoothness')])
# fit the models
mod1 = sm.Logit( y, X1).fit()
mod2 = sm.Logit( y, X2).fit()
# run Vuong's Test
vuong_test( mod1, mod2)
Output
Optimization terminated successfully.
Current function value: 0.172071
Iterations 9
Optimization terminated successfully.
Current function value: 0.156130
Iterations 9
=== Vuong Test Results ===
-> Uncorrected
-> model 2 preferred over model 1
-> Z: -0.7859538767172318
-> p: 0.21594725450542168
-> AIC Corrected
-> model 2 preferred over model 1
-> Z: -0.8726045081599004
-> p: 0.19143934123980472
-> BIC Corrected
-> model 2 preferred over model 1
-> Z: -1.0608044994241501
-> p: 0.14438937850119132