Questions tagged [goodness-of-fit]

Goodness of fit tests indicate whether or not it is reasonable to assume that a random sample comes from a specific distribution.

"They are a form of hypothesis testing where the null and alternative hypotheses are:

H0: Sample data come from the stated distribution
HA: Sample data do not come from the stated distribution

These tests are sometimes called omnibus tests."

Reference:

Ricci, V. (2005). Fitting distributions with R. page 16.
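
As an illustration of the hypotheses above, here is a minimal sketch of a chi-squared goodness-of-fit test in pure Python. The function names (`chi2_sf`, `chi_squared_gof`) and the die-roll counts are hypothetical, chosen only for this example. The degrees of freedom are taken as the number of categories minus one, which assumes that no distribution parameters were estimated from the data.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2             # closed form for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    while k < df:
        # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += math.exp((k / 2) * math.log(half) - half - math.lgamma(k / 2 + 1))
        k += 2
    return q


def chi_squared_gof(observed, expected):
    """Return (statistic, p_value) for observed vs. expected category counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # assumption: no parameters estimated from the data
    return stat, chi2_sf(stat, df)


# Hypothetical data: 60 rolls of a die, tested against a fair-die distribution.
observed = [8, 12, 9, 11, 10, 10]
expected = [10.0] * 6
stat, p_value = chi_squared_gof(observed, expected)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

With these counts the p-value is large, so H0 (the data come from the stated distribution) is not rejected at the 5% level. In practice a library routine such as `scipy.stats.chisquare` in Python or `chisq.test` in R would normally be used instead of a hand-written tail function.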


Tag Usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique.
Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

123 questions
19 votes, 3 answers

How to perform a chi-squared goodness of fit test using scientific libraries in Python?

Let's assume I have some data I obtained empirically: from scipy import stats size = 10000 x = 10 * stats.expon.rvs(size=size) + 0.2 * np.random.uniform(size=size) It is exponentially distributed (with some noise) and I want to verify this using a…
metakermit
  • 21,267
  • 15
  • 86
  • 95
15 votes, 5 answers

Significant mismatch between `r2_score` of `scikit-learn` and the R^2 calculation

Question Why is there a significant difference between the r2_score function in scikit-learn and the formula for the Coefficient of Determination as described in Wikipedia? Which is the correct one? Context I'm using with Python 3.5 to predict…
Juan Carlos Coto
  • 11,900
  • 22
  • 62
  • 102
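
For reference, the coefficient of determination mentioned in the excerpt above is commonly defined as R^2 = 1 - SS_res / SS_tot. The following is a minimal pure-Python sketch of that definition. The function name `r_squared` and the sample values are hypothetical, and this is not presented as scikit-learn's implementation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
r2_good = r_squared(y_true, [1.1, 1.9, 3.2, 3.8])  # predictions close to the data
r2_mean = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicting the mean
r2_bad = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])   # predictions worse than the mean
print(r2_good, r2_mean, r2_bad)
```

Under this definition R^2 is not bounded below by zero: predictions that fit worse than the mean of the observed values give a negative value. One common source of mismatch between implementations is computing R^2 as the squared correlation between observed and predicted values, which can never be negative and agrees with this definition only in special cases.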
13 votes, 2 answers

Chi-squared goodness of fit test in R

I have a vector of observed values and also a vector of values calculated with model: actual <- c(1411,439,214,100,62,38,29,64) expected <- c(1425.3,399.5,201.6,116.9,72.2,46.3,30.4,64.8) Now I'm using the Chi-squared goodness of fit test to see…
AliCivil
  • 2,003
  • 6
  • 28
  • 43
6 votes, 1 answer

Goodness-of-fit for fixed effect logit model using 'bife' package

I am using the 'bife' package to run the fixed effect logit model in R. However, I cannot compute any goodness-of-fit to measure the model's overall fit given the result I have below. I would appreciate if I can know how to measure the…
Eric
  • 528
  • 1
  • 8
  • 26
5 votes, 2 answers

Cross-validation gives Negative R2?

I am partitioning 500 samples out of a 10,000+ row dataset just for the sake of simplicity. Please copy and paste X and y into your IDE. X = array([ -8.93, -0.17, 1.47, -6.13, -4.06, -2.22, -2.11, -0.25, 0.25, 0.49, 1.7 , -0.77, …
5 votes, 3 answers

Is there an Anderson-Darling implementation for python that returns p-value?

I want to find the distribution that best fit some data. This would typically be some sort of measurement data, for instance force or torque. Ideally I want to run Anderson-Darling with multiple distributions and select the distribution with the…
5 votes, 0 answers

Is there any solution for better fit beta prime distribution to data than using Scipy?

I was trying to fit beta prime distribution to my data using python. As there's scipy.stats.betaprime.fit, I tried this: import numpy as np import math import scipy.stats as sts import matplotlib.pyplot as plt N = 5000 nb_bin = 100 a = 12; b =…
Fay
  • 105
  • 1
  • 5
5 votes, 3 answers

Assessing the goodness of fit for the multinomial logit in R with the nnet package

I use the multinom() function from the nnet package to run the multinomial logistic regression in R. The nnet package does not include p-value calculation and t-statistic calculation. I found a way to calculate the p-values using the two tailed…
Koba
  • 1,514
  • 4
  • 27
  • 48
4 votes, 0 answers

The gof function from package btergm gives AUC value of a precision-recall greater than 1

I was trying to do out-of-sample prediction using the gof function from package btergm. When calculating the AUC value of a precision-recall curve from the testing set, I get the result of 1.012909, which seems to be theoretically impossible. How…
4 votes, 1 answer

Calculating goodness of fit and rmsea from factor_analyser in python

I am performing confirmatory factor analysis in Python using the factor_analyzer module. I have searched high and low for a way to generate the model diagnostics such as the Root Mean Square Error of Approximation, the chi-square, the CFI and…
KevOMalley743
  • 551
  • 3
  • 20
4 votes, 2 answers

GoodnessOfFit.StandardError wrong answer

Why am I getting the wrong answer (err2) from GoodnessOfFit.StandardError? In the code below, I do the computation myself and get the right answer (err3). I get the right answer from GoodnessOfFit.RSquared. Note: esttime and phrf are double[].…
phv3773
  • 487
  • 4
  • 10
4 votes, 1 answer

Very low p-values in Python Kolmogorov-Smirnov Goodness of Fit Test

I have a set of data and fit the corresponding histogram by a lognormal distribution. I first calculate the optimal parameters for the lognormal function, and then plot the histogram and the lognormal function. This gives quite good results: import…
4 votes, 1 answer

Chi-squared goodness of fit test in Python: way too low p-values, but the fitting function is correct

Despite having searched for two days in related questions, I have not really found an answer to this problem yet... In the following code, I generate n normally distributed random variables, which are then represented in a histogram: import numpy…
Charles M.
  • 83
  • 1
  • 4
4 votes, 2 answers

How do you perform a goodness of link test for a generalized linear model in R?

I'm working on fitting a generalized linear model in R (using glm()) for some data that has two predictors in full factorial. I'm confident that the gamma family is the right error distribution to use but not sure about which link function to use…
DirtStats
  • 559
  • 9
  • 29
4 votes, 1 answer

Goodness of fit in CCA in R

The following are the datasets mm <- read.csv("https://stats.idre.ucla.edu/stat/data/mmreg.csv") colnames(mm) <- c("Control", "Concept", "Motivation", "Read", "Write", "Math", "Science", "Sex") psych <- mm[, 1:3] # dataset A acad <- mm[, 4:8] #…
Paul
  • 1,077
  • 3
  • 14
  • 27