I am using the ineq package in R to calculate the Gini coefficient. From inspecting the source code (below), the function sorts the vector x before computing the Gini.
Example data:
example_data = data.frame(
  SCORE_RANGE = c('100-200','201-300','301-400','401-500','501-600'),
  NUMBER_OF_OBSERVATIONS = c(100,100,100,100,100),
  NUMBER_OF_NON_EVENT = c(85,90,95,90,90),
  NUMBER_OF_EVENT = c(15,10,5,10,10))
Source code of the Gini function from the ineq package:
Gini = function (x, corr = FALSE, na.rm = TRUE)
{
    if (!na.rm && any(is.na(x)))
        return(NA_real_)
    x <- as.numeric(na.omit(x))
    n <- length(x)
    x <- sort(x)
    G <- sum(x * 1L:n)
    G <- 2 * G/sum(x) - (n + 1L)
    if (corr)
        G/(n - 1L)
    else G/n
}
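To make the arithmetic concrete, here is a minimal sketch that retraces the function's steps by hand on the event counts, without loading ineq. The variable names (x_sorted, weighted_sum, G) are mine and only for illustration; the formula is copied from the source above with corr = FALSE.

```r
# Event counts per score bin, in score order (copied from the example data)
x <- c(15, 10, 5, 10, 10)
n <- length(x)

# Step 1: the function sorts the values, discarding the score order
x_sorted <- sort(x)                      # 5 10 10 10 15

# Step 2: rank-weighted sum of the sorted values
weighted_sum <- sum(x_sorted * 1:n)      # 5*1 + 10*2 + 10*3 + 10*4 + 15*5 = 170

# Step 3: same formula as in the source, uncorrected version (corr = FALSE)
G <- (2 * weighted_sum / sum(x_sorted) - (n + 1)) / n
print(G)                                 # (2 * 170 / 50 - 6) / 5 = 0.16
```

So the 0.16 measures how unequally the event counts are spread across the bins, regardless of which score range each count belongs to.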
I am doing this for my credit score models. I have binned the data into score ranges of equal frequency and ordered the bins by score (smallest to largest).
Calling the Gini function from the ineq package on the event counts gives 0.16. Is this correct in this context, given that the function reorders the vector before computing? If not, what should the correct Gini coefficient be?
library(ineq)
Gini(example_data$NUMBER_OF_EVENT)  # 0.16
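To illustrate why the reordering worries me, here is a rough sketch comparing the formula with and without the sort step. Both helpers (gini_sorted and gini_score_order) are hypothetical names I made up for this comparison; gini_score_order is not part of ineq, and I am not claiming it is the right measure for a scorecard. It only shows that the bin order changes the result once the sort is removed.

```r
# Event counts per score bin, ordered from lowest to highest score range
events <- c(15, 10, 5, 10, 10)

# Same formula as ineq's Gini with corr = FALSE, including the sort
gini_sorted <- function(x) {
  n <- length(x)
  x <- sort(x)
  (2 * sum(x * 1:n) / sum(x) - (n + 1)) / n
}

# Hypothetical variant: identical formula, but keeps the bins in score order
gini_score_order <- function(x) {
  n <- length(x)
  (2 * sum(x * 1:n) / sum(x) - (n + 1)) / n
}

print(gini_sorted(events))        # 0.16, same as the ineq result
print(gini_sorted(rev(events)))   # 0.16 again: the sort makes bin order irrelevant
print(gini_score_order(events))   # (2 * 140 / 50 - 6) / 5 = -0.08
```

The sorted version returns the same value for any permutation of the bins, so it cannot reflect whether events are concentrated in the low-score or the high-score ranges.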