Allowing an estimated parameter in MLE to vary per country-year

Question

I have tried to replicate a formula from the paper:

Layard, R., Nickell, S., & Mayraz, G. (2008). The marginal utility of income. Journal of Public Economics, 92(8–9), 1846–1857. https://doi.org/10.1016/j.jpubeco.2008.01.007

The part I want to estimate is listed below:

I started out as follows:

#################################################################################################
# Data
#################################################################################################

library(data.table)
library(bbmle)
library(dummies)
set.seed(1)
TDT <- data.table(panelID = sample(50,50),                                                    # Creates a panel ID
                  yct = c(rep("Albania",30),rep("Belarus",50), rep("Chilipepper",20)),       
                  some_NA = sample(0:5, 6),                                             
                  some_NA_factor = sample(0:5, 6),         
                  Group = c(rep(1,20),rep(2,20),rep(3,20),rep(4,20),rep(5,20)),
                  Time = rep(seq(as.Date("2010-01-03"), length=20, by="1 month") - 1,5),
                  norm = round(runif(100)/10,2),
                  Income = round(rnorm(10,-5,5),2),
                  Happiness = sample(10,10),
                  Sex = round(rnorm(10,0.75,0.3),2),
                  Age = sample(100,100),
                  Educ = round(rnorm(10,0.75,0.3),2))           
TDT[, yi:= .I]                                                                        #     
TDT[TDT == 0] <- NA                                                                            # https://stackoverflow.com/questions/11036989/replace-all-0-values-to-na
TDT $some_NA_factor <- factor(TDT$some_NA_factor)
TDT$yct <- as.factor(TDT$yct)
TDT <- cbind(TDT, dummy(TDT$yct, sep = "_"))

#################################################################################################
# MLE
#################################################################################################
start_rho <- c(1,1.2,1.4,1.6,1.8,2)
mu_Happiness <- mean(TDT$Happiness, na.rm=TRUE)
sd_Happiness <- sd(TDT$Happiness, na.rm=TRUE)

LL4 <- function(p, a, mu, sigma) {
  -sum(dnorm(TDT$Happiness - a*((TDT$Income^(1-p)-1)/(1-p)) + TDT$Educ + TDT$TDT_Albania  + TDT$TDT_Belarus+ TDT$TDT_Chillipepper, mu, sigma, log=TRUE))
}

mle = list()
mle_sum = list()
for (i in 1:length(start_rho)) {
  tryCatch({
    mle[[i]] <- mle2(LL4, start = list(p = start_rho[[i]], sigma=sd_Happiness, mu=mu_Happiness, a=1) , fixed = NULL, method = "BFGS") # fixed = list(mu = 6.6)
    mle_sum[[i]] <- summary(mle[[i]])
    print(i)
    print(start_rho[[i]])
    print (mle_sum[[i]])
    if (i==1000) stop("N is to large")
    }, error=function(e){})
}

However, according to the paper I should allow for the estimated parameter alpha to vary per country-year.

How should I implement this into the equation?

`TDT` isn't defined. I recommend using the `reprex` package to ensure the code is cut-and-paste reproducible. — DHW, Oct 19 '19 at 12:35
Where's the function `dummy()` coming from? There's one in `hablar` but it doesn't have a `sep` argument. — DHW, Oct 19 '19 at 12:37
That is the `dummies` package. And DT and TDT are the same, sorry. — Tom, Oct 19 '19 at 12:43
No worries! I am getting the following error in trying to reproduce, though: `Error in TDT$Happiness - a * ((TDT$Income^(1 - p) - 1)/(1 - p)) + TDT$Educ + : non-numeric argument to binary operator`. (This error is being suppressed by the `tryCatch`.) — DHW, Oct 19 '19 at 12:50

DHW · Accepted Answer · 2019-10-19T13:28:10.243

I'm unable to reproduce the code, so I can't give you a working example, but, one way or another, you're going to have to estimate a large number of other parameters, whether it's in addition to α or in place of it. Country needs to get dummied out, each dummy (excepting the base category) needs to be included as a term, and each term needs its own coefficient, representing either α in that country (if α is omitted) or the amount by which that country's α deviates from the base α, whether as an additive difference or as a coefficient on the base α itself. Naturally, for any given observation, all of these country terms except 1 (or all of them if it's the base country) will be multiplied by a dummy value of 0, and thus drop out.

This is to say nothing of the year. The theory is underspecified, actually. So, the phrasing is: "[α] can vary between countries or at different time points." Note that it's not "and/or". This implies that the country and year variances do not interact with each other, in which case we don't need to multiply the country dummies by year, or some transformation of year. But even so, what relationship between α and year are we supposed to test for, or at least entertain? Linear-additive? Or must we dummy out each year? And if we do the latter, and the effects of country and time do interact, well, now we have to include, as separate estimated parameters, the effect of each country in each year. And that will be a lot of dummies. So, by my interpretation, there's a very wide range of estimable models here, and the theory's verbiage is insufficient to marry it up with just one of them. Its equation 6 there definitely doesn't accord with the verbiage that follows.

If you ever do get to coding out some version of that fuller model, though, I recommend the nls package, and maybe recipes for doing all that dummying-out and interacting. Obviously, hand-coding each term is impractical.

One last thought: you could avoid all that country dummying if, and only if, you could pin the theory down regarding what about a given country is supposed to be explaining the country-level variance. If α varies by country because countries vary by, say, sunniness, then measure sunniness and parameterize that, which will be far easier to do if you can treat it as continuous. But, of course, there's an epidemic in social theory where inter-society variance is acknowledged but not explained, and so we're stuck with models that basically say "the Frenchiness of France leads to an x-point difference in the effect on happiness."

Thank you for your extensive answer, and sorry for the horrible code. I thought it worked on my computer, but it was just reporting a previous estimation stored. It works now (however for some reason only with all country variables (now "dummied out") included. With respect to your comments; I have tried to dummy out each country and give each of those countries a separate coefficient. However, I guess because of the multitude of unknown parameters, it does not converge.If I understand you correctly, that is actually my simplest option.. Will look at the other packages! — Tom, Oct 19 '19 at 14:09

Allowing an estimated parameter in MLE to vary per country-year

1 Answers1