
This is a follow-up question related to this post, which in my opinion has not resolved the issue.

So I repeat the data here:

 year | comp | count | value.x  |  value.y
------+------+-------+----------+----------
 2000 |  A   |  USA  |  1029.0  |   239481
 2000 |  A   |  CAN  |  2341.4  |   129333
 2000 |  B   |  USA  |  2847.7  |   187319
 2000 |  B   |  CAN  |  4820.5  |   392039
 2001 |  A   |  USA  |  7289.9  |   429481
 2001 |  A   |  CAN  |  5067.3  |   589143
 2001 |  B   |  USA  |  7847.8  |   958234
 2001 |  B   |  CAN  |  9820.0  |  1029385
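
For convenience, here is the same data as an R data frame (using the column names from the table and the fakedata object name that appears in the code below):

fakedata <- data.frame(
  year    = c(2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001),
  comp    = c("A", "A", "B", "B", "A", "A", "B", "B"),
  count   = c("USA", "CAN", "USA", "CAN", "USA", "CAN", "USA", "CAN"),
  value.x = c(1029.0, 2341.4, 2847.7, 4820.5, 7289.9, 5067.3, 7847.8, 9820.0),
  value.y = c(239481, 129333, 187319, 392039, 429481, 589143, 958234, 1029385)
)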

Although, from a programming point of view, some answers in that post do the job, the issue is far from resolved.

My question is more specific.

I want to run a fixed effects and a random effects model based on the data shown above. What I want is to study the effect of value.x on value.y across comp and year, regardless of (or controlling for) count.

The answer suggested in that post to handle duplicate IDs is as follows:

# create a numeric id for each comp-count combination (from the linked post)
fakedata$id <- fakedata %>% group_indices(comp, count)

and then run

plm(value.y ~ value.x, model = "within", data = fakedata, index = c("id", "year"))

Although grouping by comp and count and then running the fixed effects or random effects model works, this strategy assumes that each comp is treated as a different individual in each country (count). This is not necessarily what someone wants from such regressions.

As said before, in my case I want to know the effect of value.x on value.y across comp and year, regardless of (or controlling for) count.

I think this suits a model of the following form:

plm(value.y ~ value.x + factor(count), model = "within", data = fakedata, index = c("comp", "year"))

This was suggested in some answers. However, it did not work for me and the usual error message from the plm package appears:

Error in pdim.default(index[[1]], index[[2]]) : 
  duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
  duplicate couples (id-time) in resulting pdata.frame
 to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
  duplicate couples (id-time)

So, how can I run the fixed effects model without interacting comp and count?

msh855
  • For fixed effects you can run a simple regression treating year and comp as dummy variables. Random effects is a GLS model (the coefficients are the same as OLS, you just need to adjust the covariance matrix), so it is possible to do something like what you need; however, under the RE model assumptions, individual country characteristics should be irrelevant, otherwise your model is biased. – Rodrigo Remedio Jun 04 '18 at 23:09
  • Thank you, any example with code? How can I introduce the dummy variables you mention into the FE model? How different is it from what I wrote in my post? `plm(value.y ~ value.x + factor(count), model = "within", data = fakedata, index = c("comp", "year"))` – msh855 Jun 05 '18 at 06:50

1 Answer


For fixed effects you can run a simple regression treating year and comp as dummy variables. In this case you will obtain fixed effects regardless of country.

lm(value.y ~ value.x + factor(year) + factor(comp), data = fakedata)

You can even include country as a factor. In this case you will obtain the same result as running plm with the id index (to obtain exactly the same results you may have to manually choose the dummies, so that they are the same in both regressions). This is the LSDV (Least-Squares Dummy Variables) model.

lm(value.y ~ value.x + factor(year) + factor(comp) + factor(count), data = fakedata)
plm(value.y ~ value.x, model = "within", data = fakedata, index = c("id", "year"))
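
As a minimal sketch of the within/LSDV equivalence behind this (assuming the fakedata reconstruction from the question; here the dummies are taken directly on the interacted comp-count id, so the value.x coefficients match exactly):

library(plm)

# the id plays the same role as group_indices(comp, count) in the question
fakedata$id <- as.integer(interaction(fakedata$comp, fakedata$count, drop = TRUE))

# LSDV: an explicit dummy for each comp-count combination
lsdv <- lm(value.y ~ value.x + factor(id), data = fakedata)

# within estimator on the same individual index
fe <- plm(value.y ~ value.x, model = "within", data = fakedata,
          index = c("id", "year"))

coef(lsdv)["value.x"]   # identical slope estimates
coef(fe)["value.x"]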

If you want to choose which level is the base category, you can create your dummies by hand. There are many packages that can do it. Two examples:

# with base R ('cate' stands for the categorical column, e.g. comp or count)
dummies <- model.matrix(~ cate, fakedata)

# or using the fastDummies package
library(fastDummies)
dummies <- dummy_columns(fakedata$cate)
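
A short usage sketch of the base R route (the fakedata2 object and the choice of "B" as base level are only illustrative): relevel the factor to pick the base category, drop the intercept column, and the hand-made dummies enter lm like any other regressor.

# choose "B" as the base category, then build the remaining dummy by hand
fakedata$comp <- relevel(factor(fakedata$comp), ref = "B")
dummies <- model.matrix(~ comp, fakedata)[, -1, drop = FALSE]  # drop the intercept column
fakedata2 <- cbind(fakedata, dummies)

# compA is the hand-made dummy created above
lm(value.y ~ value.x + compA, data = fakedata2)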

In your question, the following code will throw an error because plm can't deal with repeated ids. However, supposing it could, the fact that you use count as a dummy would give you the same result as creating an index which interacts count and comp. From your question, I guess this is not what you want.

plm(value.y ~ value.x + factor(count), model = "within", data = fakedata, index = c("comp", "year"))
Rodrigo Remedio
  • There is also the function `make.dummies` in `plm` to create contrast-coded dummies from a factor. The help ( `?plm::make.dummies`) shows how this function can be used for LSDV models. – Helix123 Mar 06 '22 at 08:55