
I want to use a regression-based DiD (difference-in-differences) approach to measure the treatment effect of tt1 (a time-and-treatment dummy) on lexptot (expenditure), using the plm function from the plm package:

reg5 <- plm(formula = lexptot ~ tt1 + treat98 + year, data = d1, model = "fd")

According to the warning messages, there is an issue with duplicate id-time pairs in my data frame:

1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)

How can I remove the duplicates?
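
Following the hint in the warning, one possible way to locate and drop repeated id-time pairs is sketched below in base R. The toy data frame, the column name `id`, and all values are hypothetical stand-ins for the real `d1`; only `year`, `treat98`, and `lexptot` come from the question.

```r
# Toy stand-in for d1; the column name `id` and all values are hypothetical.
d1 <- data.frame(
  id      = c(1, 1, 2, 2, 2, 3, 3),
  year    = c(0, 1, 0, 1, 1, 0, 1),
  treat98 = c(1, 1, 0, 0, 0, 1, 1),
  lexptot = c(8.1, 8.4, 7.9, 8.0, 8.0, 8.3, 8.6)
)

# Count rows per id-year pair; any count above 1 is a duplicate couple.
pair_counts <- table(d1$id, d1$year, useNA = "ifany")
which(pair_counts > 1, arr.ind = TRUE)

# Flag every row that belongs to a repeated pair (first and later occurrences),
# so the rows can be inspected before anything is deleted.
key <- d1[, c("id", "year")]
dup_rows <- duplicated(key) | duplicated(key, fromLast = TRUE)
d1[dup_rows, ]

# Keep only the first row of each id-year pair.
# This is only safe if the repeated rows are genuine copies of each other.
d1_unique <- d1[!duplicated(key), ]
```

When no `index` argument is given, plm treats the first two columns of the data frame as the individual and time identifiers, so it may also be worth passing them explicitly, for example `index = c("id", "year")` with the real column names, and checking whether the warning persists before removing any rows.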

Valentin Ruano
zepmaya
  • Without a [reproducible example](https://stackoverflow.com/q/5963269/1861328) it is difficult to answer your question. Please provide a minimal data set that illustrates your problem. – utubun Jun 21 '18 at 18:46
  • The data set is based on household survey panel data from Bangladesh for the years 1991 and 1998. I'm trying to evaluate the impact of participation in a microcredit program on household expenditure. It contains various variables, among others: expenditure per household `lexptot`, whether the household participated in the program `treat98` (binary), the `year` of observation, and `tt1`, a time and treatment dummy created by `d1$tt1<-(d1$treat98*d1$year)`. Each household has a unique ID. – zepmaya Jun 21 '18 at 22:46

0 Answers