
I would like to estimate some panel data models in R using the plm package. Because my knowledge of the theory is limited, I am strictly following the instructions from "econometrics academy" (code here). I adapted that code to my data (my own dependent/independent variables) but did not change any of the other syntax or formulas.

Now here's the problem:

All models can be estimated and their results can also be summarized and interpreted except for the random effects model. Here I get the following error message:

Error in solve.default(crossprod(X.m)) : 
  system is computationally singular: reciprocal condition number = 9.57127e-023

Can anybody give me a hint as to what this error actually means? What might be the underlying reason, and how do I have to correct the code in order to get results?

Edit: To be more precise, here's the part of R code I used:

# read in data
mydata<- read.csv2("Panel.csv")
attach(mydata)

# define dependent variable
sd1 <- cbind(sd)

# define independent variables
x <- cbind(ratio1, ratio2, ratio3, ratio4, mean)

# Set data as panel data
pdata <- plm.data(mydata, index=c("id","t"))

# Pooled OLS estimator
pooling <- plm(sd1 ~ x, data=pdata, model= "pooling")
summary(pooling)

# Between estimator
between <- plm(sd1 ~ x, data=pdata, model= "between")
summary(between)

# First differences estimator
firstdiff <- plm(sd1 ~ x, data=pdata, model= "fd")
summary(firstdiff)

# Fixed effects or within estimator
fixed <- plm(sd1 ~ x, data=pdata, model= "within")
summary(fixed)

# Random effects estimator
random <- plm(sd1 ~ x, data=pdata, model= "random")
summary(random)

Due to policy restrictions I am not allowed to upload the data, but I can say that it is balance sheet data. The dependent variable is the standard deviation of a balance sheet position over time, which should be explained by different balance sheet positions. These are mainly ratios of the type "position a / mean" (ratios 1 to 4). As an additional independent variable, the average sum of the assets on the balance sheet is considered.

Again: everything else works; only the last model (random) produces the stated error.

Might the problem be caused by the definition of the ratios? They are defined using the variable "mean", which is itself also an independent variable.

Edit: Traceback-Code

> random <- plm(sd1 ~ x, data=pdata, model= "random")
Error in solve.default(crossprod(X.m)) : 
  system is computationally singular: reciprocal condition number = 1.65832e-022
> traceback()
8: solve.default(crossprod(X.m))
7: solve(crossprod(X.m))
6: diag(solve(crossprod(X.m)) %*% crossprod(X.sum))
5: swar(object, data, effect)
4: ercomp.formula(formula, data, effect, method = random.method)
3: ercomp(formula, data, effect, method = random.method)
2: plm.fit(formula, data, model, effect, random.method, inst.method)
1: plm(sd1 ~ x, data = pdata, model = "random")
landroni
user3405263
  • Perhaps I should add that I am dealing with unbalanced panel data. The summary of the pooling estimation says: n=16, T=18-40, N=455 – user3405263 Mar 11 '14 at 09:07
  • Welcome to StackOverflow! Could you please paste here the **piece** of code that raises this error? With a little data it would be even better to make it [reproducible](http://stackoverflow.com/a/5963610/2886003). Thanks – llrs Mar 11 '14 at 09:55
  • The error message means that the matrix `X.m` whose inner product the algorithm tries to invert (`solve(crossprod(X.m))`) does not have full rank -- there are (at least numerically) linear dependencies between the columns of `X.m`. This could be due to: 1) trying to estimate more parameters than you have observations 2) perfect correlation between some of the columns. Without a reproducible example or at least a `traceback()` on the error it's hard to say more. – fabians Mar 12 '14 at 08:35
  • I edited my question and added the traceback output. – user3405263 Mar 13 '14 at 08:00
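
The rank-deficiency diagnosis from the comments can be illustrated with a minimal sketch in base R. The data here are made up (the real Panel.csv is not available), and the names `x1`, `x2`, `x3`, `X`, `rank_X`, `full` and `inv_try` are hypothetical; `x3` is deliberately built as an exact linear combination of the other two columns.

```r
# Hypothetical stand-in data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- 2 * x1 - x2
X  <- cbind(intercept = 1, x1, x2, x3)

# Numerical rank of the design matrix, compared with its number of columns.
rank_X <- qr(X)$rank
full   <- rank_X == ncol(X)   # FALSE signals linearly dependent columns

# Inverting X'X is the step that fails in the traceback; wrap it in try()
# so the script keeps running if solve() raises an error here.
inv_try <- try(solve(crossprod(X)), silent = TRUE)
```

On real data, the same `qr(...)$rank` check could be applied to the output of `model.matrix()` for the model formula.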

1 Answer


If your model matrix contains very large values as well as very small values, solve() might not be able to solve the system of linear equations numerically. So have a look at model.matrix(sd1 ~ x, data=pdata) to see whether this is the case. If so, try rescaling some variables (e.g. multiply or divide by 100 or 1000; log() also makes sense sometimes). Take care: the interpretation of the coefficients changes with the change of scale!
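
As a rough sketch of that check in base R, with made-up data in place of the real panel: the data frame `toy`, the column `mean_bn` and the divisor 1e9 are assumptions chosen only for illustration. The condition number of X'X is compared before and after rescaling the large-valued regressor.

```r
# Hypothetical data: one small-scale ratio and one very large-scale regressor.
set.seed(42)
n <- 200
toy <- data.frame(
  ratio1 = runif(n),               # values between 0 and 1
  mean   = runif(n, 1e8, 1e10)     # balance-sheet-sized values
)

# Model matrix on the raw scale.
X_raw <- model.matrix(~ ratio1 + mean, data = toy)
range(X_raw)                        # shows how far apart the magnitudes are

# Rescale the large variable (assumed divisor: 1e9) and rebuild the matrix.
toy$mean_bn <- toy$mean / 1e9
X_scaled <- model.matrix(~ ratio1 + mean_bn, data = toy)

# Condition number of X'X; larger values mean a harder system for solve().
kappa_raw    <- kappa(crossprod(X_raw), exact = TRUE)
kappa_scaled <- kappa(crossprod(X_scaled), exact = TRUE)
```

In the question's setup, the equivalent step would be to rescale the column in pdata before calling plm(). The coefficient on the rescaled variable then refers to a change of 1e9 original units.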

Helix123