In my simulation study I need to come up with a covariance matrix for multivariate data. My data:
dataset = data.frame(
  observation = rep(1:8, 2),
  plot = rep(1:4, each = 2),
  time = rep(1:2, 8),
  treatment = rep(c("A", "B", "A", "B"), each = 4),
  OutputVariable = rep(c("P", "Q"), each = 8)
)
This dataset is multivariate: for every observation (1:8) there is more than one result. In this case, we observe a value for OutputVariable P and a value for OutputVariable Q at the same time. Note that the actual outputs are not in this dataset, as I will generate them at a later stage.
The desired covariance matrix would be 16x16, where CovarMat[2,9] indicates the covariance between the second line (observation 2 of variable P) and the 9th line (observation 1 of variable Q) in the dataset.
The value of, for instance, CovarMat[2,9] is based on rules like these:
CovarMat[2,9]=0
- If dataset$plot[2]==dataset$plot[9] then CovarMat[2,9]=CovarMat[2,9]+1.5
- If dataset$time[2]==dataset$time[9] then CovarMat[2,9]=CovarMat[2,9]+1.5
- If (dataset$plot[2]==dataset$plot[9])&(dataset$time[2]==dataset$time[9]) then CovarMat[2,9]=CovarMat[2,9]+3
- If abs(dataset$time[2]-dataset$time[9])==1 then CovarMat[2,9]=CovarMat[2,9]+2
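Written out for all pairs of lines, the rules above could look roughly like the double loop below. This is only a minimal sketch: it assumes the same four rules apply to every pair (i, j) and not just to (2, 9), and the names i, j, value, same_plot and same_time are made up for illustration.

```r
# Example dataset from the question (plot is recycled to 16 rows)
dataset <- data.frame(
  observation = rep(1:8, 2),
  plot = rep(1:4, each = 2),
  time = rep(1:2, 8),
  treatment = rep(c("A", "B", "A", "B"), each = 4),
  OutputVariable = rep(c("P", "Q"), each = 8)
)

n <- nrow(dataset)
CovarMat <- matrix(0, nrow = n, ncol = n)

for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    same_plot <- dataset$plot[i] == dataset$plot[j]
    same_time <- dataset$time[i] == dataset$time[j]

    value <- 0
    if (same_plot) value <- value + 1.5               # rule 1: same plot
    if (same_time) value <- value + 1.5               # rule 2: same time
    if (same_plot && same_time) value <- value + 3    # rule 3: same plot and same time
    if (abs(dataset$time[i] - dataset$time[j]) == 1) {
      value <- value + 2                              # rule 4: adjacent time points
    }
    CovarMat[i, j] <- value
  }
}

CovarMat[2, 9]  # 3.5: same plot (+1.5) and adjacent times (+2)
```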
Using for-loops that's easy enough (and that's what I did up to now). But my current dataset has 13,200 lines, so my CovarMat consists of 174,240,000 cells. Therefore, I am in desperate need of a more efficient way.
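One direction that might avoid the explicit loops is outer() from base R, which evaluates a function over all pairs of elements at once. The following is a sketch under the assumption that the same rules apply identically to every pair of lines; same_plot, same_time and adjacent are illustrative names, not anything prescribed by the problem.

```r
# Example dataset from the question (plot is recycled to 16 rows)
dataset <- data.frame(
  observation = rep(1:8, 2),
  plot = rep(1:4, each = 2),
  time = rep(1:2, 8),
  treatment = rep(c("A", "B", "A", "B"), each = 4),
  OutputVariable = rep(c("P", "Q"), each = 8)
)

# Logical n x n matrices, one per condition, computed for all pairs at once
same_plot <- outer(dataset$plot, dataset$plot, "==")
same_time <- outer(dataset$time, dataset$time, "==")
adjacent  <- abs(outer(dataset$time, dataset$time, "-")) == 1

# TRUE/FALSE are coerced to 1/0, so each rule adds its weight where it holds
CovarMat <- 1.5 * same_plot +
  1.5 * same_time +
  3 * (same_plot & same_time) +
  2 * adjacent

CovarMat[2, 9]  # 3.5, matching the rule-by-rule calculation
```

With 13,200 lines the resulting double matrix alone takes roughly 1.4 GB, and each intermediate logical matrix about half of that, so memory rather than speed may become the limiting factor.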