NOTE: this is not a time-serious problem that can be easily solved with zoo (or least, I do not understand how to zoo this problem :( )
I have a data set with a number of "key columns" and a value associated for combinations where only one of the key columns are set. vales for rows were more than one key column is set can be calculated based on the "one key column set" rows.
Using normal programming techniques, this is fairly simple (yet messy) to do as is shown below. I am hoping there is a nicer and more elegant way to do this in R.
In this example, I have three key's, for a combination key value e.g. [1,1,0] = I would calculate the value based on the two primary keys Val[1,0,0] and Val[0,1,0]. In this example where I use a simple mean this is mean(2,5) = 3.5.
myMatrix <- tribble(
~`1`, ~`2`, ~`3`, ~Val,
0,0,0,1,
1,0,0,2,
2,0,0,2,
0,1,0,5,
1,1,0,NA,
2,1,0,NA,
0,2,0,6,
1,2,0,NA,
2,2,0,NA,
0,0,1,1,
1,0,1,NA,
2,0,1,NA,
0,1,1,NA,
1,1,1,NA,
2,1,1,NA,
0,2,1,NA,
1,2,1,NA,
2,2,1,NA
)
#Filter for NA in the Val col
tmpNARows <- myMatrix %>% filter(is.na(Val)) %>% select(-Val)
#Take the
tmpFirstRow <- TRUE
for (myR in 1:nrow(tmpNARows)) {
#For each row in the NA table
tmpMyNARow<-tmpNARows[myR,]
tmpFirstElement <- TRUE
for (myC in 1:ncol(tmpMyNARow)) {
#find the records that make up this one's parts
#ignore columns with value 0
if (0 != tmpMyNARow[myC]) {
#Make Base Record for lookup
tmpMyBaseRow <- tmpMyNARow
for (myC2 in 1:ncol(tmpMyNARow)) {
if (myC2!=myC) { tmpMyBaseRow[myC2] <- 0 }
}
if(tmpFirstElement == TRUE) {
#Make a new Base table
tmpMyBaseTable <- tmpMyBaseRow
tmpFirstElement <- FALSE
} else {
#Append the Base row to the Base table
tmpMyBaseTable <- union(tmpMyBaseTable, tmpMyBaseRow)
}
}
}
#Calculate the mean and store in as Val
tmpVal <- (left_join(tmpMyBaseTable, myMatrix) %>% summarise(mean(Val)))[[1]]
tmpMyNARowWithVal <- tmpMyNARow %>% mutate(Val = tmpVal)
if (tmpFirstRow == TRUE) {
tmpMyResultMatrix <- tmpMyNARowWithVal
tmpFirstRow <- FALSE
} else {
tmpMyResultMatrix <- union(tmpMyResultMatrix,tmpMyNARowWithVal)
}
}
#filter for non NA
tmpNonNARows <- myMatrix %>% filter(!is.na(Val))
#Add the calculated rows
myCalculatedMatrix <- union(tmpNonNARows, tmpMyResultMatrix)
#lets have a look
myCalculatedMatrix
#the (1,1,0) element is indeed 3.5 so it appears to be working.
The expected result should look like
myCalculatedMatrix %>% arrange_all()
# A tibble: 18 x 4 `1` `2` `3` Val <dbl> <dbl> <dbl> <dbl> 1 0 0 0 1.000000 2 0 0 1 1.000000 3 0 1 0 5.000000 4 0 1 1 3.000000 5 0 2 0 6.000000 6 0 2 1 3.500000 7 1 0 0 2.000000 8 1 0 1 1.500000 9 1 1 0 3.500000 10 1 1 1 2.666667 11 1 2 0 4.000000 12 1 2 1 3.000000 13 2 0 0 2.000000 14 2 0 1 1.500000 15 2 1 0 3.500000 16 2 1 1 2.666667 17 2 2 0 4.000000 18 2 2 1 3.000000