I have a list of about 561 elements, each of which is a matrix with row and column names. Below is an example from the dataset:
structure(list(`111110` = structure(c(205, 4, 1, 6, 23, 0, 1,
0, 0), .Dim = c(3L, 3L), .Dimnames = list(c("1", "4", "5"), c("1",
"4", "5"))), `111120` = structure(c(181, 3, 4, 4), .Dim = c(2L,
2L), .Dimnames = list(c("1", "4"), c("1", "4"))), `111130` = structure(c(71, 8, 3, 15, 114, 7, 6, 8, 56), .Dim = c(3L, 3L), .Dimnames = list(
c("1", "4", "5"), c("1", "4", "5"))), `111140` = structure(c(87,
8, 9, 14), .Dim = c(2L, 2L), .Dimnames = list(c("1", "4"), c("1",
"4"))), `111150` = structure(24, .Dim = c(1L, 1L), .Dimnames = list(
"1", "1")), `111160` = structure(48, .Dim = c(1L, 1L), .Dimnames = list(
"1", "1"))), .Names = c("111110", "111120", "111130", "111140",
"111150", "111160"))
Each element in the list has dimensions between 1 x 1 and 6 x 6. I would like to do the following calculations for each element of the list:
1. If the element has a column named "5", sum the entries in column "5", excluding the entry in the last row. If there is no column "5", the result should be blank.
2. If the element has a column named "5", sum the entries in column "1", excluding the first entry. If the element has no column named "5", the result should be blank.
3. Collect the results of calculations 1 and 2 in a data frame containing the unique id and both results.
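To make calculations 1 and 2 concrete, here is a small worked example on the first matrix from the sample, selecting columns by name. The variable name `m` is only for illustration and is not part of my real data.

```r
# First matrix from the sample above (id 111110); `m` is an illustrative name.
m <- matrix(c(205, 4, 1, 6, 23, 0, 1, 0, 0), nrow = 3,
            dimnames = list(c("1", "4", "5"), c("1", "4", "5")))

# Calculation 1: sum column "5" without its last row.
count.entry.5 <- sum(m[-nrow(m), "5"])

# Calculation 2: sum column "1" without its first row.
count.entry.1 <- sum(m[-1, "1"])

print(c(count.entry.5, count.entry.1))  # 1 and 5 for this matrix
```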
I have tried the following (based on the answer provided below):
output <- c()
for (x in names(trans.by.naics)) {
  id <- x
  count.entry.5 <- ifelse("5" %in% colnames(trans.by.naics[[x]]),
                          sum(trans.by.naics[[x]][1:nrow(trans.by.naics[[x]]), 5]) - trans.by.naics[[x]][5, 5],
                          "")  # sum down the first four rows of column "5" if it exists
  count.entry.1 <- ifelse("5" %in% colnames(trans.by.naics[[x]]),
                          sum(trans.by.naics[[x]][1:nrow(trans.by.naics[[x]]), 1]) - trans.by.naics[[x]][1, 1],
                          "")
  thing <- data.frame(id, count.entry.5, count.entry.1)
  output <- rbind(output, thing)
}
But I get the following when I run my code:
Error in trans.by.naics[[x]][1:nrow(trans.by.naics[[x]]), 5] :
subscript out of bounds
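My best guess at the cause: `[, 5]` asks for the fifth column by position, not for the column named "5". In a 3 x 3 matrix with columns "1", "4", "5", the column named "5" sits in position 3, so position 5 does not exist. A minimal sketch that reproduces this on one matrix (again, `m` is just an illustrative name):

```r
# First matrix from the sample above, rebuilt so the sketch runs on its own.
m <- matrix(c(205, 4, 1, 6, 23, 0, 1, 0, 0), nrow = 3,
            dimnames = list(c("1", "4", "5"), c("1", "4", "5")))

# Indexing with the number 5 looks for a fifth column, which does not exist here.
# tryCatch captures the error message instead of stopping the script.
by.position <- tryCatch(m[, 5], error = function(e) conditionMessage(e))
print(by.position)

# Indexing with the string "5" looks the column up by its name.
by.name <- m[, "5"]
print(by.name)
```

The same applies to `[5, 5]` in my loop, which asks for row 5 and column 5 by position.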
The desired output looks like this:
id count.entry.5 count.entry.1
1 111110 1 5
2 111120 3
3 111130 14 11
4 111140
5 111150
6 111160
Is there a good way to do this that won't take too long? Perhaps a more vectorized approach, or an lapply approach? Any advice or help is appreciated. Thanks!
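For reference, this is the kind of apply-style sketch I have in mind. It assumes the list is called `trans.by.naics` as in my loop and that blanks can be represented as `NA`. The helper name `count.one` is made up for illustration, and the guard for a missing column "1" is an extra assumption on my part. Only the first three matrices from the sample are rebuilt here to keep it short.

```r
# Three matrices from the sample above, rebuilt so the sketch runs on its own.
trans.by.naics <- list(
  `111110` = matrix(c(205, 4, 1, 6, 23, 0, 1, 0, 0), nrow = 3,
                    dimnames = list(c("1", "4", "5"), c("1", "4", "5"))),
  `111120` = matrix(c(181, 3, 4, 4), nrow = 2,
                    dimnames = list(c("1", "4"), c("1", "4"))),
  `111130` = matrix(c(71, 8, 3, 15, 114, 7, 6, 8, 56), nrow = 3,
                    dimnames = list(c("1", "4", "5"), c("1", "4", "5")))
)

# Hypothetical helper: returns both counts for one matrix, NA when blank.
count.one <- function(m) {
  has.5 <- "5" %in% colnames(m)
  has.1 <- "1" %in% colnames(m)  # assumed guard; every sample matrix has a column "1"
  c(count.entry.5 = if (has.5) sum(m[-nrow(m), "5"]) else NA_real_,
    count.entry.1 = if (has.5 && has.1) sum(m[-1, "1"]) else NA_real_)
}

# vapply returns a 2 x n matrix; transpose it so each id becomes a row.
counts <- t(vapply(trans.by.naics, count.one,
                   c(count.entry.5 = 0, count.entry.1 = 0)))

output <- data.frame(id = names(trans.by.naics), counts,
                     row.names = NULL, stringsAsFactors = FALSE)
print(output)
```

On these three matrices, 111110 gives 1 and 5 and 111130 gives 14 and 11. 111120 has no column "5", so both of its values come out as `NA`, following the rules as written.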