I'm a newer R user. My code works, but I suspect there is a dplyr or purrr function that does this more efficiently and with far less code; if there is, I haven't found it yet. My PI wants a summary of our race data. The trick is that it has to be broken out by single race and, for respondents who selected more than one race, by the counts for each combination. I subset the data to just the race columns, added the columns together pairwise within each row, and wrote the resulting sums to a new 7x7 matrix.
This is my code. Is there a much more efficient way of doing this?
# sum races to create a totaled matrix of all races
subset <- dataset[, 11:17]
test <- matrix(,nrow=7, ncol=7)
colnames(test) <- c("African_American", "Asian", "Hawaiian_Pacific", "Native_Alaskan", "White_Euro", "Hispanic_Latino", "No-Answer")
rownames(test) <- c("African_American", "Asian", "Hawaiian_Pacific", "Native_Alaskan", "White_Euro", "Hispanic_Latino", "No-Answer")
# basic design: if == 1, strictly one race; if > 1, put it in the appropriate category
test[1,1] <- sum(subset$African_American==1, na.rm=TRUE)
test[1,2] <- sum(subset$African_American+subset$Asian>1, na.rm=TRUE)
test[1,3] <- sum(subset$African_American+subset$Hawaiian_Pacific>1, na.rm=TRUE)
test[1,4] <- sum(subset$African_American+subset$Native_Alaskan>1, na.rm=TRUE)
test[1,5] <- sum(subset$African_American+subset$White_Euro>1, na.rm=TRUE)
test[1,6] <- sum(subset$African_American+subset$Hispanic_Latino>1, na.rm=TRUE)
test[1,7] <- sum(subset$African_American+subset$`No-Answer`>1, na.rm=TRUE)
test[2,1] <- sum(subset$Asian+subset$African_American>1, na.rm=TRUE)
test[2,2] <- sum(subset$Asian==1, na.rm=TRUE)...
There are seven columns to add to each other, so the code works its way through the whole matrix and outputs a 7x7 matrix in which the diagonal holds the counts for a single race and the off-diagonal cells hold the counts of respondents who selected both races.
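For reference, here is a minimal sketch of how the same pairwise counts might be produced in one step with a matrix cross-product in base R. It assumes every race column is coded 0/1 and that a missing value can be treated as "not selected". The `races` data frame is hypothetical toy data standing in for `dataset[, 11:17]`, and the names `m`, `co_counts`, and `only_one` are made up for illustration.

```r
# Hypothetical toy data standing in for dataset[, 11:17];
# values are assumed to be 0/1 (or NA).
races <- data.frame(
  African_American = c(1, 0, 1, 0, NA),
  Asian            = c(0, 1, 1, 0, 0),
  White_Euro       = c(0, 0, 1, 1, 1)
)

m <- as.matrix(races)
m[is.na(m)] <- 0  # assumption: a missing answer counts as "not selected"

# crossprod(m) is t(m) %*% m. With 0/1 data, cell [i, j] counts the rows
# where column i and column j are both 1, and cell [i, i] counts the rows
# where column i is 1. This matches the sum(... > 1) and sum(... == 1)
# calls written out by hand above.
co_counts <- crossprod(m)

# Optional variant: if the diagonal should count only respondents who
# selected exactly one race, overwrite it with those counts.
only_one <- co_counts
diag(only_one) <- colSums(m[rowSums(m) == 1, , drop = FALSE])

co_counts
only_one
```

With the real data, `races` would be replaced by the seven race columns. Note that the diagonal of `co_counts` counts everyone who selected a race, including multi-race respondents, which is also what `sum(subset$African_American==1, na.rm=TRUE)` does. The `only_one` variant is only needed if the diagonal is meant to be strictly single-race counts.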