# SAMPLE DATA
myDF <- structure(list(V2 = c(100L, 200L, 300L, 100L), V3 = c(NA, 300L, NA, 400L)), .Names = c("V2", "V3"), class = "data.frame", row.names = c("AA", "BB", "CC", "DD"))
Assuming myDF is your original data frame:
# create columns sequence
Columns <- seq(100, 400, by=100)
# for each value, count how many cells in each row equal it (NAs are ignored)
newMat <- sapply(Columns, function(c) rowSums(c == myDF, na.rm = TRUE))
# assign names
colnames(newMat) <- Columns
newMat
#    100 200 300 400
# AA   1   0   0   0
# BB   0   1   1   0
# CC   0   0   1   0
# DD   1   0   0   1
Explanation:
c == myDF gives a matrix of TRUE/FALSE values (NA wherever myDF is NA).
When you perform arithmetic on TRUE/FALSE, they are treated as 1/0.
Thus, we can take rowSums() for each row (AA, BB, etc.), which tells us how many values in that row are equal to c; na.rm = TRUE skips the NA cells.
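To see that step in isolation, here is a small sketch for the single value 100. The data frame is rebuilt with data.frame() so the snippet runs on its own, and the name isHundred is just an illustrative choice, not part of the solution above.

```r
# Sample data from the top of this answer, rebuilt with data.frame()
myDF <- data.frame(V2 = c(100L, 200L, 300L, 100L),
                   V3 = c(NA, 300L, NA, 400L),
                   row.names = c("AA", "BB", "CC", "DD"))

# Comparing the whole data frame with one value gives a logical matrix;
# cells that are NA in myDF stay NA
isHundred <- 100 == myDF
print(isHundred)

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE drops the NA cells
hundredCounts <- rowSums(isHundred, na.rm = TRUE)
print(hundredCounts)
# AA = 1, BB = 0, CC = 0, DD = 1
```

The counts match the first column of the result shown earlier.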
We use sapply to iterate over each value in Columns (100, 200, etc.).
lapply would return a list; sapply takes that list and simplifies it into a nice matrix.
We then assign the column names to make things pretty.
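If the lapply/sapply difference is not obvious, the following sketch runs both on the same input so you can compare the shapes. The helper name countMatches is made up for this illustration, and the data frame is rebuilt with data.frame() so the snippet is self-contained.

```r
# Sample data from the top of this answer, rebuilt with data.frame()
myDF <- data.frame(V2 = c(100L, 200L, 300L, 100L),
                   V3 = c(NA, 300L, NA, 400L),
                   row.names = c("AA", "BB", "CC", "DD"))
Columns <- seq(100, 400, by = 100)

# Hypothetical helper: per-row count of cells equal to `value`
countMatches <- function(value) rowSums(value == myDF, na.rm = TRUE)

# lapply keeps the results as a list of four named vectors
asList <- lapply(Columns, countMatches)

# sapply binds those vectors together as the columns of a matrix
asMatrix <- sapply(Columns, countMatches)

class(asList)        # "list"
dim(asMatrix)        # 4 4
rownames(asMatrix)   # "AA" "BB" "CC" "DD"
colnames(asMatrix)   # NULL, which is why the names are assigned afterwards
```

The row names come along automatically because each rowSums() result is a named vector; only the column names need to be set by hand.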