I'm interested in identifying contiguous regions within a matrix (not necessarily square) of 0-1 (boolean) values, using R. Given such a matrix, I would like to identify each contiguous cluster of 1s and record the number of cells in it. Diagonal neighbours should count as adjacent, although an option to include or exclude them would be ideal.
Take the following example:
set.seed(14)
p <- matrix(0, ncol = 10, nrow = 10)
p[sample(1:100, 10)] <- 1
ones <- which(p == 1)
image(p)
I'd like to be able to identify (since I'm counting diagonals) four different groups, with (from top to bottom) 2, 1, 5, and 2 cells per cluster.
The raster package has an adjacent() function which does a good job of locating adjacent cells, but I can't figure out how to use it to do this.
One last constraint is that an ideal solution should be fast. I'd like to be able to use it within a data.table dt[, lapply(.SD, ...)] type situation with a large number of groups (each group being a data set from which I could create the matrix).
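To make the goal concrete, here is a minimal base-R sketch of the kind of labelling I have in mind: a queue-based flood fill over the cells equal to 1. The names label_clusters, cluster_sizes and the diagonal argument are made up for illustration, and the code assumes the matrix contains no NA values. It is not tuned for speed, so it only shows the intended behaviour, not the fast solution I'm after.

```r
# Label contiguous clusters of 1s in a 0/1 matrix (hypothetical helper).
# diagonal = TRUE uses 8-neighbour adjacency, FALSE uses 4-neighbour.
label_clusters <- function(m, diagonal = TRUE) {
  nr <- nrow(m)
  nc <- ncol(m)
  # Row/column offsets of the neighbours to visit
  offs <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))
  if (diagonal) {
    offs <- rbind(offs, c(-1, -1), c(-1, 1), c(1, -1), c(1, 1))
  }
  labels <- matrix(0L, nr, nc)
  current <- 0L
  for (start in which(m == 1)) {
    if (labels[start] != 0L) next      # already part of an earlier cluster
    current <- current + 1L
    labels[start] <- current
    queue <- start
    while (length(queue) > 0) {
      cell <- queue[1]
      queue <- queue[-1]
      # Convert the column-major linear index back to (row, column)
      r <- (cell - 1) %% nr + 1
      cc <- (cell - 1) %/% nr + 1
      for (k in seq_len(nrow(offs))) {
        r2 <- r + offs[k, 1]
        c2 <- cc + offs[k, 2]
        if (r2 < 1 || r2 > nr || c2 < 1 || c2 > nc) next
        if (isTRUE(m[r2, c2] == 1) && labels[r2, c2] == 0L) {
          labels[r2, c2] <- current
          queue <- c(queue, (c2 - 1) * nr + r2)
        }
      }
    }
  }
  labels
}

# Number of cells in each cluster, in order of cluster label
cluster_sizes <- function(m, diagonal = TRUE) {
  lab <- label_clusters(m, diagonal)
  tabulate(lab[lab > 0L])
}

# Small hand-checkable example
m <- matrix(c(1, 0, 0,
              0, 1, 0,
              0, 0, 0,
              1, 1, 0), nrow = 4, byrow = TRUE)
cluster_sizes(m, diagonal = TRUE)   # 2 2
cluster_sizes(m, diagonal = FALSE)  # 1 2 1
```

With the p from my example, cluster_sizes(p) would be the vector of per-cluster cell counts I want, and within data.table I would hope to call something like it once per group.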