I'm trying to identify contiguous regions in a (not necessarily square) matrix of 0-1 (boolean) values using R. Given such a matrix, I would like to identify each contiguous cluster and record the number of cells in it. Diagonal neighbours count, although an option to include or exclude them would be ideal.

Take the following example:

set.seed(14)
p <- matrix(0, ncol = 10, nrow = 10)
p[sample(1:100, 10)] <- 1
ones <- which(p == 1)
image(p)

Image of Plot

I'd like to be able to identify (since I'm counting diagonals) four different groups, with (from top to bottom) 2, 1, 5, and 2 cells per cluster.

The raster package has an `adjacent` function which does a good job of locating adjacent cells, but I can't figure out how to get from adjacent cells to labelled clusters.

One last constraint: an ideal solution should be fast. I'd like to use it in a data.table `dt[, lapply(.SD, ...)]` call with a large number of groups (each group being a data set from which I can build the matrix).

mbarete
  • Are you interested in [*creating neighbours*](https://cran.r-project.org/web/packages/spdep/vignettes/nb.pdf)? – Konrad Jun 16 '16 at 18:48
  • Duplicate here, http://stackoverflow.com/q/35772846/604456. See the `clump` function in the raster package. – Andy W Jun 16 '16 at 19:00
  • @AndyW if you write that as an answer I'll accept and mark it closed – mbarete Jun 17 '16 at 14:49
  • I will flag it to be closed as a duplicate. Better for the system than to have the same answer roaming around. – Andy W Jun 17 '16 at 15:01

1 Answer

You need a connected-component labeling algorithm.
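
As a rough illustration, connected-component labeling can be sketched in base R with a breadth-first flood fill. The function name `label_components` and its `diagonals` argument are hypothetical (they do not come from any package), and this plain loop is not tuned for speed, so treat it as a starting point rather than a finished solution:

```r
# Label connected components of a 0/1 matrix with a breadth-first flood fill.
# `label_components` and `diagonals` are illustrative names, not a package API.
label_components <- function(m, diagonals = TRUE) {
  nr <- nrow(m); nc <- ncol(m)
  labels <- matrix(0L, nr, nc)
  # Neighbour offsets: 4-connectivity, plus diagonals for 8-connectivity
  offs <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))
  if (diagonals) offs <- rbind(offs, c(-1, -1), c(-1, 1), c(1, -1), c(1, 1))
  current <- 0L
  for (start in which(m == 1)) {
    if (labels[start] != 0L) next        # already part of an earlier cluster
    current <- current + 1L
    labels[start] <- current
    queue <- start
    while (length(queue) > 0) {
      cell <- queue[1]; queue <- queue[-1]
      # Convert the column-major linear index to (row, column)
      r  <- (cell - 1L) %% nr + 1L
      c0 <- (cell - 1L) %/% nr + 1L
      rr <- r + offs[, 1]; cc <- c0 + offs[, 2]
      ok <- rr >= 1 & rr <= nr & cc >= 1 & cc <= nc
      nb <- (cc[ok] - 1L) * nr + rr[ok]
      # Keep only unlabelled neighbours that are set to 1
      nb <- nb[m[nb] == 1 & labels[nb] == 0L]
      labels[nb] <- current              # mark on enqueue to avoid revisits
      queue <- c(queue, nb)
    }
  }
  labels
}

# Small hand-made example: four cells, two of them touching diagonally
m <- matrix(c(1, 0, 0,
              0, 1, 0,
              0, 0, 0,
              1, 0, 1), nrow = 4, byrow = TRUE)
lab <- label_components(m, diagonals = TRUE)
tabulate(lab[m == 1])   # cells per cluster: 2 1 1
```

With `diagonals = FALSE` the same matrix splits into four single-cell clusters. For raster data, the `clump` function from the raster package mentioned in the comments covers the same task.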

MBo