My data looks like this:
DF <- structure(list(Gene = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE"),
region = c("1:5914103-1:7245590","1:27403851-1:30161281","1:27403851-1:30161281","1:27403851-1:30161281","1:34800556-1:37548572")),
.Names = c("Gene","region"),
row.names = c(NA, 5L),
class = "data.frame")
> DF
Gene region
GeneA 1:5914103-1:7245590
GeneB 1:27403851-1:30161281
GeneC 1:27403851-1:30161281
GeneD 1:27403851-1:30161281
GeneE 1:34800556-1:37548572
I want to create a new column (clump) in my data frame (DF) that numbers the groups in another column (region) cumulatively: the first distinct region gets 1, the next distinct region gets 2, and so on, with rows that share a region sharing the same number. The result would look like this:
> DF
Gene region clump
GeneA 1:5914103-1:7245590 1
GeneB 1:27403851-1:30161281 2
GeneC 1:27403851-1:30161281 2
GeneD 1:27403851-1:30161281 2
GeneE 1:34800556-1:37548572 3
This seemed like a fairly basic question, so I had a long trawl through Stack Overflow for an existing answer. I found similar questions, but none covered the cumulative part: they asked about counting the number of rows in each group, or the unique values of another column within groups, rather than assigning each group a running number. I apologize in advance if this is in fact a duplicate.
Thanks for any help!
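For reference, here is a minimal base R sketch of the numbering described above. It assumes groups should be numbered in order of first appearance. The data frame is rebuilt with data.frame() only to keep the snippet self-contained, and clump_runs is a name made up for illustration.

```r
# Rebuild the sample data so the snippet runs on its own.
DF <- data.frame(
  Gene = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE"),
  region = c("1:5914103-1:7245590", "1:27403851-1:30161281",
             "1:27403851-1:30161281", "1:27403851-1:30161281",
             "1:34800556-1:37548572"),
  stringsAsFactors = FALSE
)

# Number each region by the position of its first appearance:
# unique() keeps first-seen order, match() looks each row up in that list.
DF$clump <- match(DF$region, unique(DF$region))

# Alternative (hypothetical name clump_runs), assuming identical regions
# always sit in adjacent rows: bump the counter whenever the region
# differs from the one in the previous row.
clump_runs <- cumsum(c(TRUE, DF$region[-1] != DF$region[-nrow(DF)]))

print(DF)
```

The two approaches agree on this sample but differ if a region reappears after a gap: match() reuses the earlier number, while the cumsum() version starts a new one. Which is right depends on whether non-adjacent repeats should count as the same clump.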