I have data on ranges of occupied cells. Each "test" can have multiple start/end entries, and these ranges can overlap within the same test. Not every test has an entry. The data is in an R data frame, and I want to merge the overlapping ranges for each "test" so that only the unique ranges remain.
x <- data.frame(test = c(2, 3, 3, 2, 3, 4), start = c(1, 1, 1, 2, 3, 4), end = c(1, 2, 3, 3, 4, 4))
> x
  test start end
1    2     1   1
2    3     1   2
3    3     1   3
4    2     2   3
5    3     3   4
6    4     4   4
I would like to transform this data frame into:
  test start end
1    2     1   1
2    2     2   3
3    3     1   4
4    4     4   4
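One way this merge could be done in base R, as a minimal sketch rather than a definitive solution: sort each test's ranges by start, open a new group whenever a range begins after every earlier range has ended, and take the minimum start and maximum end of each group. The helper name `merge_ranges` is hypothetical (it is not a base R function), and the sketch assumes that ranges which only touch, such as (1,1) and (2,3), stay separate, as in the desired output above.

```r
x <- data.frame(test  = c(2, 3, 3, 2, 3, 4),
                start = c(1, 1, 1, 2, 3, 4),
                end   = c(1, 2, 3, 3, 4, 4))

# merge_ranges() is a hypothetical helper; d holds the rows of a single test.
merge_ranges <- function(d) {
  d <- d[order(d$start, d$end), ]
  # Largest end value seen among the rows before each row.
  prev_max_end <- c(-Inf, head(cummax(d$end), -1))
  # A new group starts when a range begins after all earlier ranges have ended.
  grp <- cumsum(d$start > prev_max_end)
  data.frame(test  = d$test[1],
             start = as.vector(tapply(d$start, grp, min)),
             end   = as.vector(tapply(d$end, grp, max)))
}

# Apply the helper to each test separately and stack the results.
y <- do.call(rbind, lapply(split(x, x$test), merge_ranges))
rownames(y) <- NULL
y
```

If adjacent ranges should be merged as well, the comparison `d$start > prev_max_end` would become `d$start > prev_max_end + 1`.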
In the end I just want to know how many cells are occupied by the ranges of each test. Test 2 has (1,1) and (2,3), which means 3 cells. Test 3 has (1,4), so 4 cells. Test 4 has (4,4), so 1 cell. Tests 1 and 5 to n have no entries, so they occupy 0 cells. Once the ranges are merged, the count per test is:
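# y is the merged data frame shown above (columns: test, start, end)
u <- unique(y[, 1])
a <- rep(0, length(u))
for (i in seq_along(u)) {
  rows <- which(y[, 1] == u[i])
  # each range covers end - start + 1 cells
  a[i] <- sum(y[rows, 3] - y[rows, 2]) + length(rows)
}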
> a
[1] 3 4 1
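If only the counts are needed, the merge step could possibly be skipped. The sketch below expands every range into its integer cells, pools them per test, and counts the distinct cells. It assumes start and end are integer cell indices, and `n <- 5` is an assumed total number of tests chosen purely for illustration, so that tests without entries show up as 0.

```r
x <- data.frame(test  = c(2, 3, 3, 2, 3, 4),
                start = c(1, 1, 1, 2, 3, 4),
                end   = c(1, 2, 3, 3, 4, 4))

# One integer vector of occupied cells per row of x.
cells <- Map(seq, x$start, x$end)

# Pool the cells of each test and count the distinct ones.
counts <- sapply(split(cells, x$test),
                 function(r) length(unique(unlist(r))))

# Assumed total number of tests (illustrative value, not from the data).
n <- 5
counts_all <- setNames(integer(n), seq_len(n))
counts_all[names(counts)] <- counts
counts_all
```

Expanding ranges into individual cells is simple but uses memory proportional to the total range length, so for very wide ranges the merge-based approach is likely the better fit.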