I would like to find the chain of closest non-overlapping ranges from the first start to the last end position. Any idea how to proceed? In the example below, c(8, 33) and c(155, 161) should be filtered out because they overlap with the preceding range (a shared endpoint such as 155 counts as an overlap).

#Example data
df <- data.frame(
  start = c(7,8,14,34,67,92,125,155,170,200),
  end = c(13,33,25,66,91,124,155,161,181,214)
)

   start end
1      7  13
2      8  33
3     14  25
4     34  66
5     67  91
6     92 124
7    125 155
8    155 161
9    170 181
10   200 214

#Overlapping rows
  start end
1     8  33
2   155 161

#Desired output where overlapping rows are filtered away
  start end
1     7  13
2    14  25
3    34  66
4    67  91
5    92 124
6   125 155
7   170 181
8   200 214
Nivel

3 Answers

I would do this as a simple loop, since whether a row is excluded depends on the result of the calculation for the previous row:

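i <- 2

# Use <= (not <) so that the last row is also checked against its predecessor
while (i <= nrow(df)) {
  if (df$start[i] <= df$end[i - 1]) {
    df <- df[-i, ]  # overlaps the previous kept row: drop it and re-check position i
  } else {
    i <- i + 1
  }
}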

df
#>    start end
#> 1      7  13
#> 3     14  25
#> 4     34  66
#> 5     67  91
#> 6     92 124
#> 7    125 155
#> 9    170 181
#> 10   200 214
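
If you would rather not shrink df inside the loop, the same idea can be sketched as a single pass that remembers the end of the last kept range. This is only a minimal sketch: drop_overlaps, keep and last_end are hypothetical names that are not part of the question, and it assumes df is already sorted by start.

```r
# Hypothetical helper, assuming `df` is sorted by `start`.
drop_overlaps <- function(df) {
  keep <- logical(nrow(df))   # all FALSE to begin with
  last_end <- -Inf            # end of the most recently kept range
  for (i in seq_len(nrow(df))) {
    # keep a row only if it starts strictly after the last kept range ends
    if (df$start[i] > last_end) {
      keep[i] <- TRUE
      last_end <- df$end[i]
    }
  }
  df[keep, , drop = FALSE]
}

df <- data.frame(
  start = c(7, 8, 14, 34, 67, 92, 125, 155, 170, 200),
  end   = c(13, 33, 25, 66, 91, 124, 155, 161, 181, 214)
)
res <- drop_overlaps(df)
rownames(res)
#> [1] "1"  "3"  "4"  "5"  "6"  "7"  "9"  "10"
```

Because a dropped row never updates last_end, each row is compared with the previous kept row, which is what the while loop does implicitly by deleting rows as it goes.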
Allan Cameron

Since your start column is already in ascending order, you only need to compare each row's start with the end of the row before it, removing one overlapping row at a time, e.g.,

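repeat {
  # first row whose start is not strictly greater than the previous row's end
  ind <- with(df, head(which(!c(TRUE, end[-nrow(df)] < start[-1])), 1))
  if (!length(ind)) break  # no overlaps left
  df <- df[-ind, ]
}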

which gives

> df
   start end
1      7  13
3     14  25
4     34  66
5     67  91
6     92 124
7    125 155
9    170 181
10   200 214
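
To see why only the first flagged row is removed on each pass, here is a small sketch of the comparison on the original data. The names prev_end, next_start and overlaps are just for illustration.

```r
df <- data.frame(
  start = c(7, 8, 14, 34, 67, 92, 125, 155, 170, 200),
  end   = c(13, 33, 25, 66, 91, 124, 155, 161, 181, 214)
)

prev_end   <- df$end[-nrow(df)]   # end of rows 1..9
next_start <- df$start[-1]        # start of rows 2..10

# TRUE where a row starts at or before the end of the row directly above it
overlaps <- c(FALSE, prev_end >= next_start)
which(overlaps)
#> [1] 2 3 8
```

Row 3 is flagged only because it is compared with row 2, which is itself an overlap. Dropping rows 2, 3 and 8 in one go would therefore remove c(14, 25) by mistake, so the comparison has to be redone after each removal.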
ThomasIsCoding

I went with the following answer posted on the R community website:

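find_nonover <- function(df) {
  to_drop <- logical(nrow(df))  # TRUE marks rows that overlap an earlier kept row
  for (i in seq_along(df[["end"]])) {
    if (to_drop[i]) next  # already dropped, so it cannot exclude later rows
    # flag every later row that starts at or before the end of row i
    to_drop <- to_drop | c(logical(i), df[i, "end"] >= df[["start"]][-seq_len(i)])
  }
  list(nonover = df[!to_drop, ],
       over    = df[to_drop, ])
}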

https://community.rstudio.com/t/find-closest-non-overlapping-ranges-from-start-to-end/79642/3
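
For reference, a short usage sketch on the example data from the question. The function is repeated so that the snippet runs on its own, and res is just an illustrative name.

```r
find_nonover <- function(df) {
  to_drop <- logical(nrow(df))
  for (i in seq_along(df[["end"]])) {
    if (to_drop[i]) next
    to_drop <- to_drop | c(logical(i), df[i, "end"] >= df[["start"]][-seq_len(i)])
  }
  list(nonover = df[!to_drop, ],
       over    = df[to_drop, ])
}

df <- data.frame(
  start = c(7, 8, 14, 34, 67, 92, 125, 155, 170, 200),
  end   = c(13, 33, 25, 66, 91, 124, 155, 161, 181, 214)
)

res <- find_nonover(df)
rownames(res$over)     # rows that were filtered out
#> [1] "2" "8"
rownames(res$nonover)  # rows that were kept
#> [1] "1"  "3"  "4"  "5"  "6"  "7"  "9"  "10"
```

What I like about it is that it returns both the kept and the removed rows, so the overlapping ranges can be inspected separately.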

Nivel