
I need to get the interval borders from cut() output. I found this question, which suggests using findInterval(), but it does not work as expected if a value of x is the same as the upper border of its cut(x) group. See here:

x <- 1:3
breaks <- c(min(x), 2, max(x))
interval <- findInterval(x, breaks)

data.frame(x,
           groups= cut(x, breaks, include.lowest= TRUE),
           x_lower= breaks[interval],
           x_upper= breaks[interval + 1],
           interval)

  x groups x_lower x_upper interval
1 1  [1,2]       1       2        1
2 2  [1,2]       2       3        2
3 3  (2,3]       3      NA        3

I am happy with how cut() makes groups from x, but x_lower and x_upper in rows 2 and 3 are not as expected. In row 2, x is 2 and groups is [1,2], so I expect x_lower to be 1 and x_upper to be 2. In row 3, x is 3 and groups is (2,3], so I expect x_lower to be 2 and x_upper to be 3. If you play around with the data you will see that findInterval() returns the borders of the next interval whenever the x value is the same as the upper border of its group. I want to avoid that. How can we achieve this?

Expected output

structure(list(x = 1:3, groups = structure(c(1L, 1L, 2L), .Label = c("[1,2]", "(2,3]"), class = "factor"), x_lower = c(1, 1, 2), x_upper = c(2, 2, 3), interval = c(1, 1, 2)), class = "data.frame", row.names = c(NA, -3L))

Remark: I do want to use findInterval(), and I cannot use labels[as.numeric(groups)] as suggested in another post under the question above. This is because in my situation x is sometimes a numeric vector and sometimes a Date/POSIXct/ts/... vector, so using as.numeric() is not safe for me.
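
A minimal sketch of one possible direction, assuming the left.open and rightmost.closed arguments of findInterval() can be combined to mirror what cut() does with its default right-closed intervals plus include.lowest = TRUE. It is only checked here against the toy data above, and the name `result` is introduced just for illustration:

```r
x <- 1:3
breaks <- c(min(x), 2, max(x))

# left.open = TRUE makes the intervals (lower, upper], like cut()'s default.
# With left.open = TRUE, rightmost.closed = TRUE closes the leftmost interval,
# which corresponds to include.lowest = TRUE in cut().
interval <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = TRUE)

result <- data.frame(x,
                     groups  = cut(x, breaks, include.lowest = TRUE),
                     x_lower = breaks[interval],
                     x_upper = breaks[interval + 1],
                     interval)
print(result)
```

Since the borders are taken by indexing breaks directly, they keep whatever class breaks has, and no as.numeric() call on groups is involved. Whether this holds up for every Date/POSIXct/ts input is not verified here.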

LulY
