
I have no idea what the correct title is here that would make it easy for others to find this later on...

I am reporting on the quality of data that is submitted to me. For example, I report on missing data where this is not allowed, and I refer to the actual rows that have missing data. Sometimes the list of rows with missing data is so long that my QA report is pages long. I would like to know how to shorten numeric vectors by collapsing runs of consecutive numbers into ranges.

How do I reduce:

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 55, 56, 57)


paste0("Missing data: rows ", paste0(x, collapse = ", "))
[1] "Missing data: rows 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 55, 56, 57"

to:

"Missing data: rows 1:10, 20:30, 55:57"

Luc

1 Answer


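One option is to split the vector using a grouping variable. Take the difference between adjacent elements, convert it to a logical vector (TRUE wherever the difference is not 1, i.e. where a new run starts), and take the cumsum of that logical vector so that each run of consecutive numbers gets its own group id. Then loop over the resulting list with sapply, paste the range (the min and max values) of each group, and finally paste the result together with the prefix string.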

paste0("Missing data: rows ",
       toString(sapply(split(x, cumsum(c(TRUE, diff(x) != 1))),
                       function(x) paste0(min(x), ":", max(x)))))
#[1] "Missing data: rows 1:10, 20:30, 55:57"
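
To make the one-liner easier to follow, here is a rough step-by-step sketch of the same idea. The names `breaks`, `grp`, `runs`, `labels` and the helper `collapse_runs()` are made up for illustration and are not part of the code above. The helper also assumes you would want a run of length one printed as a single number (e.g. "9" rather than "9:9") and the input sorted and de-duplicated first, which may or may not suit your report.

```r
# Same example vector as in the question
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 55, 56, 57)

# Step 1: TRUE marks the start of a new run (first element, or a gap other than 1)
breaks <- c(TRUE, diff(x) != 1)

# Step 2: the running count of run starts gives each run its own group id
grp <- cumsum(breaks)

# Step 3: split into runs and label each run as "min:max"
runs <- split(x, grp)
labels <- vapply(runs, function(r) paste0(min(r), ":", max(r)), character(1))

# Hypothetical helper (an assumption, not from the answer): wraps the steps
# above, sorts and de-duplicates the input, and prints single-element runs
# as a bare number instead of "n:n"
collapse_runs <- function(v) {
  v <- sort(unique(v))
  if (length(v) == 0) return("")
  grp <- cumsum(c(TRUE, diff(v) != 1))
  labels <- vapply(split(v, grp), function(r) {
    if (length(r) == 1) as.character(r) else paste0(min(r), ":", max(r))
  }, character(1))
  toString(labels)
}

paste0("Missing data: rows ", collapse_runs(x))
paste0("Missing data: rows ", collapse_runs(c(12, 3, 4, 5, 9)))
```

Using vapply instead of sapply here is only a style choice: it guarantees a character vector is returned for every group.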
akrun
    Perfect! I see the question got closed because it was already answered somewhere else. I guess that confirms I didn't give it a good title... I didn't know what words to search for! Others might have the same issue though. – Luc Jun 10 '20 at 23:07