How do you efficiently return the order of an increasing index?

Question

I have the following index vector:

TestVec = rep(c(6,8,9,11,18), each = 10)

This reads c(6, 6, ..., 6, 8, 8, ..., 8, 9, 9, ..., 9, ...).

I would like to convert this vector into c(1, 1, ..., 1, 2, 2, ..., 2, 3, 3, ..., 3, ...)

Try

I have improvised a quick-and-dirty method, as follows:

sapply(TestVec, function(x) {which(x == unique(TestVec))})

This works, but it is slow on a large dataset because unique(TestVec) is recomputed for every element.

Is there a more efficient way to do this?

`cumsum(!duplicated(TestVec))` – Ronak Shah Nov 29 '18 at 12:27

score 1 · Accepted Answer · answered Nov 29 '18 at 12:21

match(TestVec, unique(TestVec))
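
A minimal sketch of why this tends to be faster, using the question's vector: `unique()` is computed once and `match()` looks up all elements in a single vectorized call, whereas the `sapply()` version calls `unique(TestVec)` again for every element. The names `u`, `idx`, and `slow` are illustrative only.

```r
# Example vector from the question
TestVec <- rep(c(6, 8, 9, 11, 18), each = 10)

# Distinct values in order of first appearance, computed once
u <- unique(TestVec)

# Position of each element within u, i.e. its group index
idx <- match(TestVec, u)

# The question's element-by-element approach, for comparison
slow <- sapply(TestVec, function(x) which(x == u))

# Both approaches should agree
print(identical(idx, slow))
```

Because `match()` numbers values by first appearance, this should also work when equal values are not stored next to each other.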

Zheyuan Li

score 1 · Answer 2 · answered Nov 29 '18 at 12:23

Another option:

as.numeric(as.factor(TestVec))
# [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5
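
One caveat, sketched with a hypothetical unsorted vector `x`: `as.factor()` sorts its levels, so the codes follow the sorted order of the values rather than the order of first appearance. For the increasing `TestVec` in the question the two orders coincide.

```r
# Hypothetical unsorted input, not from the question
x <- c(9, 6, 9, 18)

# Factor levels are sorted (6, 9, 18), so 9 maps to 2 and 6 maps to 1
by_sorted_value <- as.integer(as.factor(x))

# match() numbers values in order of first appearance instead
by_first_seen <- match(x, unique(x))

print(by_sorted_value)  # 2 1 2 3
print(by_first_seen)    # 1 2 1 3
```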

Julius Vainora

score 1 · Answer 3 · answered Nov 29 '18 at 12:24

Using the data.table package:

rleid(TestVec)
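
Note that `rleid()` numbers consecutive runs rather than distinct values. If data.table is not available, a rough base R equivalent can be sketched with `rle()`; the helper name `run_id` is hypothetical and not part of any package.

```r
# Hypothetical helper: number each run of consecutive equal values
run_id <- function(x) {
  r <- rle(x)                                  # run lengths and values
  rep(seq_along(r$lengths), times = r$lengths) # repeat each run number over its run
}

# Hypothetical input where 6 appears in two separate runs
x <- c(6, 6, 8, 6)

print(run_id(x))            # 1 1 2 3
print(match(x, unique(x)))  # 1 1 2 1
```

For a vector like `TestVec`, where each value forms exactly one run, both give the same result.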

tmfmnk

score 1 · Answer 4 · answered Nov 29 '18 at 12:28

Here is another one:

c(0, cumsum(diff(TestVec) != 0)) + 1
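
The idea can be sketched step by step: `diff()` flags the positions where the value changes, `cumsum()` counts those changes, and a leading 0 stands in for the first element, which has no predecessor. The names `changed` and `idx` are illustrative.

```r
# Example vector from the question
TestVec <- rep(c(6, 8, 9, 11, 18), each = 10)

# TRUE wherever an element differs from the one before it
changed <- diff(TestVec) != 0

# Count the changes so far; the first element starts at 0, then shift to 1-based
idx <- c(0, cumsum(changed)) + 1

# Should agree with the match() approach for this input
print(all(idx == match(TestVec, unique(TestVec))))
```

Like a run-length approach, this counts runs, so it assumes equal values are adjacent.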

Sotos
