I was attempting to number rows of data by group, something I have done many times. However, this time I obtained an error. Eventually I realized the error apparently was because the number of rows in each group was the same. Why does that cause an error?

This code works:

my.data <- read.table(text = '
     refno  cov1   cov2
     1111      a      1
     2222      b     -2
     3333      c      3
     4444      d      4
     5555      a      1
     6666      b      2
     7777      c      3
', header = TRUE, stringsAsFactors = FALSE)

# duplicate rows
n.times <- c(2,3,2,2,2,2,2)

my.data2 <- my.data[rep(seq_len(nrow(my.data)), n.times),]

# number rows by refno
my.seq <- data.frame(rle(my.data2$refno)$lengths)

my.data2$first <- unlist(apply(my.seq, 1, function(x) seq(1,x)))
my.data2$last  <- unlist(apply(my.seq, 1, function(x) seq(x,1,-1)))
my.data2

However, this code fails with an error. The only difference is that I changed the 3 in n.times <- c(2,3,2,2,2,2,2) to 2, so every group now has the same number of rows.

my.data <- read.table(text = '
     refno  cov1   cov2
     1111      a      1
     2222      b     -2
     3333      c      3
     4444      d      4
     5555      a      1
     6666      b      2
     7777      c      3
', header = TRUE, stringsAsFactors = FALSE)

# duplicate rows
n.times <- c(2,2,2,2,2,2,2)

my.data2 <- my.data[rep(seq_len(nrow(my.data)), n.times),]

# number rows by refno
my.seq <- data.frame(rle(my.data2$refno)$lengths)

my.data2$first <- unlist(apply(my.seq, 1, function(x) seq(1,x)))
my.data2$last  <- unlist(apply(my.seq, 1, function(x) seq(x,1,-1)))
my.data2

I guess I can use the approach from "Sequentially numbering many rows blocks of unequal length" to number the rows in the second case.

However, I am still trying to figure out how to number the rows in reverse order in the second case.

Mark Miller

1 Answer

When n.times is all 2s, apply returns a matrix rather than a list. This is because apply simplifies its result to an array whenever every call to FUN returns a vector of the same length n. Your code worked in the past because the calls returned vectors of varying lengths, and in that case apply returns a list. unlist leaves the 2 x 7 matrix unchanged, and a 2-row matrix cannot be assigned as a column of the 14-row data frame.

See the Value section of the documentation for apply: https://stat.ethz.ch/R-manual/R-patched/library/base/html/apply.html
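To see the difference directly, here is a minimal sketch. The two small data frames (equal.len, unequal.len) are made up for illustration and stand in for the run lengths from the question; they are not part of the original code.

```r
# Hypothetical one-column data frames of run lengths (illustration only)
equal.len   <- data.frame(len = c(2, 2, 2))
unequal.len <- data.frame(len = c(2, 3, 2))

res.equal   <- apply(equal.len,   1, function(x) seq(1, x))
res.unequal <- apply(unequal.len, 1, function(x) seq(1, x))

# Equal lengths: apply simplifies to a matrix, one column per input row
is.matrix(res.equal)
dim(res.equal)

# Unequal lengths: apply cannot simplify, so it returns a list
is.list(res.unequal)

# unlist does not flatten a matrix; the dim attribute survives
is.matrix(unlist(res.equal))
```

If you wanted to keep apply, wrapping the result in as.vector would drop the dimensions in the equal-length case, but that would not handle the list case, which is why switching to lapply is the simpler fix.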

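You can fix this by calling lapply on the lengths returned by rle. lapply always returns a list, so unlist gives a plain vector whether or not the groups are the same size.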

my.seq <- rle(my.data2$refno)$lengths
my.data2$first <- unlist(lapply(my.seq, function(x) seq(1,x)))
my.data2$last  <- unlist(lapply(my.seq, function(x) seq(x,1,-1)))
my.data2

    refno cov1 cov2 first last
1    1111    a    1     1    2
1.1  1111    a    1     2    1
2    2222    b   -2     1    2
2.1  2222    b   -2     2    1
3    3333    c    3     1    2
3.1  3333    c    3     2    1
4    4444    d    4     1    2
4.1  4444    d    4     2    1
5    5555    a    1     1    2
5.1  5555    a    1     2    1
6    6666    b    2     1    2
6.1  6666    b    2     2    1
7    7777    c    3     1    2
7.1  7777    c    3     2    1
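As a possible alternative, base R's sequence function builds the forward numbering without any apply-style call, and the reverse numbering can be derived from it. This is only a sketch: run.len is a hypothetical stand-in for rle(my.data2$refno)$lengths, and I chose unequal lengths here to show it works in both situations.

```r
# Hypothetical run lengths, standing in for rle(my.data2$refno)$lengths
run.len <- c(2, 3, 2)

# Forward numbering within each run: seq_len() of each length, concatenated
first <- sequence(run.len)

# Reverse numbering: (length of the row's run) - (forward position) + 1
last <- rep(run.len, run.len) - first + 1

first  # 1 2 1 2 3 1 2
last   # 2 1 3 2 1 2 1
```

With the data from the question you would assign these to my.data2$first and my.data2$last in the same way as above.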
pcantalupo