
Suppose we have the following data frame:

ID  Shoot  hit
1     10    2
1      9    3
1      8    1
2     10    8
2      8    8
2     11   10
2      7    2
3      9    2
4      6    6
4      6    5
.
.

I would like to add a column that numbers the rows within each group (in this case, within each ID), like this:

ID Shoot hit number.in.group
1   10     2    1
1    9     3    2
1    8     1    3
2   10     8    1
2    8     8    2 
2   11    10    3
2    7     2    4
3    9     2    1
4    6     6    1
4    6     5    2
    .
    .

I could do it easily using a loop. Something like this would work:

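# Start every row at 1, then count up while the ID repeats (assumes df is sorted by ID)
df$number.in.group <- rep(1, nrow(df))

for (i in seq_len(nrow(df))[-1]) {
  if (df$ID[i] == df$ID[i - 1]) {
    df$number.in.group[i] <- df$number.in.group[i - 1] + 1
  }
}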

My question is: is there a function, or a more elegant way, to do this without a loop?

aatrujillob
  • We don't generally worry about the dates when marking questions as duplicates. There are more, higher-quality answers on the other question. – zwol Feb 05 '18 at 16:28

8 Answers


If you want a one-liner, something like this works when the rows are already sorted by ID:

df$number.in.group = unlist(lapply(table(df$ID),seq.int))
Simon Urbanek

You could just use rle and sequence:

dat <- read.table(text = "ID  Shoot  hit
1     10    2
1      9    3
1      8    1
2     10    8
2      8    8
2     11   10
2      7    2
3      9    2
4      6    6
4      6    5", sep = "", header = TRUE)

> sequence(rle(dat$ID)$lengths)
 [1] 1 2 3 1 2 3 4 1 1 2

Indeed, I think sequence is intended for exactly this purpose.
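
To see why this works, it may help to look at the two steps separately. The sketch below uses a small hypothetical vector `ids` (made up for illustration, not taken from the question): `rle` reports the length of each consecutive run of equal values, and `sequence` expands each length n into 1, 2, ..., n. Because `rle` only sees consecutive runs, this assumes that rows with the same ID are adjacent.

```r
# Hypothetical ID vector, made up for illustration; equal IDs are adjacent
ids <- c(1, 1, 1, 2, 2, 3)

# Step 1: length of each consecutive run of equal values
run_lengths <- rle(ids)$lengths              # 3 2 1

# Step 2: expand each run length n into 1, 2, ..., n and concatenate
number_in_group <- sequence(run_lengths)     # 1 2 3 1 2 1
```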

joran
> dat$number.in.group <- ave(dat$ID,dat$ID, FUN=seq_along)
> dat
   ID Shoot hit number.in.group
1   1    10   2               1
2   1     9   3               2
3   1     8   1               3
4   2    10   8               1
5   2     8   8               2
6   2    11  10               3
7   2     7   2               4
8   3     9   2               1
9   4     6   6               1
10  4     6   5               2
IRTFM

Using dplyr

dat <- data.frame(ID = rep(1:3, c(2, 3, 5)), val = rnorm(10))

library(dplyr)
dat %>% group_by(ID) %>%
    mutate(number.in.group = row_number())
Gregor Thomas

There are probably better ways, but one could use tapply on the IDs and pass in a function that returns a sequence.

# Example data
dat <- data.frame(ID = rep(1:3, c(2, 3, 5)), val = rnorm(10))

# Using tapply with a function that returns a sequence
dat$number.in.group <- unlist(tapply(dat$ID, dat$ID, seq_along))
dat

which results in

> dat
   ID          val number.in.group
1   1 -0.454652118               1
2   1 -2.391824247               2
3   2  0.530832021               1
4   2 -1.671043812               2
5   2 -0.045261549               3
6   3  2.311162484               1
7   3 -0.525635803               2
8   3  0.008588811               3
9   3  0.078942033               4
10  3  0.324156111               5
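
As a minimal sketch of what happens before `unlist`, here is the intermediate result on a small hypothetical vector `ids` (made up for illustration). `tapply` returns one sequence per distinct ID, ordered by sorted ID value, so flattening it lines up with the rows only when the data is sorted by ID. When it is not, base R's `split`/`unsplit` pair is one way to put the values back in row order.

```r
# Hypothetical IDs, deliberately not sorted
ids <- c(2, 1, 2, 1, 2)

# One sequence per distinct ID, in sorted ID order ("1" first, then "2")
per_group <- tapply(ids, ids, seq_along)

# Flattening follows group order, not row order
flat <- unlist(per_group, use.names = FALSE)                        # 1 2 1 2 3

# unsplit() places each group's sequence back at that group's row positions
in_row_order <- unsplit(lapply(split(ids, ids), seq_along), ids)    # 1 1 2 2 3
```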
Dason
df$number.in.group <- unlist(lapply(rle(df$ID)$lengths, seq_len))
Tyler Rinker

Here's another solution, using plyr:

library(plyr)
ddply(dat, .(ID), transform, num_in_grp = seq_along(hit))
Ramnath

I compared your answers, and IShouldBuyABoat's (the one using ave) is the most promising. I found that ave can be applied even when the dataset is not sorted by the grouping variable.

Consider this dataset:

dane<-data.frame(g1=c(-1,-2,-2,-2,-3,-3,-3,-3,-3),
             g2=c('reg','pl','reg','woj','woj','reg','woj','woj','woj'))

joran's answer, applied to my example:

> sequence(rle(as.character(dane$g2))$lengths)
[1] 1 1 1 1 2 1 1 2 3

Simon Urbanek's suggestion and its result:

> unlist(lapply(table(dane$g2),seq.int))
  pl reg1 reg2 reg3 woj1 woj2 woj3 woj4 woj5 
   1    1    2    3    1    2    3    4    5 

IShouldBuyABoat's code gives the correct answer:

> as.numeric(ave(as.character(dane$g1),as.character(dane$g1),FUN=seq_along))
[1] 1 1 2 3 1 2 3 4 5
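
Note that the call above groups by g1, whose equal values are already adjacent, while the two earlier comparisons used g2. As a sketch, assuming the intent was to compare all three approaches on the same unsorted column, here is the same ave call applied to g2:

```r
# Same data as in this answer
dane <- data.frame(g1 = c(-1, -2, -2, -2, -3, -3, -3, -3, -3),
                   g2 = c('reg', 'pl', 'reg', 'woj', 'woj', 'reg', 'woj', 'woj', 'woj'))

# ave() numbers the rows within each g2 group and returns the
# results in the original row order, so no sorting is needed
g2 <- as.character(dane$g2)
num_in_g2 <- as.numeric(ave(g2, g2, FUN = seq_along))
num_in_g2
# [1] 1 1 2 1 2 3 3 4 5
```

Each value counts how many times that row's g2 label has appeared so far, which is the per-group numbering the question asks for.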
Maciej