library(data.table)
dataHAVE=data.frame("student"=c(1,2,3),
                    "score"=c(10,11,12),
                    "count"=c(4,1,2))


dataWANT=data.frame("student"=c(1,1,1,1,2,3,3),
                    "score"=c(10,10,10,10,11,12,12),
                    "count"=c(4,4,4,4,1,2,2))

setDT(dataHAVE)dataHAVE[rep(1:.N,count)][,Indx:=1:.N,by=student]

I have the data 'dataHAVE' and want to produce 'dataWANT', which repeats each 'student' row 'count' times. I tried to do this in data.table as shown above, since that is the kind of solution I am after, but I get this error:

Error: unexpected symbol in "setDT(dat)dat"

I cannot resolve it. Thank you so much.

chinsoon12
bvowe

2 Answers

1

Try:

setDT(dataHAVE)[rep(1:.N,count)]

Output:

   student score count
1:       1    10     4
2:       1    10     4
3:       1    10     4
4:       1    10     4
5:       2    11     1
6:       3    12     2
7:       3    12     2

You could also replace 1:.N with .I and do setDT(dataHAVE)[dataHAVE[, rep(.I, count)]].
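For reference, the "unexpected symbol" error in the question comes from two expressions sharing one line with nothing between them: R parses `setDT(dataHAVE)` and then hits `dataHAVE` without a newline or `;`. A minimal sketch of the original attempt with that fixed, keeping the per-student `Indx` column from the question (`seq_len(.N)` is swapped in for `1:.N` here as a stylistic choice, not something the question requires):

```r
library(data.table)

dataHAVE <- data.frame(student = c(1, 2, 3),
                       score   = c(10, 11, 12),
                       count   = c(4, 1, 2))

# Convert to data.table by reference; this must be its own statement
setDT(dataHAVE)

# Repeat each row `count` times, then number the copies within each student
dataWANT <- dataHAVE[rep(seq_len(.N), count)][, Indx := seq_len(.N), by = student]
```

Putting `setDT()` on its own line (or separating the two statements with `;`) is enough to make the original chain parse.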

Just FYI, there's also a nice function in tidyr that does a similar thing:

tidyr::uncount(dataHAVE, count, .remove = FALSE)
arg0naut91
0

Here is a base R solution:

dataWANT<-do.call(rbind,
                  c(with(dataHAVE,rep(split(dataHAVE,student),count)),
                    make.row.names = FALSE))

such that

> dataWANT
  student score count
1       1    10     4
2       1    10     4
3       1    10     4
4       1    10     4
5       2    11     1
6       3    12     2
7       3    12     2
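As a possible alternative sketch in base R, the same result can be reached by indexing rows with a repeated index vector instead of splitting and re-binding. This assumes `dataHAVE` is still a plain data.frame (not yet converted with `setDT()`); the helper name `idx` is only for illustration:

```r
dataHAVE <- data.frame(student = c(1, 2, 3),
                       score   = c(10, 11, 12),
                       count   = c(4, 1, 2))

# Row i appears dataHAVE$count[i] times in the index vector
idx <- rep(seq_len(nrow(dataHAVE)), times = dataHAVE$count)

# Subset by the repeated index, then reset the row names to 1..n
dataWANT <- dataHAVE[idx, ]
rownames(dataWANT) <- NULL
```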
ThomasIsCoding