Insert a row before each group of data using data.table

Question

This maybe a silly question. But I want to insert a role before each group of data. All I can find online is how to insert a row after each group. Also, the inserted row will have a sequence number represent each ID. For example, i have a data table like

df 
ID     TIME   VAR VALUE
101     07/02   V1     9
101     07/03   V2     NA
101     07/03   V3     10
221     06/01   V1     2
221     07/03   V2     4

I want something like:

ID     TIME   VAR     VALUE
101       NA   sequence   1
101     07/02   V1        9
101     07/03   V2        NA
101     07/03   V3        10
221       NA   sequence   2
221     06/01   V1        2
221     07/03   V2        4

The real table have around 14 millions rows with 14,000 unique ID. Is there a quick way to update the table? Thank you!

`df[, rbind(.(write_your_tuple_here), .SD), by=ID]` ? There is no way to update a table to add/remove rows without creating a new table yet. https://stackoverflow.com/q/10790204/ In your case, it looks like you have used `melt` on a wide table.. if so, it'd probably be easier to add the extra column in that format before melting. — Frank, Aug 03 '17 at 15:07
Hi Frank, the first option is not working, or I may put the wrong thing inside (write_your_tuple_here). For the melt and extra column, I am not sure how it can be down. Do you might explain it a little bit more? Thank you! — VeraShao, Aug 03 '17 at 15:17
Hm, not sure. Is it the case that every ID has one row for V1, one for V2, one for V3? I meant `.(TIME = NA_character_, VAR = "sequence", VALUE = .GRP)` as your tuple, or whatever the rule is for the inserted row. (.GRP is sequence number, see `?.GRP`.) — Frank, Aug 03 '17 at 15:20
No, it is not that case that every ID will only have one V1,V2,and V3. There may be multiple of them, as well as V4,V5... I am trying to modify the tuple, no success yet. But it is nice to know the .GRP function. — VeraShao, Aug 03 '17 at 15:30
Ok. Fwiw, here's an example where the `rbind` approach works: `DT = data.table(ID= 1:2, v = 3:4, v2 = 5:6); DT[, rbind(.(v = 0, v2 = .GRP), .SD), by=ID]` — Frank, Aug 03 '17 at 15:33
@Frank, It works now. I modify it a little bit. The .GRP is a amazing function. Thank you so much! — VeraShao, Aug 03 '17 at 15:41
Np, glad it worked out. Feel free to post an answer with whatever worked for you. — Frank, Aug 03 '17 at 15:44
@Frank. Another small question. If I want to change the TIME from NA to the earliest time minus one day. How can I do it? I am trying using TIME= df$TIME[1] -1. But It seems not working. — VeraShao, Aug 03 '17 at 16:31
You shouldn't need to use `df$` inside a data.table. In this case, I guess, `first(TIME) - 1`. Anyways, you'll want to read the vignettes soon to get a handle on syntax like this, I guess. — Frank, Aug 03 '17 at 16:37

Clayton Stanley · Answer 1 · 2017-08-04T23:14:37.900

Might be faster as it does the rbind in bulk:

> tbl[, .SD
      ][, .N, ID
      ][, VALUE := .I
      ][, VAR := 'sequence'
      ][, N := NULL
      ][, rbind(.SD, tbl, fill=T)
      ][order(ID, VAR != 'sequence', TIME)
      ][, .(ID, TIME, VAR, VALUE)
      ]

    ID  TIME      VAR VALUE
1: 101    NA sequence     1
2: 101 07/02       V1     9
3: 101 07/03       V2    NA
4: 101 07/03       V3    10
5: 221    NA sequence     2
6: 221 06/01       V1     2
7: 221 07/03       V2     4
>

Insert a row before each group of data using data.table

1 Answers1