With a big data.table, we need to take a quick glance of it from time to time, especially when looking for some patterns. But it's difficult to see through at the first sight with a long table ahead.
I would like to ask, can we just get first few rows by each group? Something like DT[1:5, .(X,Y), by=.(Z)]
?
Here is the ref.code.
library(data.table)
DT <- data.table(X=rep(letters[1:3], 50), Y=rep(LETTERS[1:3],each=50))[,Z:=paste0(X,"-",Y)]
setkey(DT,Z)
DT[,.(X,Y), by=.(Z)]
# Z X Y
# 1: a-A a A
# 2: a-A a A
# 3: a-A a A
# 4: a-A a A
# 5: a-A a A
# ---
# 146: c-C c C
# 147: c-C c C
# 148: c-C c C
# 149: c-C c C
# 150: c-C c C
and I would like to make it this way:
# Z X Y
# 1: a-A a A
# 2: a-A a A
# 3: a-A a A
# 4: a-A a A
# 5: a-A a A
# 6: b-B a B
# 7: b-B a B
# 8: b-B a B
# 9: b-B a B
# 10: b-B a B
# 11: c-C c C
# 12: c-C c C
# 13: c-C c C
# 14: c-C c C
# 15: c-C c C
just pick out the first few (say,5) rows, so the details won't be folded up. Please advise, Thanks
--------
To express my gratitude to the community, here is my conclusion based on @Frank and many other friends' feedback.
Per the other same topic post, people suggest to generate an index, then produce target table. here are the steps:
- make index by
.I[]
which I'd never heard of.
DT[,.I[1:2], by=Z]
> DT[,.I[1:2], by=Z]
Z V1
1: a-A 1
2: a-A 2
3: a-B 18
4: a-B 19
5: a-C 35
6: a-C 36
7: b-A 51
8: b-A 52
9: b-B 68
10: b-B 69
11: b-C 84
12: b-C 85
13: c-A 101
14: c-A 102
15: c-B 117
16: c-B 118
17: c-C 134
18: c-C 135
The V1 is the index! So, I can grab and use it.
- Insert the index in to the original DT, and BINGO
That's cool
> DT[DT[,.I[1:2], by=Z]$V1]
X Y Z
1: a A a-A
2: a A a-A
3: a B a-B
4: a B a-B
5: a C a-C
6: a C a-C
7: b A b-A
8: b A b-A
9: b B b-B
10: b B b-B
11: b C b-C
12: b C b-C
13: c A c-A
14: c A c-A
15: c B c-B
16: c B c-B
17: c C c-C
18: c C c-C
Thanks, Guys