I am trying to get the top frequency words in a data.table
data.table : dtable4G
key freq value
================================
thanks for the 612 support
thanks for the 380 drink
thanks for the 215 payment
thanks for the 27 encouragement
have a great 154 day
have a great 132 weekend
have a great 54 week
have a great 42 time
have a great 19 night
at the same 346 time
at the same 57 damn
at the same 30 pace
at the same 11 speed
at the same 7 level
at the same 1 rate
I tried the code
dtable4G[ , max(freq), by = key]
and
dtable4G[ , .I[which.max(freq)] , by = key]
Both the above commands, I am getting the same result:
key V1
====================
thanks for the 612
have a great 154
at the same 346
I want the result to be:
key freq value
================================
thanks for the 612 support
have a great 154 day
at the same 346 time
Any ideas what I am doing wrong?
EDITED
dtable4G[dtable4G[, .I[which.max(freq)], by = key]$V1]
worked for me. Though it took some time to run through my 5.4 mil rows.
But this was way faster than using
dtable4G[,.SD[which.max(freq)],by=key]
Reference: With data.table, is SD[which.max(Var1)] the fastest way to find the max of a group?