Is it possible to extract extra columns from a data.table while grouping it and selecting the minimum value?

DT <- data.table(events)
firstOrders <- as.data.frame(DT[,min(property_time),by=property_.uid])

In this example, the order IDs (a column in the events data frame) should be extracted as well: for each user ID group, I want the order number of the row where the time is minimal.

Matthias Adriaens

1 Answer

I guess we want the row that has the minimum value of 'property_time' within each 'property_.uid' group. In that case, we can use which.min to get the numeric index of that row and use it to subset the group's data (.SD), which keeps all the other columns.

 DT[,.SD[which.min(property_time)],by=property_.uid]

A faster option would be to get the row index with .I and then subset the dataset once.

 i1 <- DT[,.I[which.min(property_time)],by=property_.uid]$V1
 DT1 <- DT[i1]
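
As a rough check that the two approaches above return the same rows, here is a minimal sketch on made-up data. The column names 'property_.uid' and 'property_time' follow the question; the 'orderid' column and all values are hypothetical and only there for illustration.

```r
library(data.table)

# Hypothetical toy data: values are invented, 'orderid' is an assumed column name
DT <- data.table(
  property_.uid = c(1, 1, 2, 2, 3),
  property_time = c(5, 2, 9, 4, 7),
  orderid       = c("a", "b", "c", "d", "e")
)

# Approach 1: subset .SD inside each group
res_sd <- DT[, .SD[which.min(property_time)], by = property_.uid]

# Approach 2: collect the global row indices with .I, then subset once
i1 <- DT[, .I[which.min(property_time)], by = property_.uid]$V1
res_i <- DT[i1]

# Align column order before comparing (the grouping column comes first in res_sd)
setcolorder(res_i, names(res_sd))
stopifnot(isTRUE(all.equal(res_sd, res_i)))

res_i$orderid
# "b" "d" "e"
```

Note that which.min returns only the first minimum, so if several rows in a group tie for the smallest 'property_time', only the first of them is kept.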

data

library(data.table)

# reproducible example data: 3 groups of 3 rows each
set.seed(25)
DT <- data.table(property_.uid=rep(1:3, each=3), 
   property_time=sample(1:15, 9, replace=TRUE), OtherCol=rnorm(9))
akrun