Subset all rows before negative value in group

Question

I have a data.table:

X = data.table(x = c(1,1,1,1,1,2,2,2,2,2), y = c(3,2,1,-1,5,7,4,-2,3,5))

Within each group, I want to keep only the rows that come before the first negative value:

res = data.table(x = c(1,1,1,2,2), y = c(3,2,1,7,4))

Of the five values in the first group, I want only the first three, because the fourth is negative. The same applies to the second group.

You mean: X[y>=0] ? Your example seems to have mistake in it — Buggy, Feb 23 '16 at 10:35
@user3293236 I want to subset rows before negative value. I can have non-negative values after it, but I don't want to subset them. Look at my result and compare with your code. The result is not the same. — Vitaliy Radchenko, Feb 23 '16 at 10:38
Standard reference here: http://stackoverflow.com/q/16573995/1191259 — Frank, Feb 23 '16 at 18:21

talat · Accepted Answer · score 6 · 2016-02-23T10:50:59.830

Here are two options:

X[, .SD[seq_len(which.max(y<0)-1L)], by = x]

Or (perhaps more efficient because it avoids .SD):

X[ X[, .I[seq_len(which.max(y<0)-1L)], by = x]$V1 ]
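Both options rely on which.max(y < 0) returning the position of the first negative value in each group, so seq_len(... - 1L) selects the rows before it. One caveat: if a group contains no negative value, which.max() returns 1 and that group is dropped entirely. The sketch below is not part of the original answer. It adds a hypothetical third group with no negatives and uses cumsum(y < 0) == 0 as an alternative filter that keeps such a group whole.

```r
library(data.table)

# Data from the question, plus a hypothetical group x = 3 with no negative
# values (added here only to illustrate the edge case).
X <- data.table(x = c(1,1,1,1,1,2,2,2,2,2,3,3),
                y = c(3,2,1,-1,5,7,4,-2,3,5,6,8))

# Position of the first negative value per group. For a group without
# negatives, which.max() on an all-FALSE vector returns 1.
first_neg <- X[, .(pos = which.max(y < 0)), by = x]

# cumsum(y < 0) stays at 0 until the first negative value appears, so this
# keeps the leading rows of every group and all rows of group 3.
res <- X[X[, .I[cumsum(y < 0) == 0], by = x]$V1]
```

If every group is known to contain a negative value, as in the question's data, the two approaches should select the same rows.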

edited Feb 23 '16 at 10:50

answered Feb 23 '16 at 10:44

talat

+1. Really have to get to issue [#613](https://github.com/Rdatatable/data.table/issues/613): Optimize `.SD[i]` query to keep the elegance but make it faster unchanged – Matt Dowle Feb 23 '16 at 18:57

score 1 · Answer 2 · answered Feb 23 '16 at 18:16

We may also do

X[, .SD[cummin(sign(y))>0], x]
#   x y
#1: 1 3
#2: 1 2
#3: 1 1
#4: 2 7
#5: 2 4
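To see why this works: sign(y) maps each value to -1, 0 or 1, and cummin() carries the smallest value seen so far forward, so the running minimum drops to -1 at the first negative value and stays there. Note that sign(0) is 0, so a zero would also end the selection under the > 0 test. The sketch below is not from the original answer. It shows the intermediate columns on the question's data and, assuming zeros should be kept, a variant that filters on cummin(y) >= 0 instead.

```r
library(data.table)

X <- data.table(x = c(1,1,1,1,1,2,2,2,2,2),
                y = c(3,2,1,-1,5,7,4,-2,3,5))

# Intermediate values per group: the sign of each y and its running minimum.
steps <- X[, .(y, s = sign(y), running_min = cummin(sign(y))), by = x]

# Variant that keeps rows until the first strictly negative value, so a zero
# would not cut the group short (an assumption about the desired behaviour).
res <- X[, .SD[cummin(y) >= 0], by = x]
```

On the question's data, which contains no zeros, the variant selects the same five rows as the output shown above.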

akrun
