1

The data set is mileage data from vehicles (it's in a data.table). There are many VINs (vehicle ID #'s) and many records per VIN. I am trying to flag the record where where the mileage (column named "mi") surpasses 400,000 for each vin. The end goal is to count how many vehicles pass the 400k mi threshold in each month.

The data table is named "d". I attempted to create the flag with the following code (this is similar to the code I use to flag the first and last records for each VIN except I use unique(d$vin) in place of mi>=400000):

d[,flag.400k:=0L]
d[mi>=400000L,flag.400k:=1L,mult="first",by=vin]

Here is the result for one vehicle. It flags every record where mi>=400000 instead of just the first record where mi>=400000

                 vin  msg.month       mi flag.400k
 1: 1M1AQ08Y2DM000000 2016-02-01 392278.1         0
 2: 1M1AQ08Y2DM000000 2016-02-01 394218.0         0
 3: 1M1AQ08Y2DM000000 2016-02-01 396853.0         0
 4: 1M1AQ08Y2DM000000 2016-02-01 399803.6         0
 5: 1M1AQ08Y2DM000000 2016-03-01 400120.6         1
 6: 1M1AQ08Y2DM000000 2016-03-01 402425.7         1
 7: 1M1AQ08Y2DM000000 2016-03-01 405176.6         1
 8: 1M1AQ08Y2DM000000 2016-03-01 407490.1         1
 9: 1M1AQ08Y2DM000000 2016-03-01 410755.9         1
10: 1M1AQ08Y2DM000000 2016-04-01 411009.2         1

This is what I would like to happen:

                 vin  msg.month       mi flag.400k
 1: 1M1AQ08Y2DM000000 2016-02-01 392278.1         0
 2: 1M1AQ08Y2DM000000 2016-02-01 394218.0         0
 3: 1M1AQ08Y2DM000000 2016-02-01 396853.0         0
 4: 1M1AQ08Y2DM000000 2016-02-01 399803.6         0
 5: 1M1AQ08Y2DM000000 2016-03-01 400120.6         1
 6: 1M1AQ08Y2DM000000 2016-03-01 402425.7         0
 7: 1M1AQ08Y2DM000000 2016-03-01 405176.6         0
 8: 1M1AQ08Y2DM000000 2016-03-01 407490.1         0
 9: 1M1AQ08Y2DM000000 2016-03-01 410755.9         0
10: 1M1AQ08Y2DM000000 2016-04-01 411009.2         0

I feel like there is simple that I am missing but I have tried quite a few different approaches and can't achieve the desired result. Please help.

Uwe
  • 41,420
  • 11
  • 90
  • 134
cj.burrow
  • 119
  • 1
  • 7
  • 1
    `mult` doesn't do anything unless you're doing a join, I think. I'd try: `w = d[mi>=400000L, .I[1L], by=vin]; d[w, flag := 1L]` (though you don't actually need to create the intermediate object). – Frank Jan 13 '17 at 19:40
  • 1
    @Frank That worked except one minor change `d[w$V1, flag.400k := 1L]` Thanks so much! I will have to learn more about `.I` as I have never used it before. – cj.burrow Jan 13 '17 at 20:20

0 Answers0