I have a data.table of consumer products, each classified as 'low', 'high', or 'unknown' quality. The data are time series, and I'm interested in smoothing out some seasonality in the data. If a product's raw classification (the classification produced by the algorithm I used to determine quality) is 'low' quality in period X, but its raw classification was 'high' quality in period X-1, I'm reclassifying that product as 'high' quality for period X. This is done separately within each product group.
To accomplish this, I've got something like the following:
library(data.table)

# lag takes a vector and lags it by one period, padding with NA.
# Indexing with seq_along() keeps the result the same length as the
# input, including for one-element groups such as 'A' below.
lag <- function(var) {
  c(NA, var)[seq_along(var)]
}

set.seed(120)
foo <- data.table(group = c('A', rep(c('B', 'C', 'D'), 5)),
                  period = 1:16,
                  quality = c('unknown', sample(c('high', 'low', 'unknown'), 15, replace = TRUE)))

# Lag quality within each product group
foo[, quality_lag := lag(quality), by = group]

# Reclassify 'low' as 'high' when the previous period was 'high'
foo[, quality_1 := ifelse(quality == 'low' & quality_lag == 'high',
                          'high',
                          quality)]
Taking a look at foo:
    group period quality quality_lag quality_1
 1:     A      1 unknown          NA   unknown
 2:     B      2     low          NA        NA
 3:     C      3    high          NA      high
 4:     D      4     low          NA        NA
 5:     B      5 unknown         low   unknown
 6:     C      6    high        high      high
 7:     D      7     low         low       low
 8:     B      8 unknown     unknown   unknown
 9:     C      9    high        high      high
10:     D     10 unknown         low   unknown
11:     B     11 unknown     unknown   unknown
12:     C     12     low        high      high
13:     D     13 unknown     unknown   unknown
14:     B     14    high     unknown      high
15:     C     15    high         low      high
16:     D     16 unknown     unknown   unknown
So, quality_1 is mostly what I want. If period X is 'low' and period X-1 is 'high', we see the reclassification to 'high' occurs, and everything else is left intact from quality. However, when quality_lag is NA, 'low' gets reclassified to NA in quality_1. This is not an issue with 'high' or 'unknown'.
That is, the first four rows of foo should look like this:
   group period quality quality_lag quality_1
1:     A      1 unknown          NA   unknown
2:     B      2     low          NA       low
3:     C      3    high          NA      high
4:     D      4     low          NA       low
Any thoughts on what is causing this?
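In case it helps narrow things down, here is a smaller reproduction that leaves data.table out entirely. It is only a sketch: the vectors q and q_lag are made-up stand-ins for the quality and quality_lag values in the first four rows of foo, and it uses nothing beyond base R. It suggests the NA appears in the logical test itself, before ifelse picks a value.

```r
# Hypothetical stand-ins for the first four rows of foo
q     <- c('unknown', 'low', 'high', 'low')
q_lag <- rep(NA_character_, 4)

# The same test used to build quality_1.
# In R, FALSE & NA is FALSE, but TRUE & NA is NA.
cond <- q == 'low' & q_lag == 'high'
print(cond)    # FALSE, NA, FALSE, NA

# ifelse() returns NA wherever its test is NA,
# so only the 'low' rows lose their value.
result <- ifelse(cond, 'high', q)
print(result)  # "unknown", NA, "high", NA
```

The pattern matches what shows up in quality_1: rows where quality is 'low' give TRUE & NA, which is NA, while 'high' and 'unknown' give FALSE & NA, which is FALSE and so fall through to quality.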