0

Say I have the following data.table

> library(data.table)
> data<-fread("savedata.csv",sep=",")
> head(data)
   V1              name      year       value
1:  1              brand1    2000       3
2:  2              brand2    2000       3
3:  3              brand3    2000       1
4:  4              brand4    2000       0
5:  5              brand5    2000       3
6:  6              brand6    2000       1
> tail(data)
    V1             name      year      value
1: 435             brand15   2020      24
2: 436             brand16   2020      22
3: 437             brand17   2020      19
4: 438             brand18   2020      21
5: 439             brand19   2020      20
6: 440             brand20   2020      14

I perform the following function which works perfect;

data <- data[data[, .(year = seq.int(year[1], year[.N], 0.1)), by=name], on=c('year', 'name')][,value := zoo::na.approx(value, na.rm = FALSE)]

> head(data)
   V1         name       year       value
1:  1         brand1     2000.0     3.0
2: NA         brand1     2000.1     3.3
3: NA         brand1     2000.2     3.6
4: NA         brand1     2000.3     3.9
5: NA         brand1     2000.4     4.2
6: NA         brand1     2000.5     4.5
> tail(data)
    V1        name       year       value
1:  NA        brand20    2019.5     13.5
2:  NA        brand20    2019.6     13.6
3:  NA        brand20    2019.7     13.7
4:  NA        brand20    2019.8     13.8
5:  NA        brand20    2019.9     13.9
6: 440        brand20    2020.0     14.0

I would like to add another column to the data.table to include value at the beginning of time period, something like,

> head(data)
   V1         name       col i want    year       value
1:  1         brand1     3.0           2000.0     3.0
2: NA         brand1     3.0           2000.1     3.3
3: NA         brand1     3.3           2000.2     3.6
4: NA         brand1     3.6           2000.3     3.9
5: NA         brand1     3.9           2000.4     4.2
6: NA         brand1     4.2           2000.5     4.5
> tail(data)
    V1        name       col i want    year       value
1:  NA        brand20    13.4          2019.5     13.5
2:  NA        brand20    13.5          2019.6     13.6
3:  NA        brand20    13.6          2019.7     13.7
4:  NA        brand20    13.7          2019.8     13.8
5:  NA        brand20    13.8          2019.9     13.9
6: 440        brand20    13.9          2020.0     14.0

Thoughts?

0 Answers0