I am trying to work out how I would combine an ifelse statement with the shift function in data.table. My data looks like this:
DF <- structure(list(CHR = c(1, 1, 1, 1, 1,1),
SNP = c("rs2494631", "rs4648637", "rs2494627", "rs11122119", "rs1844583","rs2292242"),
BP = c(2399149, 2401364, 2402499, 6768856, 8383469, 8385059),
KBdist= c(NA, 2215, 1135, 4366357, 1614613, 1590),
locus = c(1, NA, NA, NA, NA, NA)),
.Names = c("CHR","SNP","BP","KBdist","locus"),
row.names = c(NA, 6L),
class = "data.frame")
> DF
CHR  SNP         BP       KBdist   locus
1    rs2494631   2399149  NA       1
1    rs4648637   2401364  2215     NA
1    rs2494627   2402499  1135     NA
1    rs11122119  6768856  4366357  NA
1    rs1844583   8383469  1614613  NA
1    rs2292242   8385059  1590     NA
What I am trying to achieve is: "If CHR is equal to the line above, and KBdist is less than 500,000, make locus equal to the line above, else add one to the value of the line above". This would yield the following output:
CHR  SNP         BP       KBdist   locus
1    rs2494631   2399149  NA       1
1    rs4648637   2401364  2215     1
1    rs2494627   2402499  1135     1
1    rs11122119  6768856  4366357  2
1    rs1844583   8383469  1614613  3
1    rs2292242   8385059  1590     3
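To make the rule concrete, here is a minimal sketch of one way it could be expressed. It is not necessarily the only or best approach. Because ifelse() is vectorised, it cannot read the locus value it has just computed for the previous row, so the sketch swaps in a different technique: flag every row that starts a new locus, then take a running count of those flags with cumsum(). The names DT and new_locus are hypothetical, and the sketch assumes the rows are already sorted by CHR and BP and that numbering starts at 1 on the first row.

```r
library(data.table)

# Same data as above, built directly as a data.table (DT is a hypothetical name)
DT <- data.table(
  CHR = c(1, 1, 1, 1, 1, 1),
  SNP = c("rs2494631", "rs4648637", "rs2494627",
          "rs11122119", "rs1844583", "rs2292242"),
  BP  = c(2399149, 2401364, 2402499, 6768856, 8383469, 8385059)
)

# Distance to the previous row within each chromosome.
# Grouping by CHR makes the first row of every chromosome NA.
DT[, KBdist := BP - shift(BP, 1L, type = "lag"), by = CHR]

# A row starts a new locus when it is the first row of a chromosome
# (KBdist is NA) or when the gap is not less than 500,000.
DT[, new_locus := is.na(KBdist) | KBdist >= 5e5]

# Running count of locus starts gives the locus number for every row.
DT[, locus := cumsum(new_locus)]

print(DT)
```

Computing KBdist with by = CHR folds the "CHR is equal to the line above" condition into the NA check, so a change of chromosome always starts a new locus.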
I know that I can use shift to access the values in the row above. For example, this is how I created the KBdist column:
library(data.table)
setDT(DF)[, KBdist := BP - shift(BP, 1L, type = "lag")]  # setDT() converts DF to a data.table by reference
But I don't see how to extend this to include the ifelse conditions above.
Any help would be greatly appreciated.
Thanks in advance.