The following function works when it is passed a data.table by name:
myDelta <- function(DT, col.a = "Sepal.Length", baseline = 5) {
  DT[, delta := get(col.a) - baseline]
  return(DT[])
}
It could be called like this:
library(data.table)
irisDT <- data.table(iris)
myDelta(irisDT)
However, this has a few problems:
- Assigning the output to a new object works, but the original is also modified in place, which can be an awkward side effect.
- I doubt (though I haven't tested) that this makes full use of data.table's speed optimisations.
- This is not the 'data.table way', which would look more like
irisDT[, myDelta()]
but because the function expects a DT argument that is a data.table, I end up repeating myself by writing irisDT[, myDelta(irisDT)].
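To make the first point concrete, here is a minimal sketch of the side effect. It assumes the data.table package is installed and repeats the myDelta definition so that it runs on its own. myDeltaCopy is a hypothetical name used only for illustration; it relies on data.table's copy() to take a deep copy before := is applied, which is one possible way to leave the input untouched.

```r
library(data.table)

# Same function as above: := adds the column by reference
myDelta <- function(DT, col.a = "Sepal.Length", baseline = 5) {
  DT[, delta := get(col.a) - baseline]
  return(DT[])
}

irisDT <- data.table(iris)
res <- myDelta(irisDT)
"delta" %in% names(irisDT)   # TRUE: the original was modified as well

# Hypothetical variant: work on a deep copy so the input is left alone
myDeltaCopy <- function(DT, col.a = "Sepal.Length", baseline = 5) {
  out <- copy(DT)
  out[, delta := get(col.a) - baseline]
  return(out[])
}

irisDT2 <- data.table(iris)
res2 <- myDeltaCopy(irisDT2)
"delta" %in% names(irisDT2)  # FALSE: only the copy gained the column
```

The copy costs extra memory and time on large tables, so whether it is worth it depends on how much the side effect matters in a given workflow.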
Explicitly, I would like to know: what am I missing about writing functions that lets them use the data.table they are called in, without that data.table having to be passed as a function argument?
Additionally, I am curious what the best practice is for writing a function that can be called from inside or outside a data.table in this kind of use case, where the goal is to calculate an output column from existing columns in the object. Do you write for just one or the other?
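One possible shape for such a function, offered as a sketch rather than established best practice, is a helper that takes plain vectors and knows nothing about data.table. deltaFrom is a hypothetical name, not a data.table function. Because column names are visible inside j, the helper can be called there without a DT argument, and the same helper also works on an ordinary vector.

```r
library(data.table)

# Hypothetical helper: operates on a plain vector, not on a data.table
deltaFrom <- function(x, baseline = 5) {
  x - baseline
}

irisDT <- data.table(iris)

# Inside a data.table: columns are visible in j, so no DT argument is needed
irisDT[, delta := deltaFrom(Sepal.Length)]

# Outside a data.table: the same function works on any numeric vector
deltaFrom(c(5.1, 4.9, 6.0), baseline = 5)

# Column name held in a string: look it up with get(), as in myDelta
col.a <- "Sepal.Width"
irisDT[, delta_w := deltaFrom(get(col.a))]
```

With this split, the decision to modify the table by reference sits with the caller, who writes the := explicitly, rather than being hidden inside the function.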
I may have this entirely backwards, though; if so, please let me know.