The error appears to be in transform.data.frame and how it (re)assigns the column. Printing the function shows its source:
transform.data.frame
# function (`_data`, ...)
# {
# e <- eval(substitute(list(...)), `_data`, parent.frame())
# tags <- names(e)
# inx <- match(tags, names(`_data`))
# matched <- !is.na(inx)
# if (any(matched)) {
# `_data`[inx[matched]] <- e[matched]
# `_data` <- data.frame(`_data`)
# }
# if (!all(matched))
# do.call("data.frame", c(list(`_data`), e[!matched]))
# else `_data`
# }
# <bytecode: 0x000000000a34e4b0>
# <environment: namespace:base>
Specifically, if any(matched) is true, it uses
`_data`[inx[matched]] <- e[matched]
which works. This is the case in your df2 example, because you reassign over an existing variable, nested. If you instead assign to a non-existent variable, it fails as well:
transform(df2, nested2 = strsplit(nested, ", "))
# Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
# arguments imply differing number of rows: 3, 2, 1
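If the column does not exist (as is the case in the original df), then
do.call("data.frame", c(list(`_data`), e[!matched]))
fails, because data.frame() treats each element of the list as a separate column, and those elements have different lengths.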
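The two code paths can be compared in isolation. This is only a minimal sketch on a made-up data frame (toy and pieces are hypothetical names, not the data from the question); it mimics the two statements from transform.data.frame rather than calling it:

```r
# Hypothetical toy data, not the data frame from the question.
toy <- data.frame(id = 1:3, nested = c("a, b", "c", "d, e, f"),
                  stringsAsFactors = FALSE)
pieces <- strsplit(toy$nested, ", ")  # list of 3 vectors with lengths 2, 1, 3

# Existing-column path: `[<-` stores the whole list as a single list column.
ok <- toy
ok[2] <- list(pieces)
is.list(ok$nested)

# New-column path: data.frame() tries to turn each list element into a column.
res <- try(do.call("data.frame", c(list(toy), list(nested2 = pieces))),
           silent = TRUE)
inherits(res, "try-error")
```

The first path keeps the list intact, while the second hands the list to data.frame(), which is where the "differing number of rows" error comes from.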
If we pre-assign df$Begin_New, it works:
df$Begin_New <- NA
str(transform(
  df,
  Begin_New = Map(seq, Begin, End - 6000, by = 1000) # or mapply(...)
))
# 'data.frame': 6 obs. of 5 variables:
# $ ID : chr "A01" "A01" "A01" "A01" ...
# $ Period : chr "Baseline" "Run" "Recovery" "Baseline" ...
# $ Begin : num 0 30500 68500 2000 45000 135000
# $ End : num 30500 68500 158000 43000 135000 305000
# $ Begin_New:List of 6
# ..$ : num 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 ...
# ..$ : num 30500 31500 32500 33500 34500 35500 36500 37500 38500 39500 ...
# ..$ : num 68500 69500 70500 71500 72500 73500 74500 75500 76500 77500 ...
# ..$ : num 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 ...
# ..$ : num 45000 46000 47000 48000 49000 50000 51000 52000 53000 54000 ...
# ..$ : num 135000 136000 137000 138000 139000 140000 141000 142000 143000 144000 ...
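Pre-assigning the column is not the only way around this. As a hedged sketch on made-up data (toy is hypothetical, not the data frame from the question), two other options are to skip transform() and assign the list directly, or to wrap the list in I() so that data.frame() keeps it as one column. Note that the I() variant leaves the column with class "AsIs", which may or may not matter downstream:

```r
# Hypothetical toy data standing in for the question's data frame.
toy <- data.frame(id = 1:3, nested = c("a, b", "c", "d, e, f"),
                  stringsAsFactors = FALSE)

# Option 1: assign the list directly; `$<-` accepts a list with one element per row.
direct <- toy
direct$nested2 <- strsplit(direct$nested, ", ")

# Option 2: protect the list with I() so data.frame() keeps it as a single column.
wrapped <- transform(toy, nested2 = I(strsplit(nested, ", ")))

str(direct)
str(wrapped)
```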
Perhaps this is a bug in transform.data.frame: it does seem odd for the behavior to depend solely on whether the column already exists, given that its previous contents are discarded anyway. If we change the new-variable assignment to something like this:
transform2 <- function (`_data`, ...) {
  e <- eval(substitute(list(...)), `_data`, parent.frame())
  tags <- names(e)
  inx <- match(tags, names(`_data`))
  matched <- !is.na(inx)
  if (any(matched)) {
    `_data`[inx[matched]] <- e[matched]
    `_data` <- data.frame(`_data`)
  }
  if (!all(matched)) {
    `_data`[ncol(`_data`) + seq_len(sum(!matched))] <- e[!matched]
    `_data` <- data.frame(`_data`)
  }
  `_data`
}
Then it works. (I have not tested everything else transform.data.frame is supposed to handle, but perhaps this should be a bug report or patch request to R-devel.)