sorry if this has an obvious answer. i'm trying to perform a reshape that has lots of stackoverflow answers for cases where only one column gets used or where the column names can be hardcoded, but i need an answer that works dynamically, when the `ordered.cols` and `unique.cols` vectors are not known in advance.
# these two sets of columns need to be dynamic
# they might be any two sets of columns!
ordered.cols <- c( 'cyl' , 'gear' )
unique.cols <- c( 'am' , 'vs' )
# neither of the above two character vectors will be known beforehand
# so here's the example starting data set
x <- mtcars[ , c( ordered.cols , unique.cols ) ]
# the desired output should have this many records:
unique( x[ , ordered.cols ] )
# but i'm unsure of the smartest way to add the additional columns that i want--
# for *each* unique level in *each* of the variables in
# `unique.cols` there should be one additional column added
# to the final output. then, for that `ordered.cols` combination
# the cell should be populated with the value if it exists
# and NA otherwise
desired.output <-
structure(list(cyl = c(4L, 4L, 4L, 6L, 6L, 6L, 8L, 8L), gear = c(3L,
4L, 5L, 3L, 4L, 5L, 3L, 5L), am1 = c(0L, 0L, 1L, 0L, 0L, 1L,
0L, 1L), am2 = c(NA, 1L, NA, NA, 1L, NA, NA, NA), vs1 = c(1L,
1L, 0L, 1L, 0L, 0L, 0L, 0L), vs2 = c(NA, NA, 1L, NA, 1L, NA,
NA, NA)), .Names = c("cyl", "gear", "am1", "am2", "vs1", "vs2"
), class = "data.frame", row.names = c(NA, -8L))
desired.output
i don't really care if the new columns are named am1, am2, vs1, vs2, or something more convenient. but if there are two distinct values of `am` in the data, there need to be two data-holding columns in the final output, one of which should be missing (NA) if that combination doesn't have the value.
# second example #
ordered.cols <- c( 'a' , 'b' )
unique.cols <- 'd'
# starting data set
y <-
    data.frame(
        a = c( 1 , 1 , 1 , 2 ) ,
        b = c( 1 , 1 , 2 , 2 ) ,
        d = c( 'z' , 'y' , 'x' , 'x' )
    )
# the desired output here should have this many rows..
unique( y[ , ordered.cols ] )
# now the contents of all columns in `unique.cols`
# (which in this case is only column `d`)
# need to be appended as a widened data set
second.desired.output <-
    data.frame(
        a = c( 1 , 1 , 2 ) ,
        b = c( 1 , 2 , 2 ) ,
        d1 = c( 'z' , 'x' , 'x' ) ,
        d2 = c( 'y' , NA , NA )
    )
second.desired.output
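to pin down what i mean, here is a rough base R sketch of the logic described in both examples. it is only an illustration under some assumptions, not necessarily the smartest approach. `widen_unique` and its `sort.values` argument are names made up for this sketch, not functions from any package. i'm assuming the number of new columns per variable is the largest number of distinct values found within any single `ordered.cols` combination. also note that the two desired outputs order the widened values differently (sorted in the first example, order of first appearance in the second), so the sketch exposes that as a switch.

```r
# hypothetical helper, written only to illustrate the desired reshape
widen_unique <- function( df , ordered.cols , unique.cols , sort.values = TRUE ) {

    # one row per observed combination of `ordered.cols`, sorted
    out <- unique( df[ , ordered.cols , drop = FALSE ] )
    out <- out[ do.call( order , unname( as.list( out ) ) ) , , drop = FALSE ]
    rownames( out ) <- NULL

    # build a single character key per row so groups can be matched up
    make.key <- function( d ) {
        do.call( paste , c( unname( as.list( d[ , ordered.cols , drop = FALSE ] ) ) , sep = "\r" ) )
    }
    df.key <- make.key( df )
    out.key <- make.key( out )

    for ( this.col in unique.cols ) {

        v <- df[[ this.col ]]
        # assumption: factors are treated as plain character values
        if ( is.factor( v ) ) v <- as.character( v )

        # distinct values within each combination of `ordered.cols`
        vals <- lapply( split( v , df.key ) , unique )
        if ( sort.values ) vals <- lapply( vals , sort )

        # line the groups up with the rows of the output
        vals <- vals[ out.key ]

        # one new column per position, padded with NA where a group runs out
        for ( i in seq_len( max( lengths( vals ) ) ) ) {
            out[[ paste0( this.col , i ) ]] <- unlist( lapply( vals , `[` , i ) , use.names = FALSE )
        }
    }

    out
}

# first example: values sorted within each combination
first.result <- widen_unique( mtcars , c( 'cyl' , 'gear' ) , c( 'am' , 'vs' ) )

# second example: values kept in order of first appearance
y <- data.frame( a = c( 1 , 1 , 1 , 2 ) , b = c( 1 , 1 , 2 , 2 ) , d = c( 'z' , 'y' , 'x' , 'x' ) )
second.result <- widen_unique( y , c( 'a' , 'b' ) , 'd' , sort.values = FALSE )
```

the loop over `unique.cols` is what keeps this dynamic: nothing about the column names is hardcoded, and each variable gets as many new columns as its busiest combination needs.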
thanks!!!!!!