I am analyzing the results of an experiment stored in a CSV file, with variables as columns and participants as rows. Before all my data is collected, I would like to conduct preliminary analyses on the data I already have. However, I need to exclude some of my participants from the analyses. The best way I have come up with to do this without deleting their data (which could cause problems for me later) is to create a new column, call it "exclude", and enter either a 1 or a 0 for each participant to mark who is to be excluded. Then when I run the stats, I just do it on a subset of my data (where exclude == 0, for example).
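To make the subsetting step concrete, here is a minimal sketch of what I mean. The data frame `dat` and the `score` column are made-up names for illustration only, not my real data:

```r
# Hypothetical toy data; `dat` and `score` are illustrative names only
dat <- data.frame(
  participant = c(1, 2, 3),
  score       = c(10, 12, 9),
  exclude     = c(0, 1, 0)
)

# Keep only the participants who are not flagged for exclusion
included <- dat[dat$exclude == 0, ]

# Run the statistic on the retained rows only
mean(included$score)
```

This assumes every row has a 0 or 1 in exclude; rows where exclude is NA would need separate handling.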
The problem comes when I download the complete dataset: how do I get the values from the "exclude" column of the preliminary dataset onto the complete dataset, making sure that all the 0s and 1s stay attached to the correct participants? I could just copy and paste if the rows of the preliminary and complete datasets were in exactly the same order, but this seems error-prone, and creating the exclude column is much easier if I can sort by different columns. I've tried rbind and merge, but as far as I can tell they do not work.
Here is an example of what I'm trying to do:
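# Preliminary data: exclusion flags already entered by hand
prelim <- data.frame(
  participant = c(1, 2, 3),
  exclude = c(0, 1, 0)
)
# Complete download: no exclusion flags yet
full <- data.frame(
  participant = c(1, 2, 3, 4, 5),
  exclude = c(NA, NA, NA, NA, NA)
)
# Desired result: flags carried over by participant, NA for new participants
ideal <- data.frame(
  participant = c(1, 2, 3, 4, 5),
  exclude = c(0, 1, 0, NA, NA)
)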
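In other words, I want each flag looked up by participant ID rather than by row position. Below is a rough sketch of that kind of lookup using base R's match(), assuming participant IDs are unique in both data frames. The prelim rows are deliberately shuffled to show that row order should not matter. I don't know whether this is the recommended approach, so treat it as an illustration of the desired behavior rather than a settled solution:

```r
# Preliminary data, deliberately out of order
prelim <- data.frame(
  participant = c(3, 1, 2),
  exclude     = c(0, 0, 1)
)

# Complete download with an empty exclude column
full <- data.frame(
  participant = c(1, 2, 3, 4, 5),
  exclude     = NA
)

# For each participant in `full`, find its row position in `prelim`
# (NA when the participant is not in the preliminary data)
idx <- match(full$participant, prelim$participant)

# Pull the flags across by position; unmatched participants stay NA
full$exclude <- prelim$exclude[idx]

full
```

With this toy data, participants 1 to 3 pick up their flags from prelim, and participants 4 and 5 keep NA.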