Hey :) I am trying to clean up some data and am struggling to find an easy solution. This is my dataset (dput() output):
temp <- structure(list(sample = c(1, NA, NA, 2, NA, NA, 3, NA, NA, 4,
NA, NA, 5, NA, NA, 6, NA, NA, 7, NA, NA, 8, NA, NA, 9, NA, NA,
10, NA, NA, 11, NA, NA, 12, NA, NA, 13, NA, NA, 14, NA, NA, 15,
NA, NA, 16, NA, NA, 17, NA, NA, 18, NA, NA, 19, NA, NA, 20, NA,
NA), well = c("C1", "C3", "C5", "D1", "D3", "D5", "E1", "E3",
"E5", "F1", "F3", "F5", "C7", "C9", "C11", "D7", "D9", "D11",
"E7", "E9", "E11", "F7", "F9", "F11", "C13", "C15", "C17", "D13",
"D15", "D17", "E13", "E15", "E17", "F13", "F15", "F17", "C19",
"C21", "C23", "D19", "D21", "D23", "E19", "E21", "E23", "F19",
"F21", "F23", "G1", "G3", "G5", "H1", "H3", "H5", "I1", "I3",
"I5", "J1", "J3", "J5"), interp_conc = c(456582, 299611, 338462,
449737, 395905, 546031, 511817, 473617, 455924, 408370, 461656,
429297, 277609, 264949, 404073, 353142, 277509, 246494, 122663,
163873, 169455, 188879, 192751, 255511, 185383, 205396, 187415,
1897500, 1988346, 1854167, 365514, 295724, 262695, 270446, 241531,
209386, 223774, 255885, 181214, 420567, 482818, 443318, 262886,
220969, 283763, 229457, 261859, 202067, 226157, 177300, 215454,
481414, 586233, 383855, 218949, 226852, 244989, 192648, 228195,
201096)), row.names = c(NA, -60L), class = c("tbl_df", "tbl",
"data.frame"))
The data come from an experiment done in triplicate, so the first three rows are sample 1, the next three rows are sample 2, and so on.
So basically, what I need is a function that, whenever it finds an NA, fills in the value from the row above. Is there something like this in R? I was not able to find one.
What I tried instead was to add another column, "condition", using mutate(). Since the experiment was performed five times, I was hoping the 12-element vector would simply be recycled across the 60 rows. This was my attempt:
temp %>% mutate(condition = c("UT", "UT", "UT",
"Stimuli", "Stimuli","Stimuli",
"Inhib1", "Inhib1","Inhib1",
"Inhib2", "Inhib2", "Inhib2"))
But dplyr::mutate() does not seem to recycle vectors of length greater than one, so this did not work either.
The second approach would have the advantage of directly adding crucial information that I would otherwise have to add in a separate step. My original idea was to first fix the sample column and then add the experimental condition using if statements.
Does anyone have any idea how I could solve this problem?
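For reference, here is a minimal sketch of one direction that might work, assuming tidyr is installed alongside dplyr. tidyr::fill() carries the last non-missing value downward, and base rep() with length.out can do the recycling explicitly inside mutate(). The small `temp` below is a shortened stand-in with placeholder concentrations (not the real measurements), and the `conditions` and `cleaned` names are only illustrative:

```r
library(dplyr)
library(tidyr)

# Stand-in for the real data: 8 samples in triplicate (24 rows).
# rbind() + as.numeric() yields 1, NA, NA, 2, NA, NA, ... as in the original.
# interp_conc holds placeholder values, not the actual measurements.
temp <- tibble(
  sample      = as.numeric(rbind(1:8, NA, NA)),
  well        = paste0("W", 1:24),
  interp_conc = seq(100000, by = 1000, length.out = 24)
)

# The four conditions, in the order they repeat across samples.
conditions <- c("UT", "Stimuli", "Inhib1", "Inhib2")

cleaned <- temp %>%
  # Replace each NA with the closest non-missing value above it.
  fill(sample, .direction = "down") %>%
  # Repeat each condition three times (triplicates), then recycle that
  # 12-element pattern explicitly until it matches the number of rows.
  mutate(condition = rep(rep(conditions, each = 3), length.out = n()))

print(cleaned, n = 24)
```

On the full 60-row data the same two steps should apply unchanged, since 60 is a multiple of 12; this relies on the rows staying in the order shown above.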