I have a large dataframe in R (data) made up of 23 .gazedata files (one for each subject):
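library(readr)  # provides read_tsv()
# list.files() takes a regular expression, so escape the dot and anchor the extension
filenames <- list.files("~/Desktop/DUT Analyses 2019", pattern = "\\.gazedata$", full.names = TRUE)
ldf <- lapply(filenames, read_tsv)
data <- do.call("rbind", ldf)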
After creating factors and timing variables, I create the pupil variable, based on default validity parameters collected by the eye-tracker:
data$DiameterPupilLeftEye[data$ValidityLeftEye != 0] <- NA
data$DiameterPupilRightEye[data$ValidityRightEye != 0] <- NA
data$pupil <- rowMeans(dplyr::select(data, DiameterPupilLeftEye, DiameterPupilRightEye), na.rm = TRUE)
Now I need to create an interpolated pupil variable (pupil_inter) that linearly interpolates across gaps of at most 4 consecutive missing samples, using na.approx from the zoo package:
data$pupil_inter <- na.approx(data$pupil, rule = 2, maxgap = 4)
However, the following error occurs:
Error in `$<-.data.frame`(`*tmp*`, pupil_inter, value = c(4.2120165, 4.20966425, :
replacement has 1810947 rows, data has 1810956
These row counts are identical every time I run the code; the replacement is always 9 rows shorter than the data frame.
Crucially, if I exclude the .gazedata files for subjects 22 and 23 from the pre-processing, the same code runs without error.
I have searched for existing "replacement has [x] rows, data has [y]" questions, but none of the solutions I found apply to my case. All .gazedata files were collected using the same hardware and software.
The error persists even if I first create an empty (all-NA) pupil_inter column using the following code:
data$pupil_inter <- NA
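One check that may help narrow this down, sketched under the assumption that na.approx here is zoo::na.approx: its na.rm argument defaults to TRUE, which can trim leading or trailing NAs that remain after interpolation and so return a vector shorter than the input. Whether that is what happens with the real data is not confirmed. The toy vector below is hypothetical and only stands in for data$pupil. It counts the NA runs at both ends and shows that na.rm = FALSE keeps the output the same length as the input.

```r
library(zoo)

# Hypothetical toy vector standing in for data$pupil: a short interior gap
# and a trailing run of NAs that is longer than maxgap.
pupil <- c(4.2, NA, NA, 4.5, 4.4, rep(NA, 6))

# Count the NAs at the start and end of the vector; these are the runs
# that na.approx may drop when na.rm = TRUE (the default).
valid <- which(!is.na(pupil))
n_leading  <- valid[1] - 1
n_trailing <- length(pupil) - valid[length(valid)]

# With na.rm = FALSE the result keeps the input length, so it can be
# assigned to a data frame column; gaps longer than maxgap stay NA.
pupil_inter <- na.approx(pupil, rule = 2, maxgap = 4, na.rm = FALSE)

length(pupil_inter) == length(pupil)
```

On the real data, comparing n_leading + n_trailing for data$pupil with the 9-row difference in the error message would show whether trimmed end-of-series NAs account for the mismatch.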
Thanks in advance for any advice offered.