I have a data frame of depths for several stations in a lake, and I want every station to have a complete sequence of depths from its minimum to its maximum, with NA filled in for any depth that has no data.
I am using tidyr::complete to do this, but it behaves oddly. When my depths are whole numbers, the code runs as expected. When the depths are recorded to a tenth of a meter, something seems to happen to the depth column: some site/depth combinations are added (with value filled with NA) even though I already have data for that depth.
Has anyone experienced this before? I assume it has something to do with the class of depth, but I have not figured out what is going on or how to avoid it.
library(dplyr)
b <- data.frame(site = c(rep("A", 10), rep("B", 10)),
                depth = c(seq(0.1, 0.8, 0.1), 1.0, 1.1, seq(0.3, 0.5, 0.1), seq(0.9, 1.5, 0.1)),
                value = round(runif(20, 0, 5), 1))
b2 <- b %>%
  mutate(site = factor(site)) %>%
  group_by(site) %>%
  tidyr::complete(depth = seq(min(depth),
                              max(depth),
                              by = 0.1)) %>%
  arrange(site, depth)
Some depths that are already present in the original data frame appear twice in b2, once with their original value and once with NA, which is unexpected.
class(b2$depth)
unique(b2$depth)
b2[b2$site == "B", ]
When I convert depth to character and back to numeric, the unique values look like what I would expect, although I still need to remove the duplicated depths that have NA values.
class(as.numeric(as.character(b2$depth)))
unique(as.numeric(as.character(b2$depth)))
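To check whether this might be a floating-point issue rather than a class issue, here is a small sketch that uses only base R. It rebuilds the site B depths from the example above and compares them with the grid passed to complete(). The names depth_b, grid_b, exact_match and rounded_match are made up for this check.

```r
# Depths as entered for site B in the example above
depth_b <- c(seq(0.3, 0.5, 0.1), seq(0.9, 1.5, 0.1))

# Grid that complete() is asked to fill in for site B
grid_b <- seq(min(depth_b), max(depth_b), by = 0.1)

# Exact comparison: is each observed depth found in the grid?
exact_match <- depth_b %in% grid_b

# Same comparison after rounding both sides to one decimal place
rounded_match <- round(depth_b, 1) %in% round(grid_b, 1)

print(data.frame(depth = depth_b, exact_match, rounded_match))

# Show the grid with enough digits to expose any small differences
print(format(grid_b, digits = 17))
```

If a row shows exact_match = FALSE but rounded_match = TRUE, the two sequences differ only in their last binary digits, which would be consistent with complete() treating them as different depths.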
If depths have no decimal places, the behaviour seems more predictable.
a <- data.frame(site = c(rep("A", 10), rep("B", 10)),
                depth = c(1:4, 6:11, 3:5, 8, 10, 12:16),
                value = round(runif(20, 0, 5), 1))
a2 <- a %>%
  mutate(site = factor(site)) %>%
  group_by(site) %>%
  tidyr::complete(depth = seq(min(depth),
                              max(depth),
                              by = 1)) %>%
  arrange(site, depth)
class(a2$depth)
unique(a2$depth)
a2[a2$site == "B", ]
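Since whole-number depths behave predictably, one workaround I am considering is to do the completion on integer tenths of a meter and convert back afterwards. This is only a sketch, not checked beyond this example, and it assumes every depth is recorded to exactly one decimal place. The column depth_dm and the object b3 are hypothetical names introduced here.

```r
library(dplyr)

b <- data.frame(site = c(rep("A", 10), rep("B", 10)),
                depth = c(seq(0.1, 0.8, 0.1), 1.0, 1.1,
                          seq(0.3, 0.5, 0.1), seq(0.9, 1.5, 0.1)),
                value = round(runif(20, 0, 5), 1))

b3 <- b %>%
  mutate(site = factor(site),
         # depth expressed in whole tenths of a meter, stored as integer
         depth_dm = as.integer(round(depth * 10))) %>%
  group_by(site) %>%
  # complete on the integer column, so matching is exact
  tidyr::complete(depth_dm = seq(min(depth_dm), max(depth_dm), by = 1L)) %>%
  ungroup() %>%
  # rebuild depth in meters for every row, including the newly added ones
  mutate(depth = depth_dm / 10) %>%
  arrange(site, depth)

# Each site/depth combination should now appear only once
any(duplicated(b3[, c("site", "depth")]))
b3[b3$site == "B", ]
```

The extra depth_dm column can be dropped afterwards with select(-depth_dm) if it is not needed.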