I have a dataset in which the precision of measurements varied between years, but the precision used in each year was not recorded. I therefore want to infer the precision from the measured values. For example, if all the measurements in a given year end in 0, I can infer they were recorded to the nearest 10. Similarly, if all the measurements end in 0 or 5, I can infer they were recorded to the nearest 5.
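To make the rule concrete, here is a minimal base R sketch of it applied to a single vector of measurements. The helper name `infer_precision` is hypothetical (it is not part of my actual function), and it assumes whole-number measurements with no missing values.

```r
# Hypothetical helper illustrating the inference rule on one group of values.
# Assumes whole-number measurements and no NAs.
infer_precision <- function(x) {
  last <- x %% 10                      # last digit of each measurement
  if (all(last == 0)) {
    10                                 # every value ends in 0
  } else if (all(last %in% c(0, 5))) {
    5                                  # every value ends in 0 or 5
  } else {
    1                                  # some other last digit occurs
  }
}

infer_precision(c(10, 20, 30, 40, 50))  # 10
infer_precision(c(10, 25, 30, 35, 50))  # 5
infer_precision(c(18, 25, 32, 44, 57))  # 1
```

The real task is the same rule applied per year and site, which is what the function below attempts.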
Below are my example dataset and the function I am using to infer the precision of the values for each site in each year. The call to str_sub throws an error that I don't know how to resolve. The error persists whether or not I include the first line that creates precisionType, and regardless of which of the two options I use for it.
library(dplyr)
library(stringr)
library(tidyr) # needed for pivot_wider()
# example dataset with 3 years and 2 sites
# - first year measurements were to the closest 10
# - second year measurements were to the closest 5
# - third year measurements were to the closest 1
test <- data.frame(
year = rep(c(rep(1995, 10), rep(1996, 10), rep(1997, 10)), 2),
site = c(rep(LETTERS[1], 30), rep(LETTERS[2], 30)),
mass = rep(c(rep(c(10,20,30,40,50), 2), rep(c(10, 25, 30, 35, 50), 2), rep(c(18, 25, 32, 44, 57), 2)), 2))
# function for inferring precision
fun_precision <- function(df, measurement, unit) {
# set type of precision (e.g., LENGTH, MASS): 2 options and I have tried both
precisionType <- paste(quo_name(enquo(measurement)), "PRECISION", sep = "_") # option 1
# precisionType <- paste(deparse(substitute(measurement)), "PRECISION", sep = "_") # option 2
# determine precision (i.e., by1, by5 or by10 units)
precision <- df %>%
# determine last digit in measurement
mutate(last = as.numeric(str_sub(measurement, -1, -1))) %>%
# group by year, site and last digit
group_by(year, site, last) %>%
# count number of times the last digit occurs
summarise(n = n()) %>%
# arrange dataframe from smallest to largest last digits
arrange(last) %>%
# switch from long to wide format
pivot_wider(id_cols = c(year, site),
names_from = last,
names_prefix = "n", #appends n before last digit (e.g., n1, n2, etc.)
values_from = n) %>%
# determine precision of measurement
mutate(
not0 = sum(c_across(c(n1:n9)), na.rm = TRUE),
not0or5 = sum(c_across(c(n1:n4, n6:n9)), na.rm = TRUE),
"{precisionType}" := case_when(nNA > 0 & not0 == 0 ~ NA_character_,
not0 == 0 ~ paste("By10", unit, sep = ""),
not0or5 > 0 ~ paste("By1", unit, sep = ""),
!is.na(n0) & !is.na(n5) ~ paste("By5", unit, sep = ""))) %>%
select(year, site, all_of(precisionType))
# join to initial dataset
left_join(df, precision, by = c("year", "site"))
}
test <- fun_precision(test, mass, "g")
Running this call produces the following error:
Error in `mutate()`:
! Problem while computing `last = as.numeric(str_sub(measurement, -1, -1))`.
Caused by error in `stri_sub()`:
! object 'mass' not found
Run `rlang::last_error()` to see where the error occurred.
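The failure can be reproduced outside the full function with a much smaller sketch. The names `toy`, `last_digit_bare` and `last_digit_embraced` are hypothetical, and the explanation is an assumption rather than a confirmed diagnosis: inside a function, a bare argument name used within mutate() seems to be evaluated as an ordinary variable, so `mass` is looked up outside the data frame, whereas wrapping the argument in the rlang embrace operator `{{ }}` passes it through as a column reference.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with the same column name as the example dataset
toy <- data.frame(mass = c(10, 25, 57))

# Bare argument: mirrors the mutate() call in my function
last_digit_bare <- function(df, measurement) {
  df %>% mutate(last = as.numeric(str_sub(measurement, -1, -1)))
}

# Embraced argument: {{ }} forwards the column name into mutate()
last_digit_embraced <- function(df, measurement) {
  df %>% mutate(last = as.numeric(str_sub({{ measurement }}, -1, -1)))
}

# The bare version fails; capture the error so the script keeps running
bare_result <- tryCatch(
  last_digit_bare(toy, mass),
  error = function(e) "failed"
)
print(bare_result)

# The embraced version returns the last digit of each value
print(last_digit_embraced(toy, mass))
```

If this assumption is right, the remaining steps in the function (for example the `n1:n9` column ranges, which may not all exist after pivoting) could still need separate attention.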