Up front, please see the note about floating-point equality below. While it may not bite you with this data, one insidious problem with floating-point equality filtering is that you may not know it is occurring, and your calculations will silently be incorrect.
Two alternative solutions:
tidyverse, take 1
library(dplyr)
fdata %>%
  arrange(-wlength) %>%
  filter(wlength %in% c(352L, 350L)) %>%
  group_by(date, ID) %>%
  filter(n() == 2L) %>%
  summarize(
    quux = diff(R) / sum(R),
    .groups = "drop"
  )
# # A tibble: 4 x 3
# date ID quux
# <chr> <chr> <dbl>
# 1 2011-04-12 c01 -0.223
# 2 2011-04-12 c02 -0.152
# 3 2011-04-13 c01 -0.120
# 4 2011-04-13 c02 0.745
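The filter(n() == 2L) step is a guard, not decoration: if a date/ID group is missing one of the two wavelengths, diff(R) on its single row returns a zero-length vector and the group silently vanishes or errors. A minimal sketch with toy data (not the original fdata) showing the guard in action:

```r
# Toy data: group "b" is missing wavelength 350, so without the guard
# diff(R) on its single row would yield numeric(0).
library(dplyr)

toy <- data.frame(
  ID      = c("a", "a", "b"),
  wlength = c(352L, 350L, 352L),
  R       = c(0.5, 0.3, 0.9)
)

res <- toy %>%
  arrange(-wlength) %>%
  filter(wlength %in% c(352L, 350L)) %>%
  group_by(ID) %>%
  filter(n() == 2L) %>%        # group "b" has n() == 1, so it is dropped
  summarize(quux = diff(R) / sum(R), .groups = "drop")

res
# Only ID "a" survives: quux = (0.3 - 0.5) / (0.5 + 0.3) = -0.25
```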
tidyverse, take 2
func <- function(wl, r, wavelengths = c(800, 670)) {
  inds <- sapply(wavelengths, function(w) {
    diffs <- abs(wl - w)
    which(diffs < 1)[1]
  })
  diff(r[inds]) / sum(r[inds])
}
fdata %>%
  group_by(date, ID) %>%
  summarize(
    quux = func(wlength, R, c(352, 350)),
    .groups = "drop"
  )
# # A tibble: 4 x 3
# date ID quux
# <chr> <chr> <dbl>
# 1 2011-04-12 c01 -0.223
# 2 2011-04-12 c02 -0.152
# 3 2011-04-13 c01 -0.120
# 4 2011-04-13 c02 0.745
Floating-point Equality
Your wlength is a numeric field, and testing for strict equality with floating-point numbers does have its occasional risks. Computers have limitations when it comes to floating-point numbers (aka double, numeric, float); this is a fundamental limitation of computers in general in how they deal with non-integer numbers, and it is not specific to any one programming language. There are some add-on libraries or packages that are much better at arbitrary-precision math, but I believe most main-stream languages (this is relative/subjective, I admit) do not use these by default. Refs: Why are these numbers not equal?, Is floating point math broken?, and https://en.wikipedia.org/wiki/IEEE_754.
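A quick base-R illustration of the issue (no packages required):

```r
# 0.1 + 0.2 is not exactly 0.3 in IEEE-754 double precision.
x <- 0.1 + 0.2
x == 0.3                     # FALSE
print(x, digits = 17)        # shows the stored value is not exactly 0.3
abs(x - 0.3) < 1e-9          # TRUE: tolerance-based comparison works
isTRUE(all.equal(x, 0.3))    # TRUE: base R's built-in tolerance check
```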
Strict equality on integer values is not a problem, and in my sample data they are integers. You have a few options for dealing with this, typically injecting/replacing components of the %>%-pipe:
Convert to integer,
mutate(wlength = as.integer(wlength))
Filter with specific tolerance, perhaps
filter(abs(wlength - 800) < 0.1 | abs(wlength - 670) < 0.1)
Temporary conversion,
filter(sprintf("%0.0f", wlength) %in% c("800", "670"))
(not the most efficient, but effective and can support off-integer wavelengths).
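One more option, not in the list above: dplyr ships near(), a vectorized tolerance comparison (default tolerance sqrt(.Machine$double.eps)) that can stand in for the manual abs(...) pattern; a sketch assuming the same 800/670 targets:

```r
library(dplyr)

# near() compares within a tolerance, so 0.1 + 0.2 matches 0.3 even though
# strict == does not.
near(0.1 + 0.2, 0.3)   # TRUE
(0.1 + 0.2) == 0.3     # FALSE

# In the pipe this would look like:
#   filter(fdata, near(wlength, 800) | near(wlength, 670))
```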
Data
fdata <- read.table(header = TRUE, text = "
date wlength ID
2011-04-12 350 c01
2011-04-12 351 c01
2011-04-12 352 c01
2011-04-12 353 c01
2011-04-12 354 c01
2011-04-12 355 c01
2011-04-13 350 c01
2011-04-13 351 c01
2011-04-13 352 c01
2011-04-13 353 c01
2011-04-13 354 c01
2011-04-13 355 c01
2011-04-12 350 c02
2011-04-12 351 c02
2011-04-12 352 c02
2011-04-12 353 c02
2011-04-12 354 c02
2011-04-12 355 c02
2011-04-13 350 c02
2011-04-13 351 c02
2011-04-13 352 c02
2011-04-13 353 c02
2011-04-13 354 c02
2011-04-13 355 c02
")
set.seed(2021)
fdata$R <- round(runif(nrow(fdata)), 3)
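As a sanity check on the claim above that this sample data holds integers: read.table's default type conversion turns whole-number columns into integer, so strict-equality filtering is safe here. A minimal sketch with a trimmed-down copy of the data:

```r
# read.table type-converts whole-number columns to integer.
fd <- read.table(header = TRUE, text = "
date wlength ID
2011-04-12 350 c01
2011-04-13 351 c02
")
class(fd$wlength)   # "integer"
```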