I'm looking for a way to join (or perhaps merge) two or more data frames in R that contain measured values, matching within a specified range of error. This means that the value in the "by" column would be nnn.nnnn +/- 0.000n. The error tolerance is limited to 3e-6 times the value.
This is my best attempt so far.
newDF <- left_join(P0511_480k, P0511_SF00V, by=c(P0511_480k$m.z == (P0511_SF00V$m.z - 0.000003(P0511_480k$m.z)) : (P0511_SF00V$m.z + 0.000003(P0511_480k$m.z))))
In this expression, I have two data frames (P0511_480k and P0511_SF00V) and I would like to merge them by a column named "m.z". The acceptable range of values is plus or minus "m.z" times 0.000003. For example, P0511_480k_subset$m.z = 187.06162 should match P0511_SF00V_subset$m.z = 187.06155.
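To make the tolerance concrete, here is a minimal sketch of the comparison I have in mind for a single pair of values. The helper name within_tol is hypothetical (it is not from any package), and I'm assuming the tolerance is taken relative to the value from the first data frame.

```r
# Hypothetical helper (not from any package): TRUE when y lies within
# a relative tolerance of x, i.e. |x - y| <= tol * x.
within_tol <- function(x, y, tol = 3e-6) {
  abs(x - y) <= tol * x
}

a <- 187.06162  # first m.z value in P0511_480k_subset
b <- 187.06155  # first m.z value in P0511_SF00V_subset

tol_width <- 3e-6 * a              # allowed deviation, roughly 0.00056
is_match  <- within_tol(a, b)      # |a - b| is roughly 0.00007, so this should match
no_match  <- within_tol(a, 203.05641)  # a clearly different peak should not match
```

What I can't work out is how to apply this kind of test as the join condition across every row of both data frames.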
> dput(head(P0511_480k_subset, 10))
structure(list(m.z = c(187.06162, 203.05652, 215.05668, 217.07224,
279.05499), Intensity = c(319420.8, 288068.9, 229953, 210107.8,
180054), Relative = c(100, 90.18, 71.99, 65.78, 56.37), Resolution = c(394956.59,
415308.31, 387924.91, 437318.31, 410670.91), Baseline = c(2.1,
1.43, 1.69, 1.73, 3.04), Noise = c(28.03, 27.17, 27.52, 27.58,
29.37)), .Names = c("m.z", "Intensity", "Relative", "Resolution",
"Baseline", "Noise"), class = c("tbl_df", "data.frame"), row.names = c(NA,
-5L))
and
> dput(head(P0511_SF00V_subset, 10))
structure(list(m.z = c(187.06155, 203.05641, 215.05654, 217.0721
), Intensity = c(1021342.8, 801347.1, 662928.1, 523234.2), Relative = c(100,
78.46, 64.91, 51.23), Resolution = c(314271.88, 298427.41, 289803.97,
288163.63), Baseline = c(6.89, 10.47, 9.13, 8.89), Noise = c(40.94,
45.98, 44.3, 44.01)), .Names = c("m.z", "Intensity", "Relative",
"Resolution", "Baseline", "Noise"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -4L))
I appreciate your suggestions! I've searched through the help documentation as broadly as possible and I have not been able to find an example that is close to what I need.
Many thanks!