A given species is found across a range of temperatures that is normally distributed about a mean of 15º. I'm trying to create a dataframe of the proportion of that species by temperature, where the range of all temperatures (0º-30º) is wider than the range of the species.
However, when I try to pad the ends of the species distribution with 0's to cover the full 0-30º range, by merging the species' distribution data with a dataframe of all temperatures, the merge function seems to skip some of the values from one dataset when combining it with the other:
# Simulate 100 temperatures in a single column named x
df <- data.frame(x = rnorm(100, mean = 15, sd = 2))
# Calculate density
d <- density(df$x, na.rm = TRUE)
# Check it (looks like a normalish distribution)
plot(d)
# As points
plot(d$x, d$y)
# Convert those points to a dataframe
d1 <- data.frame(x = d$x, y = d$y)
# Round x to 0.1
d1$x <- round(d1$x, 1)
# Aggregate by x, calling the new columns temperature and proportion
d2 <- aggregate(list(proportion = d1$y), by = list(temperature = d1$x), FUN = mean)
# Round proportion to 0.001
d2$proportion <- round(d2$proportion, 3)
# Create a data frame of temperatures from 0-30 in increments of 0.1
alltemps <- data.frame(temperature = seq(0, 30, by = 0.1))
# Merge the two datasets by temperature, keeping every row of alltemps
d3 <- merge(alltemps, d2, all.x = TRUE)
From here, I would convert all NAs to 0. But as you can see, merge skips some of the values from d2, putting in NAs where there should be values from d2.
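The NA-to-0 conversion mentioned here could be done along these lines. This is only a sketch on a made-up three-row data frame: `d3_toy` and its values are placeholders, not output from the code above.

```r
# Toy stand-in for the merged result (values are invented for illustration)
d3_toy <- data.frame(temperature = c(7.1, 7.2, 7.3),
                     proportion  = c(NA, 0.000, NA))

# Replace every missing proportion with 0
d3_toy$proportion[is.na(d3_toy$proportion)] <- 0

print(d3_toy)
```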
Starting at temperature = 7.2, d2 has a corresponding proportion for each 0.1º temperature increment:
> d2
temperature proportion
1 7.2 0.000
2 7.3 0.000
3 7.4 0.000
4 7.5 0.000
5 7.6 0.000
6 7.7 0.001
7 7.8 0.001
8 7.9 0.001
9 8.0 0.001
10 8.1 0.002
11 8.2 0.002
12 8.3 0.003
13 8.4 0.003
14 8.5 0.004
15 8.6 0.005
16 8.7 0.006
17 8.8 0.008
18 8.9 0.009
19 9.0 0.010
20 9.1 0.011
21 9.2 0.013
22 9.3 0.014
...
140 21.1 0.004
141 21.2 0.003
142 21.3 0.003
143 21.4 0.002
144 21.5 0.002
145 21.6 0.001
146 21.7 0.001
147 21.8 0.001
148 21.9 0.000
149 22.0 0.000
150 22.1 0.000
151 22.2 0.000
152 22.3 0.000
153 22.4 0.000
alltemps has an increment of 0.1º from 0.0º to 30.0º:
> alltemps
temperature
1 0.0
2 0.1
3 0.2
4 0.3
5 0.4
6 0.5
7 0.6
8 0.7
9 0.8
10 0.9
11 1.0
...
69 6.8
70 6.9
71 7.0
72 7.1
73 7.2
74 7.3
75 7.4
76 7.5
...
221 22.0
222 22.1
223 22.2
224 22.3
225 22.4
226 22.5
227 22.6
228 22.7
229 22.8
...
296 29.5
297 29.6
298 29.7
299 29.8
300 29.9
301 30.0
But when you combine them, 'merge' skips some of the values that should be added from d2 (e.g. at 7.3, 7.6, 7.8, etc.):
> d3
temperature proportion
1 0.0 NA
2 0.1 NA
3 0.2 NA
4 0.3 NA
5 0.4 NA
6 0.5 NA
7 0.6 NA
8 0.7 NA
9 0.8 NA
10 0.9 NA
11 1.0 NA
...
71 7.0 NA
72 7.1 NA
73 7.2 0.000
74 7.3 NA
75 7.4 0.000
76 7.5 0.000
77 7.6 NA
78 7.7 0.001
79 7.8 NA
80 7.9 0.001
81 8.0 0.001
82 8.1 0.002
...
151 15.0 0.186
152 15.1 NA
153 15.2 NA
154 15.3 0.183
155 15.4 0.181
156 15.5 0.178
157 15.6 NA
158 15.7 NA
159 15.8 0.168
160 15.9 0.164
161 16.0 0.159
162 16.1 0.154
163 16.2 0.149
164 16.3 0.144
165 16.4 NA
166 16.5 0.132
...
What's happening here? Is this because d1 is generated from a kernel density estimate rather than real numbers?
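One check that might narrow this down is whether the temperature keys in the two data frames are exactly equal as floating-point numbers, even though they print identically. The sketch below assumes (as a hypothesis, not a confirmed diagnosis) that `seq()` and `round()` can produce slightly different doubles for the same displayed value. The toy data, the `key` column, and the integer-key merge are illustrative assumptions and not part of the code above.

```r
# Temperatures generated two ways; both print as 0.0, 0.1, ..., 1.0
from_seq   <- seq(0, 1, by = 0.1)
from_round <- round(from_seq, 1)

# Exact comparison: FALSE wherever the two doubles differ in their last bits
exact_match <- from_seq == from_round
print(data.frame(temperature = from_round, exact_match))

# Toy stand-ins for alltemps and d2 (proportions are invented)
alltemps_toy <- data.frame(temperature = from_seq)
d2_toy <- data.frame(temperature = from_round,
                     proportion  = seq_along(from_round) / 100)

# Merging on the raw doubles can leave NAs where the keys are not identical
merged_raw <- merge(alltemps_toy, d2_toy, all.x = TRUE)

# Merging on an integer key (tenths of a degree) avoids comparing doubles
alltemps_toy$key <- as.integer(round(alltemps_toy$temperature * 10))
d2_toy$key       <- as.integer(round(d2_toy$temperature * 10))
merged_key <- merge(alltemps_toy, d2_toy[, c("key", "proportion")],
                    by = "key", all.x = TRUE)

# Number of unmatched rows under each approach
print(sum(is.na(merged_raw$proportion)))
print(sum(is.na(merged_key$proportion)))
```

If the first count is above zero and the second is zero, that would point to key representation rather than the kernel density estimate as the cause. The same integer-key idea could then be tried on `alltemps` and `d2`.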