Following the example structure given by user tpetzdoldt in the answer to "Using predictNLS to create confidence intervals around fitted values in R?", I set up the following working example:
library(tidyverse)
library(investr)

data <- tibble(date = 1:7,
               cases = c(0, 0, 1, 4, 7, 8.5, 8.5))

model <- nls(cases ~ SSlogis(log(date), Asym, xmid, scal), data = data)

new.data <- data.frame(date = seq(1, 10, by = 0.1))

interval <- as_tibble(predFit(model, newdata = new.data,
                              interval = "confidence", level = 0.9)) %>%
  mutate(date = new.data$date)
I then attempted to apply the same approach to my own data (a reproducible proxy is generated below):
#Trying to create a reproducible example:
string_temp <- c(5, 12, 43, 12, 0.5, 11, 16, 15, 10, 8)
string_resp <- c(22, 15, 106, 18, 9, 14, 32, 11, 1, 4)
string_id <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K",
               "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V")
temp <- rep(string_temp, 220)
resp <- rep(string_resp, 220)
id <- rep(string_id, 100)
data_model <- data.frame(temp, resp, id)
#Data for predictions:
set.seed(1)  # so the example is reproducible
predictions <- data.frame(temp = runif(122735))
#Split by identity:
data_model_split <- data_model %>% split(data_model$id)
#model:
model <- lapply(data_model_split, function(d)
  nls(resp ~ a * exp(b * temp),
      start = list(a = 0.8, b = 0.1),
      data = d))
#results (just the first two of the 22 models, as a test; the error occurs even here):
results <- lapply(1:2, function(i) {
  predFit(model[[i]], newdata = predictions,
          interval = "confidence", level = 0.9)
})
I get the following error:
Error: cannot allocate vector of size 112.2 Gb
It seems strange that these calls would require an allocation of that size. The data frame generated in the example above was only four columns wide. I am feeding each of the 22 models 122,735 rows of new data, but even the resulting four-column data frame of roughly 2.7 million rows (22 × 122,735) should be well under 1 Gb, nowhere near 112 Gb. Is something going wrong with my use of lapply() here? The generated data above is only a proxy for my real dataset, which is very large, but I hope the issue lies in my code rather than in my data.
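For what it's worth, a quick back-of-the-envelope check (base R only, using the sizes from my code above) shows the combined output should be tiny, while 112.2 Gb is almost exactly the size of a single square matrix of doubles with one row and column per prediction point, which makes me wonder whether such a matrix is being formed somewhere internally:

```r
n_new  <- 122735   # rows in `predictions`
n_mod  <- 22       # number of fitted models
dbl    <- 8        # bytes per double in R

# Size of the combined output (fit, lwr, upr for every model):
out_gb <- n_mod * n_new * 3 * dbl / 2^30
round(out_gb, 2)   # ~0.06 Gb -- well under 1 Gb

# Size of one n x n matrix of doubles:
mat_gb <- n_new^2 * dbl / 2^30
round(mat_gb, 1)   # 112.2 Gb -- exactly the size in the error message
```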