Goal

Return items from multiple function-generated lists as rows in a data frame.

Example data:

delete <- structure(list(
  f = structure(c(2L, 3L, 2L, 1L, 1L, 4L, 4L, 5L, 3L, 5L),
                .Label = c("a", "b", "c", "d", "e"), class = "factor"),
  n = c(3.86634168231333, 5.12320676294787, 7.43524756894488,
        5.4483312206524, 4.6808590809153, 3.09435163890175, 3.3369519531068,
        5.48377017072716, 3.22234879830383, 4.21889257443295),
  yi = c(7.8076602154975, 6.5323499682638, 4.59639499129274, 5.23921401222216,
         6.06635185725809, 6.70450561710897, 6.80135195068635, 6.05939661908661,
         6.57758084773293, 6.66031140517216),
  vi = c(0.974757327755975, 1.18471706886098, 0.887199336597602,
         0.822433823427991, 0.988350739676306, 0.891992523606773,
         0.882568180283011, 0.986873003631463, 0.970027651579457,
         1.01797517893535)),
  .Names = c("f", "n", "yi", "vi"), row.names = c(NA, -10L), class = "data.frame")

Attempt

The loop below at least gets the right numbers and keeps track of the "f" values (the id column, which is the minimum requirement), but it only prints the results, and I don't know how to collect them into something I can manipulate:

for(i in unique(delete$f)) print(c(i,glm(data=delete[delete$f!=i,], yi~vi)$aic))

[1] "b"                "16.1840991165046"
[1] "c"                "25.6744104404786"
[1] "a"                "25.6185827181431"
[1] "d"               "24.600830735108"
[1] "e"                "26.4382751230764"

There are a number of posts (1,2,3,4) on here dealing with proper ways to do leave-one-out on individual lines as opposed to groups.

Problem

I don't know how to get a data frame with the same content as this output. Solving that alone would be huge. A secondary issue is that, even if I overcome that problem, if I also wanted the model coefficients I'd have to replace $aic in the code above, run it again, and then merge the two data frames.

I know I could use lapply and split if I wanted results by subset of the data. How do I do the opposite and get results for subsets that each exclude a specific group?

Summary

I want to do a leave-one-out analysis where I'm not leaving out individual observations but instead groups (as identified by a common level of vector "f" in this case).

Appendix

This is actually to be implemented for some of the output of metafor's rma function, but I'm using glm as a more universal example (replace "glm" with "rma" and "aic" with "ci.lb" and that's my problem). As with the posts mentioned above, metafor's built-in leave1out function is meant for leaving out single rows of data rather than groups.

CrunchyTopping

1 Answer

You can still use lapply. Here's one of many possible solutions.

results <- lapply(unique(delete$f), function(i) {
  data.frame(i=i, AIC=glm(data=delete[delete$f!=i,], yi~vi)$aic)
})

results_df <- do.call(rbind, results)

results_df
  i      AIC
1 b 16.18410
2 c 25.67441
3 a 25.61858
4 d 24.60083
5 e 26.43828
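
The same pattern can cover the secondary issue from the question (getting the coefficients alongside the AIC without merging two data frames). What follows is only a sketch: the helper name leave_group_out and the column names left_out, intercept and slope are made up for illustration, and the data frame is rebuilt with just the f, yi and vi columns from the example data so the block runs on its own. Each model is fitted once, and everything needed is read off that single fit.

```r
# Example data from the question (the unused "n" column is omitted here).
delete <- data.frame(
  f  = factor(c("b", "c", "b", "a", "a", "d", "d", "e", "c", "e")),
  yi = c(7.8076602154975, 6.5323499682638, 4.59639499129274, 5.23921401222216,
         6.06635185725809, 6.70450561710897, 6.80135195068635, 6.05939661908661,
         6.57758084773293, 6.66031140517216),
  vi = c(0.974757327755975, 1.18471706886098, 0.887199336597602,
         0.822433823427991, 0.988350739676306, 0.891992523606773,
         0.882568180283011, 0.986873003631463, 0.970027651579457,
         1.01797517893535)
)

# Hypothetical helper (not part of any package): for each level of the
# grouping column, refit yi ~ vi without that group and return one row.
leave_group_out <- function(data, group) {
  groups <- unique(as.character(data[[group]]))
  rows <- lapply(groups, function(g) {
    # Keep every row that does NOT belong to group g.
    kept <- data[as.character(data[[group]]) != g, ]
    fit  <- glm(yi ~ vi, data = kept)
    data.frame(
      left_out  = g,
      AIC       = fit$aic,
      intercept = unname(coef(fit)["(Intercept)"]),
      slope     = unname(coef(fit)["vi"])
    )
  })
  # Stack the one-row data frames into a single data frame.
  do.call(rbind, rows)
}

loo <- leave_group_out(delete, "f")
loo
```

The AIC column here is computed the same way as in results_df above, so the two should agree. For the metafor case described in the question, the same structure should carry over by swapping the glm call for rma and reading ci.lb from the fit, though that variant is not tested here.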
thc