Combining lists into a dataframe more efficiently

Question

I am running multiple chains of a MCMCglmm() model and I am trying to find the most efficient way to synthesize my output.

I am using mclapply() to run 4 chains and then combining each of the 4 chains into a list with lapply().

Here is my model and code to clean up and combine the chains. I am using this helpful tutorial for running the chains: https://github.com/tmalsburg/MCMCglmm-intro

Model:

library(parallel)
chains <- mclapply(1:4, function(i) {
  MCMCglmm(outcome ~ 1 + pretest + race + satisfaction*race, data = data,
                                random = ~ provider,
                                prior = prior.1,
                                verbose = TRUE,
                                family = "gaussian",
                                nitt = 10000, 
                                burnin = 5000, 
                                thin = 10)
}, mc.cores=4)

My cleanup is a little clunky. Is there a way to run a lapply command (or I think what is needed is mapply) on both the fixed and random effects to combine them into the same list and subsequent data frame? In the end, I am hoping to have a data frame so I can add/ subtract posterior distributions and run summary statistics on them.

fixed <- lapply(chains, function(m) m$Sol) # Sol = fixed effects 
fixed <- do.call(mcmc.list, fixed)
summary(fixed)

random <- lapply(chains, function(m) m$VCV) # VCV = variance 
random <- do.call(mcmc.list, random)
summary(random)

fixed_df <- do.call(rbind, Map(data.frame, fixed))
random_df <- do.call(rbind, Map(data.frame, random))

chains_df <- cbind(fixed_df, random_df)

Ultimately, I am hoping to run one lapply() or mapply() and have a single fixed.random list of lists. I believe I can use the Map(data.frame, fixed.random) on that to create my data frame. My knowledge of the apply function is limited, so I'm hoping to learn more and apply it (no pun intended) to my datasets.

Unfortunately, the models output MCMC objects, so I am unable to create the exact structure. This is the best I can come up with:

list1 <- list(a = rnorm(100, 0, 1), b = rnorm(100, 0, 1))
list2 <- list(a = rnorm(100, 0, 1), b = rnorm(100, 0, 1))
list3 <- list(a = rnorm(100, 0, 1), b = rnorm(100, 0, 1))
list4 <- list(a = rnorm(100, 0, 1), b = rnorm(100, 0, 1))

list5 <- list(d = rnorm(100, 0, 1), e = rnorm(100, 0, 1))
list6 <- list(d = rnorm(100, 0, 1), e = rnorm(100, 0, 1))
list7 <- list(d = rnorm(100, 0, 1), e = rnorm(100, 0, 1))
list8 <- list(d = rnorm(100, 0, 1), e = rnorm(100, 0, 1))

fixed  <- list(list1, list2, list3, list4)
random <- list(list5, list6, list7, list8)

What is *mcm.list*? And please provide data sample for a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). — Parfait, Feb 18 '18 at 21:49
_mcmc.list_ is a function that represents parallel runs of the same chain ... i am unable to create the output as it is a MCMC object. I can try a non-MCMC object. — b222, Feb 18 '18 at 21:51
Essentially looking for something like `lapply(chains, function(m) c(m$Sol, m$VCV))` but that isn't right. — b222, Feb 18 '18 at 21:54
Reading [mcmc.list docs](https://www.rdocumentation.org/packages/coda/versions/0.19-1/topics/mcmc.list), it seems processing must be separate. You can make your code DRY-er to avoid repetition. — Parfait, Feb 18 '18 at 22:18

Thomas Guillerme · Answer 1 · 2018-02-21T05:46:18.700

Would the following do?

Say your four_mcmc is a list of models of the class "MCMCglmm" (chain1, chain2, etc.) and extract is the list of elements you want to read from the chains (in your case the fixed ("Sol") and random terms ("VCV")).

## The list of mcmcs
four_mcmc <- list(chain1, chain2, chain3, chain4)

## Which elements to extract from the MCMCs
extract <- c("VCV", "Sol")

You can use a get.element function to extract single elements lists from single chains:

## Extracting some specific elements from a chain
get.elements <- function(extract, mcmc) {
    ## Extracting the element
    mcmc_elements <- sapply(extract, function(extract) mcmc[which(names(mcmc) == extract)])
}

## Extracting the VCV and Sol from one chain
str(get.elements(extract, chain1))

You can then simply apply this function to your list of chains:

## Applying get.element for each elements to extract on each chain
all_elements <- lapply(four_mcmc, function(mcmc, extract) get.elements(extract, mcmc), extract)

You can then easily summarise this table for each terms as a data frame with the terms as rows and the chains as columns

## Fixed terms table
fixed_terms <- as.data.frame(lapply(all_elements, function(X) X[[1]]))
## Random terms table
random_terms <- as.data.frame(lapply(all_elements, function(X) X[[2]]))

This code is simplified from the read.mulTree function from https://github.com/TGuillerme/mulTree.

[edit] @headpoint suggested to simply use:

as.data.frame(lapply(chains, function(m) cbind(m$Sol, m$VCV)))

Which is more elegant but could be less portable.

Combining lists into a dataframe more efficiently

1 Answers1