I have a set of 94 matrices within a list in R. Each matrix is a different size; a sample is shown below:

 > summary(full_matrix)
            Length Class  Mode   
 Alex_1         64 -none- numeric
 Alex_10      2500 -none- numeric
 Alex_11      2916 -none- numeric
 Alex_12     20736 -none- numeric
 Alex_13     28900 -none- numeric
 Alex_14     62500 -none- numeric
 Alex_15     93025 -none- numeric
 Alex_2        100 -none- numeric
 Alex_3         25 -none- numeric
 Alex_4       1225 -none- numeric
 Alex_5       2304 -none- numeric
 Alex_6       1849 -none- numeric

I want to extract data from each matrix using lapply(). I'm performing a cluster analysis on each matrix, which generates a subset of clusters for each. I can do this using the following code:

 library(pvclust)

 clustering_data <- lapply(full_matrix, FUN = function(element) {
   # Bootstrap hierarchical clustering of one matrix
   result <- pvclust(element, method.dist = "cor", method.hclust = "average", nboot = 1000, parallel = TRUE)
   # Return the clusters whose AU p-value is at least 0.95
   pvpick(result, alpha = 0.95, pv = "au", type = "geq", max.only = TRUE)
 })

For clustering_data[[1]], for example, this gives me:

 > clustering_data[[1]]
 $clusters
 $clusters[[1]]
 [1] "bah"   "hello" "huh"   "ooh"   "wee"   "woo"  

 $edges
 [1] 5

The problem is that I need to be able to identify the name of the original matrix (Alex_1, Alex_2, etc) from which the cluster is generated, and I can't figure out how to do this. I have done it for a previous lapply() function using df %>% split(., f = .$var1), but I can't figure this out when the object is in a list.

Catherine Laing
  • See https://stackoverflow.com/questions/9950144/access-lapply-index-names-inside-fun especially the map/mapply option for fairly long discussion of the issues here. – John Garland Apr 24 '20 at 14:52
  • Thanks for this. I changed the `lapply()` chunk to the following: `clustering_data <- mapply(FUN = function(list.elem) {...}, list.elem = full_matrix, names = n)`. But I got the error `Error in (function (list.elem) : unused argument (names = dots[[2]][[1]])`. I'm completely new to lists and `lapply()`, so I am finding this quite hard to navigate! Could you provide a bit more input on how to structure the code? – Catherine Laing Apr 24 '20 at 15:27
  • Please give an example of what your desired output should look like. – Martin Gal Apr 24 '20 at 16:43
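
The `unused argument` error quoted in the comments above appears to arise because the function passed to `mapply()` declares only one parameter (`list.elem`) while two arguments (`list.elem` and `names`) are supplied to iterate over. A minimal sketch of one way around it, assuming toy matrices and a hypothetical stand-in `toy_pick()` in place of the `pvclust()`/`pvpick()` calls:

```r
# Toy stand-in for full_matrix (hypothetical data, not the original matrices)
full_matrix <- list(
  Alex_1 = matrix(1:4, nrow = 2, dimnames = list(NULL, c("bah", "hello"))),
  Alex_2 = matrix(1:9, nrow = 3, dimnames = list(NULL, c("huh", "ooh", "wee")))
)

# Hypothetical stand-in for the clustering step; it returns a list
# shaped like the pvpick() output shown in the question
toy_pick <- function(m) list(clusters = list(colnames(m)), edges = 1L)

# The function declares two parameters, one per vector handed to Map()
clustering_data <- Map(
  function(list.elem, name) {
    picked <- toy_pick(list.elem)
    picked$source <- name  # record which matrix this result came from
    picked
  },
  full_matrix,
  names(full_matrix)
)

clustering_data$Alex_2$source
```

`Map()` is `mapply()` with `SIMPLIFY = FALSE`, so the result stays a list, and it takes its names from the first argument iterated over.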

1 Answer

lapply() preserves the names of its input list, so the matrix names should already be available as the names of clustering_data. Check:

names(clustering_data)

Or

clustering_data[1]
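
As a minimal sketch of this, assuming toy matrices and a hypothetical stand-in `toy_pick()` in place of the `pvclust()`/`pvpick()` calls, the names carry through `lapply()` and can then be attached to each cluster member in a single data frame (the `cluster_table` layout is only one possible output shape, not something the question specifies):

```r
# Toy stand-in for full_matrix (hypothetical data, not the original matrices)
full_matrix <- list(
  Alex_1 = matrix(1:4, nrow = 2, dimnames = list(NULL, c("bah", "hello"))),
  Alex_2 = matrix(1:9, nrow = 3, dimnames = list(NULL, c("huh", "ooh", "wee")))
)

# Hypothetical stand-in for the clustering step
toy_pick <- function(m) list(clusters = list(colnames(m)), edges = 1L)

clustering_data <- lapply(full_matrix, toy_pick)
names(clustering_data)  # same names as full_matrix

# One row per cluster member, labelled with the source matrix and cluster index
cluster_table <- do.call(rbind, lapply(names(clustering_data), function(nm) {
  cl <- clustering_data[[nm]]$clusters
  members <- as.character(unlist(cl))
  data.frame(
    source  = rep(nm, length(members)),
    cluster = rep(seq_along(cl), lengths(cl)),
    member  = members,
    stringsAsFactors = FALSE
  )
}))

cluster_table
```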
Ronak Shah
  • That didn't quite give me what I need: rather than simply viewing the names, I want to attach them to each of the matrices in the list. This is what I did to sort the problem: ``clustering_data_names <- sapply(clustering_data, `[`, "clusters")`` – Catherine Laing Apr 30 '20 at 15:52
  • @CatherineLaing But I think your question was `I need to be able to identify the name of the original matrix (Alex_1, Alex_2, etc) from which the cluster is generated`. That code doesn't give you the name of the original matrix. Anyway, you probably meant ``Map(cbind, names(full_matrix), sapply(clustering_data, `[[`, "clusters"))`` – Ronak Shah May 01 '20 at 03:38