1

I try to write a function to filter a list that has muti df by ID and then add label to each dataset. However my codes did not work. Could anyone take a look of my codes and give me some guidance on what I did wrong? I am not that good working with list using map.

My codes are:

Indivadual_xlsx<-function(Subject_id){
  datalist2 %>% map(~filter(., SubjectID == Subject_id)%>%
    setNames(rbind(names(.), x),  mapping.var$Label[match(names(.), mapping.var$Variable)])) %>%  
    write_xlsx(paste0(output_dir,Subject_id,".xlsx"))
}   

update:

Before:

enter image description here

after:

enter image description here

mapping.var:

enter image description here

Update:

Here is the sample data, where lst1 is for datalist2, and Label is for mapping.var:

lst<-list(Demographics = structure(list(SubjectID = c("101-01-101", 
"101-02-102", "101-03-103", "101-04-104", "104-05-201"), BRTHDTC = c("1953-07-07", 
"1963-07-02", "1940-09-11", "1955-12-31", "1950-12-04"), SEX = c("Female", 
"Female", "Male", "Male", "Female")), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame")), DiseaseStatus = structure(list(SubjectID = c("101-01-101", 
"101-02-102", "101-03-103", "101-04-104", "104-05-201"), DSDT = c("2016-03-14", 
"2017-04-04", NA, "2016-05-02", "2018-07-06"), DSDT_P = c(NA, 
NA, "UN-UNK-2015", NA, NA)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame")), Visits = structure(list(SubjectID = c("101-01-101", 
"101-02-102", "101-03-103", "101-04-104", "104-05-201"), Visit = c("Screening: -28 Days to Day 1", 
"Screening: -28 Days to Day 1", "Screening: -28 Days to Day 1", 
"Screening: -28 Days to Day 1", "Screening: -28 Days to Day 1"
), VISND = c(NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame")))

Label<-structure(list(Var = c("SubjectID", "BRTHDTC", "SEX", "DSDT", 
"DSDT_P", "Visit", "VISND"), label = c("Subject ID", "Birthday", 
"Gender", "DS Date", "DS Date Prob", "Date of Visit", "ND Visit"
)), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"
Stataq
  • 2,237
  • 6
  • 14
  • What is `mapping.var`. Also, what is `x` in `rbind` – akrun May 20 '21 at 16:59
  • I have a mapping file called `mapping.var`. Probably should not call that. in that, I have variable name and labels that I want to use, I am most confuse on how to use `.x `or `x` – Stataq May 20 '21 at 17:00
  • 1
    Can you describe what you really wanted because your code syntax may be wrong and this may not help in understanding – akrun May 20 '21 at 17:02
  • My list is called datalist2 which has multiple df,.I want to build one excel file for each ID. so I first filter the data by ID, then I try to bind the dat with label, then output them into excel. – Stataq May 20 '21 at 17:05
  • Suppose your `data_list2 <- list(df1 = data.frame(SubjectID = 1:3), df2 = data.frame(SubjectID = 1:6))` what should be the expected – akrun May 20 '21 at 17:09
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 20 '21 at 17:10
  • In your mapping.var I didn't find any match with the column names – akrun May 20 '21 at 20:30
  • Updated. mapping.var has a lot variables. – Stataq May 20 '21 at 20:59

1 Answers1

2

We could bind the datasets first and then split by 'SubjectID', loop over the list of datasets with imap and write to an excel file

library(dplyr)
library(purrr)
library(openxlsx)
datalist2 %>%
        bind_rows(.id = 'obj_name') %>%
        split(.$SubjectID)  %>%  
        imap(~ .x %>%
                split(.$obj_name) %>%
                 write.xlsx(.x, file.path(output_dir_path, paste0(.y, ".xlsx"))))
        
akrun
  • 874,273
  • 37
  • 540
  • 662
  • datalist2 is a list, how can I bind it through the dfs in the list. I know how to do it on df, but not sure how to it on list – Stataq May 20 '21 at 17:13
  • @Stataq I assume `datalist2` as a list of data.frame. So, it is binded with `bind_rows` to a single data.frame – akrun May 20 '21 at 17:13
  • the problem for this one is it will combine all variable into 1 tab, I would like the output fo be one df in 1 tab. also, I would like to bind variable label to the variables that are in each tab. – Stataq May 20 '21 at 17:19
  • @Stataq did you meant a sheet in exccel – akrun May 20 '21 at 17:22
  • Yes, a sheet. My ideal ouput is one excel file with multiple sheets for each ID. – Stataq May 20 '21 at 18:21
  • @Stataq can you try the updated. Here, after the `split`, we have a named list on which we are writing the write.xlsx. So each sheet will be named by unique SubjectID – akrun May 20 '21 at 18:24
  • @Stataq can you please test – akrun May 20 '21 at 18:36
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/232660/discussion-between-stataq-and-akrun). – Stataq May 20 '21 at 19:28