I have a dataframe in R that I pulled from a database of bacterial growth conditions. The dataframe is quite large (~90k rows) and each row corresponds to a bacterial species from said database.
The issue here is that for each row I have a nested list of items. For example:
`
test_data[[1]][["Safety information"]]
.. .. ..$ DSM-Number : int 4491
.. .. ..$ keywords :List of 4
.. .. .. ..$ : chr "Bacteria"
.. .. .. ..$ : chr "16S sequence"
.. .. .. ..$ : chr "genome sequence"
.. .. .. ..$ : chr "mesophilic"
.. .. ..$ description : chr "Acetobacter lovaniensis DSM 4491 is a mesophilic bacterium that was isolated from soil."
.. .. ..$ NCBI tax id :List of 2
.. .. .. ..$ NCBI tax id : int 104100
.. .. .. ..$ Matching level: chr "species"
.. .. ..$ strain history:List of 2
.. .. .. ..$ : chr "<- NCIMB <- W. Verhoeven <- J. Frateur"
.. .. .. ..$ : chr "DSM 4491 <-- NCIMB 8620 <-- W. Verhoeven L 1024 <-- J. Frateur."
.. .. ..$ doi : chr "10.13145/bacdive9.20220920.7"
.. ..$ Name and taxonomic classification :List of 11
.. .. ..$ LPSN :List of 12
.. .. .. ..$ @ref : int 20215
.. .. .. ..$ description : chr "domain/bacteria"
.. .. .. ..$ keyword : chr "phylum/pseudomonadota"
.. .. .. ..$ domain : chr "Bacteria"
.. .. .. ..$ phylum : chr "Pseudomonadota"
.. .. .. ..$ class : chr "Alphaproteobacteria"
.. .. .. ..$ order : chr "Rhodospirillales"
.. .. .. ..$ family : chr "Acetobacteraceae"
.. .. .. ..$ genus : chr "Acetobacter"
.. .. .. ..$ species : chr "Acetobacter lovaniensis"
.. .. .. ..$ full scientific name: chr "<I>Acetobacter</I> <I>lovaniensis</I> (Frateur 1950) Lisdiyanti et al. 2001"
.. .. .. ..$ synonyms :List of 2
.. .. .. .. ..$ :List of 2
.. .. .. .. .. ..$ @ref : int 20215
.. .. .. .. .. ..$ synonym: chr "Acetobacter pasteurianus subsp. lovaniensis"
.. .. .. .. ..$ :List of 2
.. .. .. .. .. ..$ @ref : int 20215
.. .. .. .. .. ..$ synonym: chr "Acetobacter lovaniense"
.. .. ..$ @ref : int 1703
.. .. ..$ domain : chr "Bacteria"
.. .. ..$ phylum : chr "Proteobacteria"
.. .. ..$ class : chr "Alphaproteobacteria"
.. .. ..$ order : chr "Rhizobiales"
.. .. ..$ family : chr "Acetobacteraceae"
.. .. ..$ genus : chr "Acetobacter"
.. .. ..$ species : chr "Acetobacter lovaniensis"
.. .. ..$ full scientific name: chr "Acetobacter lovaniensis (Frateur 1950) Lisdiyanti et al. 2001"
.. .. ..$ type strain : chr "yes"
.. ..$ Morphology : Named list()
.. ..$ Culture and growth conditions :List of 2
.. .. ..$ culture medium:List of 2
.. .. .. ..$ :List of 5
.. .. .. .. ..$ @ref : int 1703
.. .. .. .. ..$ name : chr "YPM MEDIUM (DSMZ Medium 360)"
.. .. .. .. ..$ growth : chr "yes"
.. .. .. .. ..$ link : chr "https://bacmedia.dsmz.de/medium/360"
.. .. .. .. ..$ composition: chr "Name: YPM MEDIUM (DSMZ Medium 360)\nComposition:\nMannitol 25.0 g/l\nAgar 12.0 g/l\nYeast extract 5.0 g/l\nPept"| __truncated__
.. .. .. ..$ :List of 5
.. .. .. .. ..$ @ref : int 1703
.. .. .. .. ..$ name : chr "GLUCONOBACTER OXYDANS MEDIUM (DSMZ Medium 105)"
.. .. .. .. ..$ growth : chr "yes"
.. .. .. .. ..$ link : chr "https://bacmedia.dsmz.de/medium/105"
.. .. .. .. ..$ composition: chr "Name: GLUCONOBACTER OXYDANS MEDIUM (DSMZ Medium 105)\nComposition:\nGlucose 100.0 g/l\nCaCO3 20.0 g/l\nAgar 15."| __truncated__
.. .. ..$ culture temp :List of 2
.. .. .. ..$ :List of 5
.. .. .. .. ..$ @ref : int 1703
.. .. .. .. ..$ growth : chr "positive"
.. .. .. .. ..$ type : chr "growth"
.. .. .. .. ..$ temperature: chr "28"
.. .. .. .. ..$ range : chr "mesophilic"
.. .. .. ..$ :List of 5
.. .. .. .. ..$ @ref : int 67770
.. .. .. .. ..$ growth : chr "positive"
.. .. .. .. ..$ type : chr "growth"
.. .. .. .. ..$ temperature: chr "28"
.. .. .. .. ..$ range : chr "mesophilic"
`
I would like to essentially 'inflate' the lists into columns, so that each row has one column per field, but I'm unsure how to go about this when the lists are nested several levels deep and the depth varies from field to field.
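To make the target shape concrete, here is a rough base-R sketch on two toy records. The records, the `General` section name, and the helper `flatten_record` are made up for illustration and only mimic the structure above; this is not something I have run on the full data.

```r
# Two toy records that imitate the nesting shown above (values are illustrative).
rec1 <- list(
  General = list(
    `DSM-Number` = 4491L,
    keywords = list("Bacteria", "mesophilic")
  ),
  `Culture and growth conditions` = list(
    `culture temp` = list(
      list(growth = "positive", temperature = "28"),
      list(growth = "positive", temperature = "28")
    )
  )
)
rec2 <- list(
  General = list(
    `DSM-Number` = 1L,
    keywords = list("Bacteria", "thermophilic")
  ),
  Morphology = list()  # empty sections simply contribute no columns
)
records <- list(rec1, rec2)

# Hypothetical helper: flatten one record into a named character vector.
# unlist() recurses through every level and joins the names with ".".
flatten_record <- function(x) {
  flat <- unlist(x)
  if (is.null(flat)) return(setNames(character(0), character(0)))
  out <- as.character(flat)
  # Repeated entries (e.g. two 'culture temp' items) would otherwise share a name.
  names(out) <- make.unique(names(flat))
  out
}

rows <- lapply(records, flatten_record)

# Union of all column names; fields missing from a record become NA.
cols <- unique(unlist(lapply(rows, names)))
wide <- as.data.frame(
  do.call(rbind, lapply(rows, function(r) unname(r[cols]))),
  stringsAsFactors = FALSE
)
names(wide) <- cols

str(wide)
```

In this sketch every value ends up as character (because `unlist()` coerces mixed types), and repeated items get suffixed column names rather than separate rows, so it only shows the "one column per field" layout I have in mind.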
I'm not sure where to go from here. I've tried a few things from the tidyverse, but none of them seem to work. Here is a test sample of the data:
https://github.com/pattyjk/pullIng_bacdive_data/blob/main/test_data.rds