I'm trying to unnest a species-count tibble so I can turn it into a data frame. The last four columns are species counts and are currently list-columns (nested). I'd like every cell in these four columns to hold a single integer count, with 0 rather than NULL where no individuals of that species were found on the transect.

I applied:

        df %>% unnest(c(cols))

and got this error:

Error in `fn()`:
! In row 1, can't recycle input of size 4 to size 9.

Here is a much shortened version of the dataset in dput() form! Thanks to anyone who can help!

structure(list(Year = c(2019L, 2019L, 2019L, 2019L), Location = c("Tela", 
"Tela", "Tela", "Tela"), Site = c("AD", "AD", "AD", "AD"), Depth = c(10L, 
10L, 10L, 10L), Transect = 1:4, ID = c("2019_Tela_AD_1_10", "2019_Tela_AD_2_10", 
"2019_Tela_AD_3_10", "2019_Tela_AD_4_10"), `Stegastes planifrons` = list(
"1", NULL, NULL, c("10", "10", "10", "10", "10", "10", "10", 
"10", "10", "10")), `Anisotremus virginicus` = list(c("4", 
"4", "4", "4"), "1", NULL, NULL), `Stegastes adustus` = list(
c("9", "9", "9", "9", "9", "9", "9", "9", "9"), c("10", "10", 
"10", "10", "10", "10", "10", "10", "10", "10"), c("15", 
"15", "15", "15", "15", "15", "15", "15", "15", "15", "15", 
"15", "15", "15", "15"), c("14", "14", "14", "14", "14", 
"14", "14", "14", "14", "14", "14", "14", "14", "14")), `Stegastes partitus` = list(
c("9", "9", "9", "9", "9", "9", "9", "9", "9"), "1", c("14", 
"14", "14", "14", "14", "14", "14", "14", "14", "14", "14", 
"14", "14", "14"), c("10", "10", "10", "10", "10", "10", 
"10", "10", "10", "10"))), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"))
  • I'd suggest there is an issue with the upstream counting, but regardless, you can do ``dat %>% mutate(across(c(starts_with("Stegastes"), `Anisotremus virginicus`), lengths))``. – Ritchie Sacramento Apr 07 '22 at 12:04
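
For context, `unnest()` fails here because the list cells within a single row have different lengths (1, 4 and 9 in row 1), so they cannot be recycled to a common size. The comment above sidesteps this by counting the elements of each cell with `lengths()`. A minimal sketch of that idea, assuming a small made-up tibble `toy` (not the asker's data) and selecting every list-column with `where(is.list)` instead of naming the columns:

```r
library(dplyr)
library(tibble)

# Hypothetical toy data shaped like the question's list-columns
toy <- tibble(
  Transect = 1:3,
  `Stegastes planifrons` = list("1", NULL, c("3", "3", "3")),
  `Anisotremus virginicus` = list(c("2", "2"), "1", NULL)
)

# lengths() returns one integer per cell; a NULL cell has length 0
counts <- toy %>%
  mutate(across(where(is.list), lengths))

counts
```

Because `lengths()` returns an integer vector, the columns come out as plain `int` columns and the empty transects become 0 without a separate NULL-handling step.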

2 Answers

You can use `apply` for this, counting the number of elements in each cell of the list-columns:


cols_count <- colnames(tibble)[7:10] # select the four species columns

# Overwrite the list-columns with the element count of each cell
tibble[cols_count] <- apply(tibble[cols_count],
                            c(1, 2), # apply over every cell
                            function(x) {
                              # each cell arrives wrapped in a list, so unlist first
                              length(unlist(x, recursive = FALSE))
                            })

which results in:

> tibble
# A tibble: 4 x 10
   Year Location Site  Depth Transect ID                `Stegastes planifrons` `Anisotremus virginicus` `Stegastes adustus` `Stegastes partitus`
  <int> <chr>    <chr> <int>    <int> <chr>                              <int>                    <int>               <int>                <int>
1  2019 Tela     AD       10        1 2019_Tela_AD_1_10                      1                        4                   9                    9
2  2019 Tela     AD       10        2 2019_Tela_AD_2_10                      0                        1                  10                    1
3  2019 Tela     AD       10        3 2019_Tela_AD_3_10                      0                        0                  15                   14
4  2019 Tela     AD       10        4 2019_Tela_AD_4_10                     10                        0                  14                   10
Sandwichnick
  • Hi @Sandwichnick, thanks, this helped. I have just tried to apply it to a different dataset, and it somehow outputs the same numbers as it did for this dataset. Very strange. Any ideas? – Bill_marinestats98 Apr 12 '22 at 13:49
  • Have you replaced every instance of the variable `tibble` in my answer with the name of the new dataset? Maybe you forgot to change it on the right-hand side of the line `tibble[cols_count] <- apply(tibble[cols_count],`. For a quick check, you can delete the variable `tibble` with `rm(tibble)` before trying with the other dataset. – Sandwichnick Apr 12 '22 at 14:45
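
One way to avoid the stale-variable problem discussed in these comments is to wrap the counting step in a function, so the data is always passed in explicitly. A minimal sketch, assuming a hypothetical helper `count_list_cols()` and two made-up datasets `first` and `second`; it uses `lengths()` rather than the `apply` call from the answer:

```r
library(tibble)

# Hypothetical helper: replace each chosen list-column with per-cell counts
count_list_cols <- function(data, cols) {
  data[cols] <- lapply(data[cols], lengths)
  data
}

# Two made-up datasets with the same column layout
first  <- tibble(id = 1:2, sp = list(c("a", "a"), NULL))
second <- tibble(id = 1:2, sp = list("b", c("c", "c", "c")))

out_first  <- count_list_cols(first, "sp")
out_second <- count_list_cols(second, "sp")
```

Since nothing inside the function refers to a global variable, calling it on a second dataset cannot silently reuse the first one.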

There are probably better ways, but this may help:

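library(purrr)
library(dplyr)

# Replace NULL cells with 0 and keep the first value of every other cell,
# then flatten each list-column and convert it to numeric
list_df <- df %>% 
  select(`Stegastes planifrons`, `Anisotremus virginicus`, `Stegastes adustus`, 
         `Stegastes partitus`) %>% 
  map_depth(2, ~ ifelse(is.null(.x), 0, .x)) %>% 
  map_df(unlist) %>% 
  mutate_if(is.character, as.numeric)

# Reattach the identifying columns
df %>% 
  select(Year:ID) %>% 
  cbind(list_df)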
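
Note that this approach reads the count from the value stored in each cell (its first element), whereas counting elements with `length()` or `lengths()` reads it from the size of the cell. On the posted sample the two readings appear to agree, because each non-empty cell repeats a value equal to its own length, but that need not hold in general. A small sketch of the difference, assuming a made-up list `cells` whose last entry breaks that pattern:

```r
library(purrr)

# Hypothetical cells: the last one does not repeat its own count
cells <- list(c("4", "4", "4", "4"), NULL, c("7", "7"))

# Reading 1: the stored value is the count (first element, 0 for NULL)
by_value <- map_dbl(cells, ~ if (is.null(.x)) 0 else as.numeric(.x[[1]]))

# Reading 2: the number of elements is the count
by_length <- lengths(cells)

by_value
by_length
```

Which reading is right depends on how the list-columns were built upstream, so it is worth checking a few cells by hand before choosing.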
Julian