I want the example_df in a wide format where each row represents a cell type, and where each column represents a receptor.

So far, using pivot_wider(), I have created a data frame called example_df_wider. However, I want to change <dbl [2]> to 2, and <NULL> to 0.

I'm quite new to R and programming... So, how can I do that?

Many thanks in advance!

library(dplyr)
library(tidyr)

example_df <- data.frame(
  cell_type = as.factor(c("cell_1", "cell_1", "cell_2", "cell_2")),
  receptor = c("receptor_1", "receptor_1", "receptor_2", "receptor_2")
)
example_df

# Add a count of 1 per row, then spread the receptors into columns
example_df_wider <- example_df %>%
  mutate(count = 1) %>%
  pivot_wider(names_from = receptor, values_from = count)
example_df_wider
moltie
  • 2
    Welcome to SO (and programming). Can you read [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and add a minimal reproducible example of you data? That will help others to help you. – markus Dec 22 '22 at 13:24
  • 1
    You can perhaps try `unnest` to transform the column into a numeric variable, but without knowing what your data is, what code got you there, and what you are trying to achieve, that's about all anyone can say here. Please do read the link that Markus shared if you want to get a helpful answer. – Allan Cameron Dec 22 '22 at 13:39
  • 1
    @markus & Allan Cameron, thank you for your suggestions! I tried to create a minimal reproducible example :) (and I also tried out unnest(), but without success ;) – moltie Dec 22 '22 at 14:18

1 Answer

If you want to unlist the list-columns within the data frame, you have to decide whether you want the result in long or wide format, e.g.

Data

library(dplyr)
library(tidyr)

df <- tibble(CADM1 = c(list(c(12, 34, 2)), list(c(1, 2)),
        list(c(12, 34, 2, 33)), list(c(2))))

df
# A tibble: 4 × 1
  CADM1    
  <list>   
1 <dbl [3]>
2 <dbl [2]>
3 <dbl [4]>
4 <dbl [1]>
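
As a side note, if all you need is the number of values stored in each list cell (the "2" in <dbl [2]>), base R's lengths() reports it without unnesting anything. This is only a minimal sketch on the toy data above; the names df_sizes and n_values are made up for illustration.

```r
library(dplyr)

# Same toy list-column as above
df <- tibble(CADM1 = list(c(12, 34, 2), c(1, 2), c(12, 34, 2, 33), c(2)))

# lengths() returns the number of elements in each list cell
df_sizes <- df %>%
  mutate(n_values = lengths(CADM1))

df_sizes$n_values
# [1] 3 2 4 1
```

Note that this counts elements rather than summing them, so it only matches the sum when every element is 1.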

long format

df %>% 
  unnest(CADM1)
# A tibble: 10 × 1
   CADM1
   <dbl>
 1    12
 2    34
 3     2
 4     1
 5     2
 6    12
 7    34
 8     2
 9    33
10     2

wide format (needs to introduce NAs to pad the length of each list)

df %>% 
  unnest_wider(CADM1, names_sep="_")
# A tibble: 4 × 4
  CADM1_1 CADM1_2 CADM1_3 CADM1_4
    <dbl>   <dbl>   <dbl>   <dbl>
1      12      34       2      NA
2       1       2      NA      NA
3      12      34       2      33
4       2      NA      NA      NA

provided data

Using your example (I replaced NULL with NA because NULL brings some special behavior with it...)

example_df_wider$receptor_1[2] <- NA
example_df_wider$receptor_2[1] <- NA

example_df_wider
# A tibble: 2 × 3
  cell_type receptor_1 receptor_2
  <fct>     <list>     <list>    
1 cell_1    <dbl [2]>  <lgl [1]> 
2 cell_2    <lgl [1]>  <dbl [2]>

long format

example_df_wider %>% 
  unnest(starts_with("rec"))
# A tibble: 4 × 3
  cell_type receptor_1 receptor_2
  <fct>          <dbl>      <dbl>
1 cell_1             1         NA
2 cell_1             1         NA
3 cell_2            NA          1
4 cell_2            NA          1

wide format

example_df_wider %>% 
  unnest_wider(receptor_1, names_sep="_") %>% 
  unnest_wider(receptor_2, names_sep="_")
# A tibble: 2 × 5
  cell_type receptor_1_1 receptor_1_2 receptor_2_1 receptor_2_2
  <fct>            <dbl>        <dbl>        <dbl>        <dbl>
1 cell_1               1            1           NA           NA
2 cell_2              NA           NA            1            1

Sum the values

example_df_wider %>% 
  unnest(starts_with("rec")) %>% 
  group_by(cell_type) %>% 
  summarize(across(starts_with("rec"), ~ sum(.x, na.rm = TRUE)))
# A tibble: 2 × 3
  cell_type receptor_1 receptor_2
  <fct>          <dbl>      <dbl>
1 cell_1             2          0
2 cell_2             0          2
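
Alternative: aggregate while pivoting

A possible shortcut, sketched here under the assumption that a reasonably recent tidyr (1.1.0 or later) is installed: pivot_wider() can sum duplicate cell_type/receptor pairs itself via values_fn and fill the missing combinations via values_fill, so no list-columns are created in the first place. The name example_df_counts is made up for illustration.

```r
library(dplyr)
library(tidyr)

example_df <- data.frame(
  cell_type = as.factor(c("cell_1", "cell_1", "cell_2", "cell_2")),
  receptor = c("receptor_1", "receptor_1", "receptor_2", "receptor_2")
)

example_df_counts <- example_df %>%
  mutate(count = 1) %>%
  pivot_wider(
    names_from = receptor,
    values_from = count,
    values_fn = sum,   # sum duplicates instead of collecting them in a list
    values_fill = 0    # use 0 where a cell_type/receptor pair never occurs
  )

example_df_counts
```

This should give one row per cell type with receptor_1 and receptor_2 holding 2 and 0 for cell_1, and 0 and 2 for cell_2, the same result as the unnest() and summarize() route above.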
Andre Wildberg
  • Big thanks, sir! How can I sum the receptor values instead of creating new columns like receptor_1_2? I'd like a data frame with two rows (cell_1 and cell_2) but only 2 columns (receptor_1 and receptor_2), where the first row of receptor_1 contains the value 2. – moltie Dec 22 '22 at 14:58
  • @moltie See edit. You can sum the values by grouping and passing them to `summarize`. – Andre Wildberg Dec 22 '22 at 15:15