3

I have the table like this:

  A   B    C
1 a   b'   1
2 a   b'   2
3 a   b''  3
4 a2  b1   1
5 a3  b2   3

For example: Column A - is a genus (like E.) and column B is a species (E. coli) Column C is a class of the item (doesn't metter)

So I need to understand, how many classes item b consist of:

  A  B   C1 C2 C3
1 a  b'  +  +  -
2 a  b'' -  -  +
3 a2 b1  +  -  -
4 a3 b2  -  -  +

  • Sorry, I did not understand the logic behind C1, C2 and C3 – Vinícius Félix Sep 15 '21 at 16:19
  • Unless you can add more detail, and explain what the `+` and `-` means, I think this will be closed as a duplicate of [how to pivot data from long to wide format](https://stackoverflow.com/q/5890584/903061) – Gregor Thomas Sep 15 '21 at 16:21
  • Column C is just a column which correspond to class of the object (in my research) So, I'm asking about how ti rearrange this column into set of. – Ilya Putilin Sep 15 '21 at 16:24

3 Answers3

2

Using tidyverse

df <- data.frame(A = c('a','a','a','a2','a3'), 
                 B = c("b'", "b'", "b''", "b1", "b2"),
                 C = c(1,2,3,1,3))

df %>%
  mutate( C = paste("C",C, sep = ""),
          D = "+") %>%
  pivot_wider(names_from = C, values_from = D, values_fill = "-")

First I create the data.frame, and then I actually create the C1,C2,C3. You requested a +/- system, so I put in a + where it is present so when I pivot_wider. Now pivot_wider will end up with + and NA values; so to correct for it, use the values_fill = option and set that to -.

# A tibble: 4 x 5
  A     B     C1    C2    C3   
  <chr> <chr> <chr> <chr> <chr>
1 a     b'    +     +     -    
2 a     b''   -     -     +    
3 a2    b1    +     -     -    
4 a3    b2    -     -     +    
akash87
  • 3,876
  • 3
  • 14
  • 30
1

Using dcast

library(data.table)
dcast(setDT(df)[, tmp := "+"], A + B ~ paste0("C", C), value.var = "tmp", fill = "-")
    A   B C1 C2 C3
1:  a  b'  +  +  -
2:  a b''  -  -  +
3: a2  b1  +  -  -
4: a3  b2  -  -  +
akrun
  • 874,273
  • 37
  • 540
  • 662
1

Update: A more compact version generated with the help of akrun:

df %>%      
  mutate(C = as.factor(C)) %>%
  bind_cols((map(setNames(levels(.$C), paste0("C", levels(.$C))), function(x) ifelse(x == .$C, "+", "-")))) %>% 
  group_by(A, B) %>%  summarise(across(everything(), ~ first(.[order(. != '+')])), .groups = 'drop')

Here is another more complicated solution: Was generated with the help of fabulous akrun Combine, merge, coalesce rows by group and replace certain value by another value without pivoting

library(tidyverse)
df %>% 
    mutate(C = as.factor(C)) %>% 
    cbind(sapply(levels(.$C), `==`, .$C)) %>% 
    rename_with(.cols = 4:6, ~ paste0("C", .x)) %>% 
    mutate(across(C1:C3, ~ifelse(.==TRUE,"+", "-"))) %>% 
    select(-C) %>% 
    group_by(A, B) %>% 
    summarise(across(everything(), 
                     ~ first(.[order(. != '+')])), .groups = 'drop')

output:

  A     B     C1    C2    C3   
  <chr> <chr> <chr> <chr> <chr>
1 a     b'    +     +     -    
2 a     b''   -     -     +    
3 a2    b1    +     -     -    
4 a3    b2    -     -     +  
TarJae
  • 72,363
  • 6
  • 19
  • 66
  • 1
    Or may be a bit more compact `df %>% mutate(C = as.factor(C)) %>% bind_cols((map(setNames(levels(.$C), paste0("C", levels(.$C))), function(x) ifelse(x == .$C, "+", "-")))) %>% group_by(A, B) %>% summarise(across(everything(), ~ first(.[order(. != '+')])), .groups = 'drop')` – akrun Sep 15 '21 at 19:16
  • 1
    Wonderful. I have added to the answer! – TarJae Sep 15 '21 at 19:22