-1

I have data:

rowID    incidentID participant.type
1          1                A
2          1                B
3          2                A
4          3                A
5          3                B
6          3                C
7          4                B
8          4                C

And I would like to end up with:

   rowID incident participant.type participant.type.1 participant.type.2
    1        1                A                  B                   
    2        2                A                                      
    3        3                A                  B                  C
    4        4                B                  C      

I tried the spread but can't achieve one line per incident; I don't think I have a way of creating a key-value pair so I wonder if there is some other method for doing this.

user3387656
  • 93
  • 1
  • 1
  • 7

5 Answers5

2

Before using spread(), you need to create a proper key argument.

df %>% select(-rowID) %>%
       group_by(incidentID) %>%
       mutate(id = 1:n()) %>%
       spread(id, participant.type)

#   incidentID  `1`   `2`   `3`  
#        <int>  <fct> <fct> <fct>
# 1          1  A     B     NA   
# 2          2  A     NA    NA   
# 3          3  A     B     C    
# 4          4  B     C     NA 
Darren Tsai
  • 32,117
  • 5
  • 21
  • 51
1

Since your grouping is based on the row order within the icidentID column. The following simple solution will also work.

It is just filtering the dataframe and then merging in the end.

It is probably not the best solution in terms of effective use of computing power, but it is easy to understand.

library(tidyverse)

df <- 
  tribble(
    ~rowID,    ~incidentID, ~participant.type,
    1,          1,                "A",
    2,          1,                "B",
    3,          2,                "A",
    4,          3,                "A",
    5,          3,                "B",
    6,          3,                "C",
    7,          4,                "B",
    8,          4,                "C")

df_1 <- df %>%
  select(-rowID) %>% 
  group_by(incidentID) %>% 
  filter(row_number()==1)


df_2 <- df %>%
  select(-rowID) %>% 
  group_by(incidentID) %>% 
  filter(row_number()==2) %>% 
  rename(participant.type.1 = participant.type)


df_3 <- df %>%
  select(-rowID) %>% 
  group_by(incidentID) %>% 
  filter(row_number()==3) %>% 
  rename(participant.type.2 = participant.type) 

full_join(df_1, full_join(df_2, df_3))

Result:

Joining, by = "incidentID"
Joining, by = "incidentID"
# A tibble: 4 x 4
# Groups:   incidentID [?]
  incidentID participant.type participant.type.1 participant.type.2
       <dbl> <chr>            <chr>              <chr>             
1          1 A                B                  NA                
2          2 A                NA                 NA                
3          3 A                B                  C                 
4          4 B                C                  NA    
Steen Harsted
  • 1,802
  • 2
  • 21
  • 34
0

Here's my solution:

df %>%
  select(-rowID) %>%
  group_by(incidentID) %>%
  nest() %>%
  mutate(data = map_chr(data, ~str_c(.x$participant.type, collapse = '_'))) %>%
  separate(data, paste0('participant.type.', 0:2)) %>%
  mutate_at(2:4, ~replace_na(.x, ''))
Paweł Chabros
  • 2,349
  • 1
  • 9
  • 12
0

We can use reshape2::dcast for this

reshape2::dcast(df, insidentID ~ participant.type)    
  #   insidentID    A    B    C
  # 1          1 <NA>    B <NA>
  # 2          8 <NA>    B <NA>
  # 3         12 <NA> <NA>    C
  # 4         16    A <NA> <NA>
  # 5         24 <NA>    B <NA>
  # 6         27 <NA>    B    C
  # 7         29 <NA> <NA>    C

with the data

set.seed(123)
df <- data.frame(insidentID = sample(0:30, 8L, replace = TRUE),
                 participant.type = sample(LETTERS[1:3], 8L, replace = TRUE),
                 stringsAsFactors = FALSE)
df
#   insidentID participant.type
# 1          8                B
# 2         24                B
# 3         12                C
# 4         27                B
# 5         29                C
# 6          1                B
# 7         16                A
# 8         27                C
niko
  • 5,253
  • 1
  • 12
  • 32
0

The 'related question' link provided by @markus shows a variety of other solutions, including what appears to be the most concise in a tidyverse format:

 df %>% 
  group_by(incidentID) %>%
  mutate(rn = paste0("newcolumn",row_number()))  %>%
  spread(rn, participant.type)

gives:

incidentID newcolumn1 newcolumn2 newcolumn3
       <int> <fct>      <fct>      <fct>     
1          1 A          B          NA        
2          2 A          NA         NA        
3          3 A          B          C         
4          4 B          C          NA

A

user3387656
  • 93
  • 1
  • 1
  • 7
  • If you don't remove the variable `rowID`, the output won't be what you show above. And your solution is equivalent to mine. You don't need to post it again. – Darren Tsai Jan 13 '19 at 12:03
  • I appreciate your help, I think I posted the code based on the link before I saw yours. I've marked yours as the solution. – user3387656 Jan 13 '19 at 15:52