How to transpose column and group at the same time

Question

I have a dataset as follows:

Pt    EVENT
123    GGG
123    Nor
123    tre
144    GGG
1667   tre
1667   Nor
1667   tre

I am trying to prepare the data eventually for a Sankey diagram and in order to do this I need to get the data into the following shape

Pt    
123   GGG      Nor   tre
144   GGG
1667  tre      Nor   tre

And then eventually I think into a source, target, value format as follows:

source    target   value
 GGG        Nor       1
 GGG                  1
 tre        tre       1
 Nor        tre       2

The part I don't understand is how to get from the original dataset to the second one. I thought I might be able to do it with dplyr but no joy:

  Sankey<-EndoSubset %>%
      group_by(Pt) %>% 
      select(t(EVENT))

There is a subtle difference between this question and the link to the proposed duplicate. The duplicate addresses the case were there is a "key" column and a "value" column (also an "id" column, `Pt` in this case). This question doesn't have a "key" column, just a "value" column; at least from the original data frame to the first transformation (the "key" column would be the group row number). The answer from "akrun" addresses this. I think there is likely a different SO post that addresses this (I haven't found it though). — steveb, Jul 10 '16 at 18:48

score 3 · Answer 1 · answered Jul 10 '16 at 18:56

This can be done with reshape() by synthesizing a time column:

reshape(cbind(df,time=ave(seq_len(nrow(df)),df$Pt,FUN=seq_along)),dir='w',idvar='Pt');
##     Pt EVENT.1 EVENT.2 EVENT.3
## 1  123     GGG     Nor     tre
## 4  144     GGG    <NA>    <NA>
## 5 1667     tre     Nor     tre

Data

df <- data.frame(Pt=c(123L,123L,123L,144L,1667L,1667L,1667L),EVENT=c('GGG','Nor','tre','GGG',
'tre','Nor','tre'),stringsAsFactors=F);

akrun · Accepted Answer · 2016-07-10T19:00:10.877

2

We can use data.table

 library(data.table)
 dcast(setDT(df1), Pt~rowid(Pt), value.var="EVENT")

edited Jul 10 '16 at 19:00

answered Jul 10 '16 at 18:32

akrun

874,273
37
540
662

1

Converting to wide just aggregates the values for each Pt. akrun's answer is what I wanted. WIll accept in a minute when SO allows me to – Sebastian Zeki Jul 10 '16 at 18:35
1

A `dplyr` and `tidyr` solution that could work is `data %>% group_by(Pt) %>% mutate(rn = 1:n()) %>% ungroup %>% spread(rn, EVENT)` – steveb Jul 10 '16 at 18:51
@akrun Thanks for including this. I put it in as an answer but if that is not good SO etiquette (since it is now here), I will remove my answer. – steveb Jul 10 '16 at 18:59
@steveb I removed the answer from my post. – akrun Jul 10 '16 at 19:00
@akrun Thanks. Given the chain of events, I will assume all is good w.r.t. making it an answer :-) – steveb Jul 10 '16 at 19:01

score 2 · Answer 3 · answered Jul 10 '16 at 18:58

2

Here is a dplyr and tidyr solution:

library(dplyr)
library(tidyr)

data %>%
     group_by(Pt) %>%
     mutate(rn = 1:n()) %>%
     ungroup %>%
     spread(rn, EVENT)

answered Jul 10 '16 at 18:58

steveb

5,382
2
27
36

Thanks. Not sure from the conversation whose answer I should accept as they both work. – Sebastian Zeki Jul 10 '16 at 19:44

score 0 · Answer 4 · answered Jul 10 '16 at 20:31

Another option:

library(data.table)
l <- sapply(unique(df$Pt), function(x) data.frame(rbind(c(x,df[df$Pt==x,]$EVENT))))
rbindlist(l, fill = T)

     # X1  X2  X3  X4
# 1:  123 GGG Nor tre
# 2:  144 GGG  NA  NA
# 3: 1667 tre Nor tre

DATA

df <- structure(list(Pt = c(123L, 123L, 123L, 144L, 1667L, 1667L, 1667L
), EVENT = c("GGG", "Nor", "tre", "GGG", "tre", "Nor", "tre")), .Names = c("Pt", 
"EVENT"), row.names = c(NA, -7L), class = "data.frame")

How to transpose column and group at the same time

4 Answers4

Linked

Related