I have a dataframe with 2 columns. Can I transform one column to rows and keep the other column?

Question

I looked around the site, but I could not find this specific issue. I'm trying to prepare my dataset for further analysis, but I can't seem to fix something.

I have a list of Players and the club they currently play at:

PlayerID PlayerName         CurrentClub
1        Roland Alberg      ADO Den Haag
2        John Goossens      Feyenoord
3        Michael de Leeuw   Feyenoord
4        Kenny van der Weg  NAC Breda
5        Alex Schalk        NAC Breda

Where I want to get is:

NewID CurrentClub       Player1             Player2
1     ADO Den Haag      Roland Alberg       NA
2     Feyenoord         John Goossens       Michael de Leeuw
3     NAC Breda         Kenny van der Weg   Alex Schalk

I've tried various methods with melt, group_by and transpose, but I never got this result.

Does anybody know how to do this?

So I tried your suggestion and did this: `Test <- spread(dataset, CurrentClub, PlayerName, fill = NA, convert = FALSE, drop = TRUE, sep = NULL)` and got the following error message: `Error: Duplicate identifiers for rows (144, 567, 945, 1257, 1753, 2167), (189, 680, 1026), (90, 683, 882, 1714), (91, 507, 1577, 1715), (278, 733, 1192), (7, 608),` – BdJ, Oct 30 '18 at 17:23
@camille I don't think so as this situation is a lot more complex than the other one (the other one is a simple spread, this one is not) — prosoitos, Oct 30 '18 at 17:31
Start with this extended version `reshape(df, idvar = "CurrentClub", timevar = "PlayerID", direction = "wide")` — nghauran, Oct 30 '18 at 17:49
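
The base R route from the last comment can be sketched as follows. This is only an illustration under one assumption: instead of `PlayerID`, it uses a within-club counter as `timevar` (the `PlayerNo` column is a hypothetical helper, not part of the original data), so that each club gets one column per player slot rather than one column per player ID.

```r
# Sample data from the question, as a plain data frame
df <- data.frame(
  PlayerID = 1:5,
  PlayerName = c("Roland Alberg", "John Goossens", "Michael de Leeuw",
                 "Kenny van der Weg", "Alex Schalk"),
  CurrentClub = c("ADO Den Haag", "Feyenoord", "Feyenoord",
                  "NAC Breda", "NAC Breda"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: 1, 2, ... within each club
df$PlayerNo <- ave(df$PlayerID, df$CurrentClub, FUN = seq_along)

# One row per club, one PlayerName column per within-club position
wide <- reshape(
  df[, c("CurrentClub", "PlayerNo", "PlayerName")],
  idvar = "CurrentClub",
  timevar = "PlayerNo",
  direction = "wide",
  sep = ""
)

wide
```

Clubs with fewer players than the largest club get `NA` in the unused columns.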

score 3 · Accepted Answer · answered Oct 30 '18 at 17:31

A combination of row_number and group_by should do the trick. Here is my solution:

library(dplyr)
library(tidyr)

df <- tibble(PlayerID = c(1, 2, 3, 4, 5),
             PlayerName = c("Alberg", "Goossens", "Leeuw", "van der Weg", "Schalk"),
             CurrentClub = c("ADO Den Haag", "Feyenoord", "Feyenoord", "NAC Breda", "NAC Breda"))

# Number the players within each club, then spread those numbers into columns
new_df <- df %>%
  select(-PlayerID) %>%
  group_by(CurrentClub) %>%
  mutate(player_number = paste0("Player ", row_number())) %>%
  spread(player_number, PlayerName)

new_df
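
As a hedged aside that is not part of the original answer: in tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. A minimal sketch of the same idea, assuming dplyr 1.0.0 or later for the `.before` argument, and adding the `NewID` column the question asked for:

```r
library(dplyr)
library(tidyr)

df <- tibble(
  PlayerID = 1:5,
  PlayerName = c("Alberg", "Goossens", "Leeuw", "van der Weg", "Schalk"),
  CurrentClub = c("ADO Den Haag", "Feyenoord", "Feyenoord", "NAC Breda", "NAC Breda")
)

wide_df <- df %>%
  select(-PlayerID) %>%                 # PlayerID would keep every row distinct
  group_by(CurrentClub) %>%
  mutate(player_number = row_number()) %>%  # 1, 2, ... within each club
  ungroup() %>%
  pivot_wider(
    names_from = player_number,
    values_from = PlayerName,
    names_prefix = "Player"             # yields Player1, Player2, ...
  ) %>%
  mutate(NewID = row_number(), .before = 1)

wide_df
```

Clubs with fewer players than the largest club are filled with `NA`, matching the desired output in the question.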

Henry Cyranka

Oh, I see you suggested basically the same solution as me. row_number is a good call. Took me a while to realize that PlayerID was preventing my solution from working... :) – iod Oct 30 '18 at 17:35
Bingo! Thanks a bunch Harro. – BdJ Oct 30 '18 at 18:09

score 0 · Answer 2 · answered Oct 30 '18 at 17:33

# Drop PlayerID (column 1), number the players within each club, then spread
df[, -1] %>%
  group_by(CurrentClub) %>%
  mutate(Player = row_number()) %>%
  spread(Player, PlayerName, sep = "")

iod
