2

I have a data frame that looks like this:

    ID    Club Type
1   1   arsenal 18
2   1   arsenal 31
3   1   arsenal 32
4   1   arsenal 14
5   2   chelsea 14
6   2   chelsea 31
7   2   chelsea 15
8   2   chelsea 32
9   3   fulham  27
10  3   fulham  31

I want to transform it into something like this:

    ID    Club  14 15 18 27 31 32
1   1   arsenal  1  0  1  0  1  1
2   2   chelsea  1  1  0  0  1  1
3   3   fulham   0  0  0  1  1  0

So in the transformed data frame, if a type matches with a particular club then it is given a 1 value and a 0 value otherwise.

My data frame is quite a bit bigger than this (otherwise I'd be doing them all manually).

Can someone give me a suggestion as to how I'd go about doing this? My problems stem from the fact that:

  • types appear multiple times
  • I'm not sure how I'd go about inserting binary values (I realise I could probably do something that returns True/False)
  • the ID elements should collapse into one single element
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
user1988898
  • 185
  • 2
  • 9
  • 1
    Take a look at `cast` or `dcast` functions in the `reshape` or `reshape2` packages. I'm sure you can find many examples of going from "long format" to "wide format" in R. – Matt Weller Jan 27 '14 at 01:40
  • I wasn't aware of either of those packages - I shall take a look at them now. thank you for suggesting them! – user1988898 Jan 27 '14 at 01:43

2 Answers2

3
library(reshape2) 

df <- read.table("clipboard",header=T)

dcast(df, ID + Club ~ Type, length)

  ID    Club 14 15 18 27 31 32
1  1 arsenal  1  0  1  0  1  1
2  2 chelsea  1  1  0  0  1  1
3  3  fulham  0  0  0  1  1  0 
Brandon Bertelsen
  • 43,807
  • 34
  • 160
  • 255
2

The reshape alternative in R is not very pretty, but you can try:

x <- data.frame(table(mydf[-1]))          # Drop "ID" when tabulating
x$Freq <- as.numeric(as.logical(x$Freq))  # In case any are > 1
y <- reshape(x, direction = "wide", idvar="Club", timevar="Type")
merge(unique(mydf[1:2]), y)               # Reintroduce the ID
#      Club ID Freq.14 Freq.15 Freq.18 Freq.27 Freq.31 Freq.32
# 1 arsenal  1       1       0       1       0       1       1
# 2 chelsea  2       1       1       0       0       1       1
# 3  fulham  3       0       0       0       1       1       0
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485