R Data Rearrange

Question

What R code would rearrange data from

AA 100 NA
BB 200 300
CC 300 NA
DD 100 400

to

AA 100 0   0   0
BB 0   200 300 0
CC 0   0   300 0
DD 100 0   0   400

OR

   100 200 300 400
AA 1   0   0   0
BB 0   1   1   0
CC 0   0   1   0
DD 1   0   0   1

It would be helpful to have a [reproducible example](http://stackoverflow.com/q/5963269) and to know what you've tried so far. — BenBarnes, Dec 11 '12 at 06:43
If your data are in a data.frame named `df`, then `table(data.frame(df[,1], unlist(df[,-1])))` will do the trick. — Josh O'Brien, Dec 11 '12 at 06:46
@JoshO'Brien make that an answer? myDF <- data.frame( V1 = c("AA", "BB", "CC", "DD") , V2 = c(100L, 200L, 300L, 100L) , V3 = c(NA, 300L, NA, 400L) ) table(data.frame(myDF[,1], unlist(myDF[,-1]))) — Anthony Damico, Dec 11 '12 at 07:22
@JoshO'Brien, I like your method a lot more. Please go ahead and post that as an answer — Ricardo Saporta, Dec 11 '12 at 07:24
@AnthonyDamico and Ricardo -- OK, just posted it. Feel free to edit/add explanation to my answer if you like -- I thought it a bit opaque without any explanation, and was feeling a bit lazy myself... — Josh O'Brien, Dec 11 '12 at 07:34
Thank you for your answer. It solved my problem. I'm not good at R yet, but I'll try it. — 임상혁, Dec 11 '12 at 08:14

score 6 · Answer 1 · answered Dec 11 '12 at 07:27

df <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")

table(data.frame(letters = df[,1], numbers = unlist(df[,-1])))
#        numbers
# letters 100 200 300 400
#      AA   1   0   0   0
#      BB   0   1   1   0
#      CC   0   0   1   0
#      DD   1   0   0   1
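
Since the one-liner above is a bit dense, here is a sketch that splits it into steps. The intermediate names (`values`, `long`, `counts`, `amounts`) are made up for illustration, and the final `sweep()` step is an addition that is not part of the answer: it turns the counts into the first layout asked for in the question, assuming no value repeats within a row.

```r
df <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")

# Step 1: stack the numeric columns into one long vector (V2 values, then V3 values).
values <- unlist(df[, -1])

# Step 2: pair each value with its row label; the 4 labels are recycled to length 8.
long <- data.frame(letters = df[, 1], numbers = values)

# Step 3: cross-tabulate; table() leaves out NA entries by default.
counts <- table(long)

# Optional (assumption: every count is 0 or 1): multiply each column by its own
# label so the cells hold the original amounts instead of indicators.
amounts <- sweep(counts, 2, as.numeric(colnames(counts)), "*")
```

`counts` matches the table printed above, and `amounts` holds 100, 200, 300 or 400 wherever `counts` holds a 1.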

Ricardo Saporta · Answer 2 · 2012-12-11T07:13:20.873

# SAMPLE DATA
myDF <- structure(list(V2 = c(100L, 200L, 300L, 100L), V3 = c(NA, 300L, NA, 400L)), .Names = c("V2", "V3"), class = "data.frame", row.names = c("AA", "BB", "CC", "DD"))

Assuming myDF is your original data frame:

# create columns sequence
Columns <- seq(100, 400, by=100)

# for each column value c, count how often it occurs in each row (NA cells are skipped)
newMat <- sapply(Columns, function(c) rowSums(c == myDF, na.rm = TRUE))

# assign names
colnames(newMat) <- Columns

newMat  
#      100 200 300 400
#   AA   1   0   0   0
#   BB   0   1   1   0
#   CC   0   0   1   0
#   DD   1   0   0   1

Explanation:

c == myDF gives a matrix of TRUE/FALSE values (NA where myDF is NA).
If you perform arithmetic on TRUE/FALSE, they are treated as 1/0.
Thus, we can take rowSums() over that matrix for each row AA, BB, etc.,
which tells us how many times c occurs in each row; na.rm = TRUE skips the NA cells.

We use sapply to iterate over each column value, 100, 200, etc.
lapply on its own would return a list with one vector of row counts per value;
sapply does the same and then simplifies that list into a nice matrix.

We then clean up the column names to make things pretty.
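
To see the pieces of the explanation separately, here is a small sketch. The names `is300`, `hits300`, `asList`, `asMatrix` and `Columns2` are hypothetical, and deriving the column values from the data instead of hard-coding `seq(100, 400, by=100)` is an assumption about what you might want, not part of the answer.

```r
myDF <- data.frame(V2 = c(100L, 200L, 300L, 100L),
                   V3 = c(NA, 300L, NA, 400L),
                   row.names = c("AA", "BB", "CC", "DD"))

# Comparing one value against the whole data frame gives a logical matrix;
# cells that are NA in myDF stay NA.
is300 <- 300 == myDF

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE skips the NA cells.
hits300 <- rowSums(is300, na.rm = TRUE)

# Optional: take the column values from the data (sort() drops the NA).
Columns2 <- sort(unique(unlist(myDF)))

# lapply keeps one vector of row counts per value, as a list ...
asList <- lapply(Columns2, function(v) rowSums(v == myDF, na.rm = TRUE))

# ... while sapply binds the same vectors together as the columns of a matrix.
asMatrix <- sapply(Columns2, function(v) rowSums(v == myDF, na.rm = TRUE))
colnames(asMatrix) <- Columns2
```

Here `hits300` is 0, 1, 1, 0 for AA through DD, `asList` is a list of four such vectors, and `asMatrix` is the same 4 x 4 matrix as `newMat`.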

score 0 · Answer 3 · answered Dec 12 '12 at 05:58

To get the values, one could also use the reshape2 package:

DF <- read.table(text = "AA 100 NA
 BB 200 300
 CC 300 NA
 DD 100 400")

library(reshape2)
dfm <- melt(DF, id = "V1")

dcast(dfm, V1 ~ factor(value), fill = 0)[, -6]
#   V1 100 200 300 400
# 1 AA 100   0   0   0
# 2 BB   0 200 300   0
# 3 CC   0   0 300   0
# 4 DD 100   0   0 400

The `[, -6]` drops the last column of the dcast() result. That column exists because NA is one of the values in dfm$value, so it gets a column of its own at the end of the cast data frame.
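
A possible variation, sketched here as an assumption and not as part of the answer: `melt()` accepts `na.rm = TRUE`, which drops the NA rows before casting, so no NA column appears and nothing needs to be removed by position. The name `wide` is made up for illustration.

```r
library(reshape2)

DF <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")

# Drop the NA rows while melting, so the NA never becomes a column.
dfm <- melt(DF, id = "V1", na.rm = TRUE)

# Spread the values into columns; cells with no matching value are filled with 0.
wide <- dcast(dfm, V1 ~ value, value.var = "value", fill = 0)
```

This avoids relying on the NA column being the sixth one, which would no longer hold if the data had a different number of distinct values.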
