
To give a brief example: I have a data frame, data1.

name<-c("John","John","Mike","Amy".....) 
nationality<-c("Canada","America","Spain","Japan".....)
data1<-data.frame(name,nationality....)

This means the people come from different countries. Each person is identified by the combination of name and country, and there are no duplicates.

The second data frame is:

name2<-c("John","John","Mike","John",......)
nationality2<-c("Canada","Canada","Canada".....)
score<-c(87,67,98,78,56......)
data2<-data.frame(name2,nationality2,score)

Every person is guaranteed to have 5 rows in data2, i.e. 5 scores, but the rows are in random order.

What I want is every person's 5 scores; I don't care what the person's name is or where they are from.

The final data frame I want is:

   score1   score2  score3  score4   score5
1    89        89       87     78        90
2    ...
3    ...

Each row holds one person's 5 scores, and I don't care who that person is. My data set is very large, so I cannot use a for loop. What can I do?

Bea
nan

2 Answers


Although there is an already accepted answer which uses base R, I would like to suggest a solution which uses the convenient dcast() function for reshaping from long to wide form, instead of using tapply() and do.call(rbind, ...):

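library(data.table)   # CRAN version 1.10.4 used
# Join data2 to data1 on name and nationality, then number each person's
# scores and spread them into the columns score1, score2, ...
dcast(setDT(data2)[setDT(data1), on = c(name2 = "name", nationality2 = "nationality")],
      name2 + nationality2 ~ paste0("score", rowid(rleid(name2, nationality2))),
      value.var = "score")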

returns

   name2 nationality2 score1 score2 score3 score4 score5
1:   Amy       Canada     93     91     73      8     79
2:  John      America      3     77     69     89     31
3:  Mike       Canada     76     92     46     47     75
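
The key step in the formula above is numbering each person's scores so that the numbers can become column names. As a rough base R sketch of the same idea, for readers without data.table: it swaps in ave() for rowid(rleid()) and reshape() for dcast(), and the small `long` data frame (3 scores per person instead of 5) is made up purely for illustration.

```r
# Made-up long-format data: 2 people, 3 scores each (illustrative only)
long <- data.frame(name  = c("Amy", "Amy", "Amy", "Mike", "Mike", "Mike"),
                   score = c(93, 91, 73, 76, 92, 46))

# Number the scores within each person: 1, 2, 3, 1, 2, 3
long$idx <- ave(seq_along(long$name), long$name, FUN = seq_along)

# Long to wide: one row per name, one score column per idx value
wide <- reshape(long, idvar = "name", timevar = "idx",
                direction = "wide", sep = "")
wide
```

With real data the grouping key would presumably be the name and nationality pair rather than the name alone.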
Uwe
  • Great, thanks. Another example for learning data.table. It seems that this package is quite powerful in dealing with data.frame – nan Jun 13 '17 at 02:22

It seems to me this is what you're asking for:

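# Lookup table: the (name, nationality) pairs of interest
data1 <- data.frame(name        = c("John", "Mike", "Amy"),
                    nationality = c("America", "Canada", "Canada"))

# Scores: 5 rows per person; "Jack" and the Canadian "John" are not in data1
data2 <- data.frame(name2        = rep(c("John", "Mike", "Amy", "Jack", "John"), each = 5),
                    score        = sample(100, 25),
                    nationality2 = rep(c("America", "Canada", "Canada", "Canada", "Canada"), each = 5))

# Keep only the rows of data2 whose (name, nationality) pair appears in data1
data3 <- merge(data2, data1, by.x = c("name2", "nationality2"), by.y = c("name", "nationality"))
# One grouping key per person
data3$name_country <- paste(data3$name2, data3$nationality2)
# Collect each person's scores into a vector, then stack the vectors as rows
all_scores_list <- tapply(data3$score, data3$name_country, c)
as.data.frame(do.call(rbind, all_scores_list))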

#              V1 V2 V3 V4 V5
# Amy Canada   57 69 90 81 50
# John America  4 92 75 15  2
# Mike Canada  25 86 51 20 12
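
Since the question asks for anonymous rows with columns score1 to score5, the result can be relabelled afterwards. A minimal sketch, assuming every matched person has exactly 5 scores; the set.seed() call, the use of split() in place of tapply(), and the names `scores_by_person` and `result` are additions for illustration only.

```r
set.seed(42)  # assumption: added only to make the example reproducible

data1 <- data.frame(name        = c("John", "Mike", "Amy"),
                    nationality = c("America", "Canada", "Canada"))
data2 <- data.frame(name2        = rep(c("John", "Mike", "Amy", "Jack", "John"), each = 5),
                    nationality2 = rep(c("America", "Canada", "Canada", "Canada", "Canada"), each = 5),
                    score        = sample(100, 25))

# Keep only the people listed in data1
data3 <- merge(data2, data1,
               by.x = c("name2", "nationality2"),
               by.y = c("name", "nationality"))

# Group the scores by person (name and nationality pasted into one key)
scores_by_person <- split(data3$score, paste(data3$name2, data3$nationality2))

# Guard the assumption that everyone has exactly 5 scores
stopifnot(all(lengths(scores_by_person) == 5))

result <- as.data.frame(do.call(rbind, scores_by_person))
names(result) <- paste0("score", 1:5)
rownames(result) <- NULL   # drop the identifying "name nationality" labels
result
```

The stopifnot() guard matters because rbind() would silently recycle values if one person had fewer scores than another.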
moodymudskipper
  • Thanks for your answer. – nan Jun 09 '17 at 15:01
  • But actually, John could come from America or Canada, and I am interested in the John from America. – nan Jun 09 '17 at 15:01
  • I used merge instead of subset with %in%, which is more convenient for several columns, then created a name_country column to group on. – moodymudskipper Jun 09 '17 at 15:14
  • That is what I want! Thank you very much! Thanks! BTW, how could I get familiar with advanced programming techniques like this one? The books I can find in China are quite basic and simple. – nan Jun 09 '17 at 15:21
  • The official documentation is usually very good, so the first step is to learn not to be scared by it and to try the examples you find when you execute `?merge`, `?subset`, `?match`, `?apply`, `?do.call`, etc. Personally I wouldn't like to learn R from books, but that's just me. Looking up functions on Stack Overflow will also teach you a lot. And recently I found this cool link too: https://stackoverflow.com/questions/1295955/what-is-the-most-useful-r-trick . I can explain more in the answer if needed, and you can validate it if you're happy with it. – moodymudskipper Jun 09 '17 at 15:43
  • Would you mind helping me with another question? I have posted it. – nan Jun 11 '17 at 03:22
  • Oh, I didn't know how to accept. I didn't even know I should accept an answer. Sorry. – nan Jun 11 '17 at 04:59
  • I have accepted your answer. Apologies. The arrow is quite confusing. – nan Jun 11 '17 at 05:08
  • https://stackoverflow.com/questions/44468974/r-how-to-get-data-according-to-stock-code-and-dateavoid-for-function. This is the link. – nan Jun 11 '17 at 14:39