Good evening guys,I want to combine the same id row into one row,with additional column,and here is my part of data.
sample=structure(list(crsp_fundno = c(18021, 18021, 18021, 18021, 22436,
22436, 22436, 22436, 22436, 22436, 49805, 49805, 49805, 55603,
55603, 93362), seq = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 1L, 2L, 1L), begdt = structure(c(13513, 14298, 15027,
16149, 12417, 13969, 14910, 14918, 15042, 15644, 14782, 14910,
15544, 15505, 15531, 17571), class = "Date"), enddt = structure(c(14297,
15026, 16148, 17621, 13968, 14909, 14917, 15041, 15643, 17621,
14909, 15543, 17621, 15530, 17621, 17621), class = "Date"), crsp_obj_cd = c("EDYG",
"EDYG", "EDYG", "EDYG", "EDYG", "EDYG", "EDYG", "EDYG", "EDYG",
"EDYG", "EF", "EF", "EF", "EDYB", "EDYB", "M"), lipper_class = c("MLGE",
"MCCE", "MCVE", "MLCE", "MLVE", "MLVE", "MLCE", "MLVE", "MLCE",
"MLVE", "IMLC", "IMLG", "IMLC", "MTAM", "MTAC", "MATJ"), lipper_obj_cd = c("G",
"G", "G", "G", "G", "G", "G", "G", "G", "G", "IF", "IF", "IF",
"GI", "GI", "I"), lipper_asset_cd = c("EQ", "EQ", "EQ", "EQ",
"EQ", "EQ", "EQ", "EQ", "EQ", "EQ", "EQ", "EQ", "EQ", "EQ", "EQ",
"EQ")), class = "data.frame", row.names = c(NA, -16L))
I tried to merge row which has the same ID in one row, and here is my code.
temp=list()
dn=unique(sample$crsp_fundno)
for(i in 1:length(dn) ){
part=sample[which(sample$crsp_fundno %in% dn[i]),]
part=reshape(part,idvar='crsp_fundno',timevar='seq',direction='wide')
temp[[i]]=part
}
library(plyr)
sum=rbind.fill(temp[[1]],temp[[2]])
for (i in 3 :length(dn)){sum=rbind.fill(sum,temp[[i]])}
Code works ,but too low in my whole data(94000 obs almost take 2 hours).
I think I should not heavily rely on for loop in large data set.
May anyone know how can I improve the code or my logic ?
Thanks for your help.