1

I have some problems with converting my data from one table to another. At this moment I use 2 'for' loops and it is quiet time consuming for my dataset (>50,000 variables). In the output there is unique 'Name' with all information for the Name in the row. Any suggestions how to solve it?

My dataset:

 Date_Y Name Amount Score
2010    A    150   1.8
2011    A    120   1.2
2012    A    175   1.3
2010    B    160   1.9
2011    C    120   1.0
2012    C    110   2.0
2013    C    155   3.0

Target dataset:

Name Amount_2010 Amount_2011 Amount_2012 Amount_2013 Score_2010 Score_2011 Score_2012 Score_2013
 "A"  "150"       "120"       "175"       "NA"        "1.8"      "1.2"      "1.3"      "NA"      

 "B"  "160"       "NA"        "NA"        "NA"        "1.9"      "NA"       "NA"       "NA"      
 "C"  "NA"        "120"       "110"       "155"       "NA"       "1"        "2"        "3"  

My current code:

 rm(list = ls())
df <- data.frame(Date_Y=c(2010, 2011, 2012, 2010, 2011, 2012, 2013),
                 Name = c("A", "A", "A", "B", "C", "C", "C"), 
                 Amount = c(150, 120, 175, 160, 120, 110, 155), Score = c(1.8, 1.2, 1.3, 1.9, 1, 2.0, 3))

Name_List <-unique(as.character(df$Name))

new_df = matrix (ncol=9, nrow=0)
colnames(new_df)<-c("Name", "Amount_2010","Amount_2011","Amount_2012","Amount_2013","Score_2010","Score_2011","Score_2012","Score_2013")

My_dates <- c("2010", "2011", "2012", "2013")

for (i in 1:length(Name_List)){
    results <- rep("NA",8)
  for (j in 1:4)
  {
    ifelse(length(df[which(df$Name== Name_List[i] & df$Date_Y==My_dates[j]),3])>0,   results[j]<- df[which(df$Name== Name_List[i] & df$Date_Y==My_dates[j]),3], results[j]<- "NA")
    ifelse(length(df[which(df$Name== Name_List[i] & df$Date_Y==My_dates[j]),4])>0,   results[j+4]<- df[which(df$Name== Name_List[i] & df$Date_Y==My_dates[j]),4], results[j+4]<- "NA")
  }
      new_df =rbind(new_df, c((Name_List[i]), results))
}

new_df
Lohengrin
  • 115
  • 7
  • Or as described in the link: `reshape2::recast(d, id.var = c("Date_Y", "Name"), Name ~ variable + Date_Y)` – Henrik Jan 03 '19 at 10:58

1 Answers1

4

We may instead use the reshape2 package. Melting the data frame gives

(melted <- melt(df, c("Date_Y", "Name")))
#    Date_Y Name variable value
# 1    2010    A   Amount 150.0
# 2    2011    A   Amount 120.0
# 3    2012    A   Amount 175.0
# 4    2010    B   Amount 160.0
# 5    2011    C   Amount 120.0
# 6    2012    C   Amount 110.0
# 7    2013    C   Amount 155.0
# 8    2010    A    Score   1.8
# 9    2011    A    Score   1.2
# 10   2012    A    Score   1.3
# 11   2010    B    Score   1.9
# 12   2011    C    Score   1.0
# 13   2012    C    Score   2.0
# 14   2013    C    Score   3.0

so that now we may use dcast and get

dcast(melted, Name ~ variable + Date_Y)
#   Name Amount_2010 Amount_2011 Amount_2012 Amount_2013 Score_2010 Score_2011 Score_2012 Score_2013
# 1    A         150         120         175          NA        1.8        1.2        1.3         NA
# 2    B         160          NA          NA          NA        1.9         NA         NA         NA
# 3    C          NA         120         110         155         NA        1.0        2.0          3
Julius Vainora
  • 47,421
  • 9
  • 90
  • 102