I have a wide data frame that I want to convert to a long dataframe.
This isn't the actual wide dataframe I'm using. There are many more courses and many more 'values' for each course, so the dataframe is much wider than this. Not every course has all value columns associated with it (hence why bio1Csem is not there in the below dataframe).
I decided to attempt the solution on a smaller dataframe since I was running into so many issues on the larger dataframe). Unfortunately, I'm still struggling.
The data frame I'm working with:
>X = rbind( c( "1", "2.5","3.7","","2006 Fall","2007 Fall","Smith","Hu",""),
c( "2" ,"3.7", "3.7", "3.5", "2007 Spring", "2007 Fall",
"Smith","Hu","Langdon"), c("3" ,"4", "3.2", "4", "2007 Spring", "2007 Fall",
"Smith","Hu","Langdon"))
> colnames(X) = c('id','bio1Agrade','bio1Bgrade','bio1Cgrade','bio1Asem',
'bio1Bsem','bio1Aprof', 'bio1Bprof','bio1Cprof')
> X
id bio1Agrade bio1Bgrade bio1Cgrade bio1Asem bio1Bsem bio1Aprof bio1Bprof bio1Cprof
[1,] "1" "2.5" "3.7" "" "2006 Fall" "2007 Fall" "Smith" "Hu" ""
[2,] "2" "3.7" "3.7" "3.5" "2007 Spring" "2007 Fall" "Smith" "Hu" "Langdon"
[3,] "3" "4" "3.2" "4" "2007 Spring" "2007 Fall" "Smith" "Hu" "Langdon"
I want it to look like this instead:
id course grade semester prof
1 bio1A 2.5 2006 Fall Smith
1 bio1B 3.7 2007 Fall Hu
1 bio1C
2 bio1A 3.7 2007 Spring Smith
2 bio1B 3.7 2007 Fall Hu
2 bio1C 3.5 Langdon
3 bio1A 4 2007 Spring Smith
3 bio1B 3.2 2007 Fall Hu
3 bio1C 4 Langdon
I figured reshape won't work since all my columns names are only characters without any obvious separators, and not all courses have, in this case, 3 columns that correspond to it.
I also thought about attempting a solution using tidyr, I'm struggling how to use it for multiple values.
Do any of you have a suggestion on how to go about this problem? Would it be easier to rename the columns, and add empty columns for courses that are 'missing' a column and use reshape? Is there another, ideally easier, way?