I'm trying to reshape my data into a more readable format. The data frame has several columns that share the same name. I tried the melt() function, but it did not solve the problem, which seems to be related to the repeated column names and to the measure variables holding different kinds of values.
A small example of the data:
obs m ti td date class code dis group status grade freq date dis group status grade freq date dis group status grade freq date
obs_1 A grad 05/01/2016 00:00 55060 DDE0300 2016101 A 5.7 97 05/01/2016 15:20 MS0230 2016101 A 8.19 100 05/01/2016 15:20 A0301 2016101 A 5.8 100 27/01/2016 13:12
obs_2 A grad 05/01/2016 00:00 55070 SSE332 0 D 03/06/2016 14:08 A0804 0 D 03/06/2016 14:18 SE089 0 D 26/08/2016 19:31
Now I want to reshape this data frame one observation at a time. For the first row, I ran:
melt(df[1, ], id.vars = c("obs", "m", "ti", "td", "date", "class", "code"),
     measure.vars = c("dis", "group", "status", "grade", "freq", "date"))
I get:
obs m ti td date class code variable value
1 obs_1 A grad NA 05/01/2016 15:20 NA 55060 dis DDE0300
2 obs_1 A grad NA 05/01/2016 15:20 NA 55060 group 2016101
3 obs_1 A grad NA 05/01/2016 15:20 NA 55060 status A
4 obs_1 A grad NA 05/01/2016 15:20 NA 55060 grade 5.7
5 obs_1 A grad NA 05/01/2016 15:20 NA 55060 freq 97
6 obs_1 A grad NA 05/01/2016 15:20 NA 55060 date 05/01/2016 15:20
Warning message:
attributes are not identical across measure variables; they will be dropped
The output is missing the other two sets of columns: the ones for MS0230 and A0301, each with its own group, status, grade, freq and date. How can I fix this?
The solution does not have to use melt().
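My guess at why the other sets disappear, which I have not confirmed for melt() itself: selecting a column by name from a data frame with duplicate names returns only the first match. A minimal base R sketch on a made-up data frame (toy is a hypothetical name, not my real data):

```r
# Hypothetical toy data frame with a duplicated column name;
# check.names = FALSE stops data.frame() from renaming the second "dis".
toy <- data.frame(dis = "DDE0300", dis = "MS0230",
                  check.names = FALSE, stringsAsFactors = FALSE)

# Lookup by name stops at the first match, so the second "dis" is never reached.
first_dis <- toy[["dis"]]

# Lookup by position still reaches every copy of the column.
dis_positions <- which(names(toy) == "dis")
all_dis <- unlist(toy[dis_positions], use.names = FALSE)

print(first_dis)
print(all_dis)
```

If that is what happens here too, anything that addresses the columns by position rather than by name should be able to see all three sets.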
Code to reproduce the data:
df<-structure(list(obs = structure(1:2, .Label = c("obs_1", "obs_2"
), class = "factor"), m = structure(c(1L, 1L), .Label = "A ", class = "factor"),
ti = structure(c(1L, 1L), .Label = "grad", class = "factor"),
td = c(NA, NA), datei = structure(c(1L, 1L), .Label = "05/01/2016 00:00", class = "factor"),
class = c(NA, NA), code = c(55060L, 55070L), dis = structure(1:2, .Label = c("DDE0300",
"SSE332"), class = "factor"), group = c(2016101L, 0L), status = structure(1:2, .Label = c("A ",
"D "), class = "factor"), grade = c(5.7, NA), freq = c(97L,
NA), date = structure(c(2L, 1L), .Label = c("03/06/2016 14:08",
"05/01/2016 15:20"), class = "factor"), dis = structure(c(2L,
1L), .Label = c("A0804", "MS0230"), class = "factor"), group = c(2016101L,
0L), status = structure(1:2, .Label = c("A ", "D "), class = "factor"),
grade = c(8.19, NA), freq = c(100L, NA), date = structure(c(2L,
1L), .Label = c("03/06/2016 14:18", "05/01/2016 15:20"), class = "factor"),
dis = structure(1:2, .Label = c("A0301", "SE089"), class = "factor"),
group = c(2016101L, 0L), status = structure(1:2, .Label = c("A ",
"D "), class = "factor"), grade = c(5.8, NA), freq = c(100L,
NA), date = structure(c(2L, 1L), .Label = c("26/08/2016 19:31",
"27/01/2016 13:12"), class = "factor")), .Names = c("obs",
"m", "ti", "td", "datei", "class", "code", "dis", "group", "status",
"grade", "freq", "date", "dis", "group", "status", "grade", "freq",
"date", "dis", "group", "status", "grade", "freq", "date"), class = "data.frame", row.names = c(NA,
-2L))
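For what it is worth, here is a sketch of the kind of result I am after, written against a smaller made-up data frame with the same layout. It assumes the repeated columns always come in equal-sized blocks right after the id columns. The names toy, n_id and block_size are hypothetical; for the df above the two sizes would presumably be 7 and 6.

```r
# Hypothetical toy data: 2 id columns followed by 3 repeated blocks
# of 2 columns each (check.names = FALSE keeps the duplicate names).
toy <- data.frame(
  obs = c("obs_1", "obs_2"),
  code = c(55060L, 55070L),
  dis = c("DDE0300", "SSE332"), grade = c(5.7, NA),
  dis = c("MS0230", "A0804"),   grade = c(8.19, NA),
  dis = c("A0301", "SE089"),    grade = c(5.8, NA),
  check.names = FALSE, stringsAsFactors = FALSE
)

n_id <- 2        # number of leading id columns (assumption)
block_size <- 2  # number of columns in each repeated block (assumption)

# First column position of each repeated block.
starts <- seq(n_id + 1, ncol(toy), by = block_size)

# For each block, keep the id columns plus that block's columns,
# selecting by position so the duplicate names never get looked up.
blocks <- lapply(starts, function(s) {
  toy[, c(seq_len(n_id), s:(s + block_size - 1))]
})

# Every piece has the same column names, so the pieces stack cleanly.
long <- do.call(rbind, blocks)
long <- long[order(long$obs), ]
rownames(long) <- NULL

print(long)
```

This gives one row per observation and block, which is the shape I expected melt() to help me reach.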