Beginner question, sorry. I have a dataset with the structure below:
dat.1<-data.frame(id=c(1,1,1,2,2,2),test=c("test.1","test.2","test.3"),result=c(1,2,1,2,2,1))
dat.1
id test result
1 1 test.1 1
2 1 test.2 2
3 1 test.3 1
4 2 test.1 2
5 2 test.2 2
6 2 test.3 1
The actual dataset currently has 32 tests and more than 1000 IDs, and result is always binary. The number of tests can increase, as can the number of IDs. I want to rearrange the data so that each test (e.g. 'test.1') has its own column, like so:
dat.3<-data.frame(id=c(1,2),test.1=c(1,2),test.2=c(2,2),test.3=c(1,1))
dat.3
id test.1 test.2 test.3
1 1 1 2 1
2 2 2 2 1
A small complication is that not every ID has undergone every test, so any solution will have to cope with NAs. To clarify, the test columns in dat.3 hold the values of the result column from dat.1.
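To make the target concrete, here is a minimal sketch of the single-key case using base R's reshape(). The names dat.long and dat.wide are only illustrative, and one row is dropped on purpose so that id 2 has no test.3 result; I am assuming each id has at most one result per test.

```r
# Long-format data as in the example above.
dat.1 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    test = c("test.1", "test.2", "test.3"),
                    result = c(1, 2, 1, 2, 2, 1))

# Drop the last row so that id 2 is missing test.3 (the NA case).
dat.long <- dat.1[-6, ]

# One row per id, one column per value of 'test'.
dat.wide <- reshape(dat.long, idvar = "id", timevar = "test",
                    v.names = "result", direction = "wide")

# reshape() names the new columns "result.test.1" etc.; strip the prefix
# and reset the row names so the result looks like dat.3.
names(dat.wide) <- sub("^result\\.", "", names(dat.wide))
rownames(dat.wide) <- NULL
dat.wide
```

The missing id/test combination should come out as NA in the test.3 column rather than causing an error.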
At the moment I have gotten as far as creating an 'empty' data frame which can adapt to new tests being added like so:
dat.2<-data.frame(id=c(1,2),test.1=c(NA,NA),test.2=c(NA,NA),test.3=c(NA,NA))
dat.2
id test.1 test.2 test.3
1 1 NA NA NA
2 2 NA NA NA
I've been experimenting with ifelse, using the logic: IF dat.1$id == dat.2$id & dat.1$test == "test.1", then, in the dat.2 column test.1, fill in dat.1$result to get dat.3$test.1, if that makes any sense at all! Predictably I haven't had any luck, and I feel like I'm missing a really obvious step or over-complicating things, so any help would be greatly appreciated. Thanks.
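The lookup I am describing could probably be written with match() rather than ifelse(). This is only a sketch of the idea: dat.filled and the loop variable tst are names made up for illustration, and it assumes one result per id/test pair (match() only picks the first matching row).

```r
dat.1 <- data.frame(id = c(1, 1, 1, 2, 2, 2),
                    test = c("test.1", "test.2", "test.3"),
                    result = c(1, 2, 1, 2, 2, 1))

# Start from the ids only; a column is added for each test found in the
# data, so new tests need no change to the code.
dat.filled <- data.frame(id = unique(dat.1$id))

for (tst in as.character(unique(dat.1$test))) {
  # Rows of the long data belonging to this test.
  rows <- dat.1[dat.1$test == tst, ]
  # match() finds each id's row for this test; ids without one get NA.
  dat.filled[[tst]] <- rows$result[match(dat.filled$id, rows$id)]
}
dat.filled
```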
EDIT: Thanks for the comments. reshape has begun to be helpful; however, I think I over-simplified the example above. I have put a new example dataset below:
dat.4<-data.frame(id=c(1,1,1,1,1,1,2,2,2),result=c(1,1,1,2,2,2,3,3,3),
test=c("test.1","test.2","test.3"),result=c(1,2,1,2,2,2,2,2,1))
dat.4
id result test result.1
1 1 1 test.1 1
2 1 1 test.2 2
3 1 1 test.3 1
4 1 2 test.1 2
5 1 2 test.2 2
6 1 2 test.3 2
7 2 3 test.1 2
8 2 3 test.2 2
9 2 3 test.3 1
So each ID (actually a sample ID) has had an initial test which qualified it for these further tests, and that initial test can have a single outcome or multiple outcomes. As such, for the example above the eventual data structure would look like this:
dat.3<-data.frame(id=c(1,1,2),result=c(1,2,3),test.1=c(1,2,2),test.2=c(2,2,2),
test.3=c(1,2,1))
dat.3
id result test.1 test.2 test.3
1 1 1 1 2 1
2 1 2 2 2 2
3 2 3 2 2 1
So really what I am looking for is a reshape keyed on two columns (id and result). Does this make sense?
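For what it's worth, this is a sketch of the two-key version as I understand it, using base R's reshape() with both columns passed as idvar. The name dat.wide is illustrative, and I am assuming each id/result pair has at most one row per test.

```r
# Second example dataset; data.frame() renames the duplicated 'result'
# column to 'result.1' automatically.
dat.4 <- data.frame(id = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
                    result = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                    test = c("test.1", "test.2", "test.3"),
                    result = c(1, 2, 1, 2, 2, 2, 2, 2, 1))

# Both 'id' and 'result' identify a row of the wide table;
# 'result.1' supplies the values spread across the test columns.
dat.wide <- reshape(dat.4, idvar = c("id", "result"), timevar = "test",
                    v.names = "result.1", direction = "wide")

# New columns are named "result.1.test.1" etc.; strip the prefix
# and reset the row names.
names(dat.wide) <- sub("^result\\.1\\.", "", names(dat.wide))
rownames(dat.wide) <- NULL
dat.wide
```

If the assumption holds, this should give one row per id/result combination, which is the layout shown in dat.3 above.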