There are three issues we need to solve to be able to use these data with TraMineR.
1. Time must be discrete because it is used to determine positions, or differences between positions, in a discrete sequence. One solution here is to convert hours into quarter hours.
2. The only time information provided is Hours Served, i.e. durations. We need additional information (or assumptions) to transform these durations into start and end times. I will assume each individual (ID) is observed from time 1 and that the hours served are consecutive. Thus, the begin time is 1 for the first spell and the previous end time plus 1 for each later spell. The end time is the spell duration for the first spell and the previous end time plus the spell duration for each later spell.
3. There are three categorical variables and it is not clear which should be used as the status variable. I will assume that the status is the interaction between the Program Area and the organization number.
The code below illustrates these transformations:
library(TraMineR)
dat <- read.table(header=TRUE, text="
ID Program.Area Impact.Area Hours.Served Organization.Served x
1 Tutoring Education 2 org 1
1 Hunger Basic.Needs .25 org 2
1 Gardening Beautification 1 org 3
2 Tutoring Education 2 org 4
3 Hunger Basic.Needs 3 org 2
3 Hunger Basic.Needs 1 org 2
4 Tutoring Education 1.5 org 1
4 Tutoring Education 1.5 org 1
4 Tutoring Education 2 org 4
5 Hunger Basic.Needs 1 org 2
5 Hunger Basic.Needs 1 org 5
")
## Need discrete time (quarter hours)
dat[,4] <- 4*dat[,4]
names(dat)[4] <- "Quarter.Hours.Served"
## Computing begin and end times assuming Hours.Served
## are consecutive and first spells start at 1.
k <- ncol(dat) + 1
dat[,k] <- 1
dat[,k+1] <- dat[,4]
names(dat)[k] <- "Begin"
names(dat)[k+1] <- "End"
for (i in 2:nrow(dat)) {
  ## Same ID as the previous row: continue from the previous end time
  if (dat[i-1,1]==dat[i,1]) {
    dat[i,k] <- dat[i-1,k+1] + 1
    dat[i,k+1] <- dat[i,4] + dat[i-1,k+1]
  }
}
## Status as interaction between Program Area and org number
dat[,k+2] <- interaction(dat[,2],dat[,"x"])
names(dat)[k+2] <- "Status"
dat[,c(1,k,k+1,k+2)]
# ID Begin End Status
# 1 1 1 8 Tutoring.1
# 2 1 9 9 Hunger.2
# 3 1 10 13 Gardening.3
# 4 2 1 8 Tutoring.4
# 5 3 1 12 Hunger.2
# 6 3 13 16 Hunger.2
# 7 4 1 6 Tutoring.1
# 8 4 7 12 Tutoring.1
# 9 4 13 20 Tutoring.4
# 10 5 1 4 Hunger.2
# 11 5 5 8 Hunger.5
## Transforming spell data into STS form
## and creating the state sequence object
s.dat <- seqformat(dat[,c(1,k,k+1,k+2)], from="SPELL", to="STS",
                   limit=max(dat[,k+1]))
seq <- seqdef(s.dat, cnames=1:20)
print(seq, format="SPS")
# Sequence
# 1 (Tutoring.1,8)-(Hunger.2,1)-(Gardening.3,4)
# 2 (Tutoring.4,8)
# 3 (Hunger.2,16)
# 4 (Tutoring.1,12)-(Tutoring.4,8)
# 5 (Hunger.2,4)-(Hunger.5,4)
seqiplot(seq)
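
As a side note, the begin and end times could also be computed without an explicit loop. The following is only a minimal base R sketch under the same assumptions (spells are consecutive within each ID and the first spell starts at 1). The data frame `spells` and its column `dur` are made up for illustration and hold durations already expressed in quarter hours.

```r
## Hypothetical toy data: durations in quarter hours, rows sorted by ID
spells <- data.frame(
  ID  = c(1, 1, 1, 2, 3, 3),
  dur = c(8, 1, 4, 8, 12, 4)
)

## End time: running total of the durations within each ID
spells$End <- ave(spells$dur, spells$ID, FUN = cumsum)

## Begin time: end time minus the spell duration, plus 1
spells$Begin <- spells$End - spells$dur + 1

spells[, c("ID", "Begin", "End")]
```

For the first three IDs this gives the same Begin and End values as the loop above. It relies on `ave()` applying `cumsum` separately within each ID group.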
