Questions tagged [sequence-analysis]

Sequence analysis (in the social sciences) is the analysis of how people or other units of study move from one state to another (for example, single-->married-->widowed, unemployed-->employed-->retired) over the course of their lifespan.

35 questions
6
votes
4 answers

Pattern in continuous sequence data

Suppose I have a list of events, for example A, D, T, H, U, A, B, F, H, .... What I need is to find frequent patterns that occur in the complete sequence. In this problem we cannot use traditional algorithms like Apriori or FP-growth because they…
Haris
3
votes
1 answer

Regression tree size in the context of state sequence analysis using TraMineR in R

I am building a regression tree as part of a state sequence analysis, and I want the image output to have the dimensions of Letter-size paper (landscape). When I use the code I include, the regression tree appears as a separate window, always of the same…
3
votes
1 answer

TraMineR Using Weights

I am still new to TraMineR; therefore, my problem might be very simple for most of you. I am working on some sequence plots with my data and would like to see the results with the survey weights and nominal weights. I am able to import data into R…
2
votes
3 answers

TraMineR in R for sequence analysis: how to account for state order besides spell length?

I'm doing sequence analysis with TraMineR in R and I would like to take into account only the order of different spells over time. For instance, I would like the sequence A-B-A to be considered the same as A-B-B-B-A when plotting the most…
ggg
2
votes
1 answer

R: TraMineR conversion between sequence formats, SPELL to STS, without dates?

I am trying to study the volunteer trajectories of a group of individuals. My data looks something like this: ID Program Area Impact Area Hours Served Organization Served 1 Tutoring Education 2 org 1 1 Hunger …
2
votes
3 answers

How to get the largest possible column sequence with the least possible row NAs from a huge matrix?

I want to select columns from a data frame so that the resulting continuous column-sequences are as long as possible, while the number of rows with NAs is as small as possible, because they have to be dropped afterwards. (The reason I want to do…
jay.sf
2
votes
1 answer

Fitting a VLMC to very long sequences

I am trying to fit a VLMC to a dataset where the longest sequence is 296 states. I do it as shown below: # Load libraries library(PST) library(RCurl) library(TraMineR) # Load and transform data x <-…
histelheim
2
votes
1 answer

Predicting conditional probabilities based on contexts with only 1 state

It seems that PST cannot predict the conditional probabilities of the next state after contexts which consist of a single state, e.g. EX-EX. Consider this code: # Load libraries library(RCurl) library(TraMineR) library(PST) # Get data x <-…
histelheim
2
votes
1 answer

Calculate lift for context-state relationship in a probabilistic suffix tree?

PST gives me probabilities and conditional probabilities for various contexts and following states. However, it would be very helpful to be able to calculate the lift (and its significance) of the relationship between a context and a following…
histelheim
2
votes
1 answer

Where in the sequence of a Probabilistic Suffix Tree does "e" occur?

In my data, missing values (*) occur only on the right side of the sequences. That means that no sequence starts with * and no sequence has any other markers after *. Despite this, the PST (Probabilistic Suffix Tree) seems to predict a 90% chance of…
histelheim
2
votes
2 answers

Getting log-likelihood from probabilistic suffix tree

Here is my code: library(RCurl) library(TraMineR) library(PST) x <- getURL("https://gist.githubusercontent.com/aronlindberg/08228977353bf6dc2edb3ec121f54a29/raw/c2539d06771317c5f4c8d3a2052a73fc485a09c6/challenge_level.csv") data <- read.csv(text =…
histelheim
2
votes
1 answer

Detecting sequencing using regexes

Imagine I have multiple character strings in a list like this: [[1]] [1] "1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-" [2] "-1-I2-1-TR-1-" [3] "-1-I2-1-FA-1-I3-1-" [4]…
histelheim
2
votes
2 answers

How to identify sequences within each leaf from a regression tree?

Using the biofam dataset library(TraMineR) data(biofam) lab <- c("P","L","M","LM","C","LC","LMC","D") biofam.seq <- seqdef(biofam[,10:25], states=lab) head(biofam.seq) Sequence 1167…
histelheim
1
vote
2 answers

Convert long data.frame to sequence in TraMineR

I have a data.frame in long format, that I want to convert to a TraMineR sequence object. set.seed(1) df <- data.frame(year = rep(1990:2010, 3), id = rep(1:3, each = 21), value = sample(10, 63, replace =…
Maël
1
vote
1 answer

Extracting a portion of the generated Representative Sequences

So, I have a set of 893 sequences of varying lengths with max sequence length = 152. There are 10 unique states across all of them. These sequences are split into two groups: Promoted and Not Promoted. Using TraMineR, I generated representative…