
How can I combine multiple emission spectra over the same Markov states?

Let's use the classical HMM example:

% states
S = {sunny, rainy, foggy}

% discrete observations
x = {umbrella, no umbrella}

Now what if I had multiple observation sequences. E.g.:

% sequence 1
x1 = {umbrella, no umbrella}

% sequence 2
x2 = {wearing a coat, not wearing a coat}

How can I combine these two observation sequences into one HMM?

Note: I would like a way to combine x1 and x2 such that their inter-dependencies are also modelled. Therefore simply saying x={x1 x2} would (IMO) not be a good solution.


Specifically, I want to train an HMM using Matlab's hmmtrain:

[ESTTR,ESTEMIT] = hmmtrain(seq,TRGUESS,EMITGUESS)

This only allows me to insert one seq.

Now let's say I have 5 different emission spectra which all say something about the states of the HMM. How can I handle this multivariate case?

Jean-Paul

2 Answers


How about taking the Cartesian product of the possible observations from each set? That is, your new discrete emission model will be:

  • umbrella and wearing-a-coat
  • umbrella and not-wearing-a-coat
  • no-umbrella and wearing-a-coat
  • no-umbrella and not-wearing-a-coat
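As a sketch of the bookkeeping this implies (Python purely for illustration; the function name and the integer encoding are my own, not part of any HMM toolbox), two parallel observation streams can be merged into one sequence over the product alphabet, which can then be fed to a discrete-emission trainer such as hmmtrain:

```python
from itertools import product

def combine_sequences(seq_a, alphabet_a, seq_b, alphabet_b):
    """Merge two parallel observation sequences into a single sequence
    of integer symbols over the Cartesian-product alphabet."""
    combined_alphabet = list(product(alphabet_a, alphabet_b))
    index = {pair: i for i, pair in enumerate(combined_alphabet)}
    return [index[(a, b)] for a, b in zip(seq_a, seq_b)], combined_alphabet

# Two streams observed at the same time steps
umbrella = ["umbrella", "no-umbrella", "umbrella"]
coat     = ["coat", "coat", "no-coat"]

seq, alphabet = combine_sequences(
    umbrella, ["umbrella", "no-umbrella"],
    coat, ["coat", "no-coat"],
)
print(alphabet)  # the 2 x 2 = 4 combined symbols
print(seq)       # [0, 2, 1]
```

Each time step now emits exactly one symbol from the combined alphabet, so as far as the trainer is concerned it is an ordinary discrete HMM.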
Amro
  • obviously this will be a [multinomial distribution](https://en.wikipedia.org/wiki/Multinomial_distribution) which HMMTRAIN can handle – Amro Oct 30 '14 at 14:39
  • But what if I have 20 emission sequences and 4 different states per sequence? Taking the Cartesian product would explode the problem to a huge computation space... – Jean-Paul Oct 30 '14 at 15:23
  • the terminology you're using is a bit confusing... Those values I proposed above are the set of possible "observations". A **sequence** is simply a series of observations over time... In other words, at any point in time `t(i)`, if we are in **state** `s(k)` it would emit one of the above discrete **observations** according to the emission model probability distribution (think rolling a dice). Obviously you don't wanna have too many observations or too many states, otherwise you would need lots and lots of data to train the HMM model successfully (curse of dimensionality) – Amro Oct 30 '14 at 15:51
  • Let me formulate it again: Let's say I have one variable that emits the following discrete observations: {a, b, c}. I also have another variable that emits the following discrete observations: {1, 2, 3}. Both have information about the state of the DGP. And both happen at the same time (dynamic HMM). How can I combine the emission sequences of both variables to estimate the state of the DGP? – Jean-Paul Oct 30 '14 at 15:58
  • what do you mean by "variable" exactly? It is a latent/hidden state of the HMM? Let me ask you this, does the same hidden state of the HMM emit an observation from {a,b,c} and an observation from {1,2,3} simultaneously, or are each produced by a different set of states? – Amro Oct 30 '14 at 16:04
  • Each are produced by independent hidden states and each hidden state reveals information (observations) about the overall state of the whole process. However, observations might be correlated with the overall state of the whole process. Does that make sense? – Jean-Paul Oct 30 '14 at 16:06
  • If I understand correctly, in that case you would have two separate and independent HMM models, and it wouldn't make sense to combine them... Also I don't think you've concretely defined the hidden states of the process. Take the weather example you posted; the act of bringing an umbrella or not, or wearing a coat or not are both observations that reveal information about the same hidden states, namely the weather condition. Do you see the problem in your previous statement? – Amro Oct 30 '14 at 16:11
  • Yes I see. So then the two variables *do* reveal information about the same hidden state but have a different DGP. What would that leave me? – Jean-Paul Oct 30 '14 at 16:15
  • sorry, what does DGP stand for? – Amro Oct 30 '14 at 16:17
  • Data Generating Process – Jean-Paul Oct 30 '14 at 16:20
  • ok, thanks. So what's the problem with using the Cartesian product? You would be modelling states that generate tuples of observations, which is what you want, right (observing different things from the same hidden states)? If the number of observations becomes too large, then it's a data processing problem, not algorithmic -- try to pick the ones that are meaningful and drop less relevant ones. For example with the weather data, say you could also observe whether the person is happy or sad, while it is an extra piece of information, it is not as important as the other two (umbrella and coat) – Amro Oct 30 '14 at 16:28
  • you have to understand that observing an extra piece of evidence does not change the underlying process, it simply reveals extra information for you to make better inference (it has already been decided what the condition of the weather was at that point in time, and the fact the person wore a coat won't change the weather!). So the two "variables" do have the same DGP... – Amro Oct 30 '14 at 16:33
  • I think the Cartesian product makes sense. It's just that I don't know if that solution would allow for the observations to influence each other dynamically like in `coupled HMM` models. Or does it? (From the first order markov condition?) – Jean-Paul Oct 30 '14 at 16:35
  • Do you know how I could make matlab's HMM model dynamic such that I can feed it the tuples of observations while maintaining the time-dependency? – Jean-Paul Oct 30 '14 at 16:41
  • you don't have to do anything special, just combine the observations as tuples, and call it a new thing. As far as MATLAB knows, it just sees regular discrete observations. So you would have a new set of observations {O1,O2,O3,O4} where each `O_i` denotes the combinations listed in my answer above – Amro Oct 30 '14 at 16:44

What about creating preconditions to choose a specialised HMM? Instead of one huge HMM you can create several small HMMs and choose only the relevant one. For example: if (umbrella = true) then apply HMM_1, else apply HMM_2. Each HMM then also has fewer emission symbols. Nice side effect: you save training and testing time.
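A rough sketch of this gating idea (Python; the predicates and the string stand-ins for trained HMM objects are placeholders, not a real HMM API):

```python
def select_hmm(observation, models):
    """Pick the specialised HMM whose precondition matches the
    observation; `models` is a list of (predicate, hmm) pairs."""
    for precondition, hmm in models:
        if precondition(observation):
            return hmm
    raise ValueError("no matching HMM for observation")

# Strings stand in for trained HMM objects here.
models = [
    (lambda obs: obs["umbrella"], "HMM_1"),
    (lambda obs: not obs["umbrella"], "HMM_2"),
]

print(select_hmm({"umbrella": True}, models))   # HMM_1
print(select_hmm({"umbrella": False}, models))  # HMM_2
```

Note the trade-off: the gate keeps each model small, but any dependency between the gating variable and the remaining observations is no longer modelled jointly.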

Lia