
I'm looking for an overview of the state-of-the-art methods that

  • find temporal patterns (of arbitrary length) in temporal data

  • and are unsupervised (no labels).

In other words, given a stream/sequence of (potentially high-dimensional) data, how do you find the common subsequences that best capture the structure in the data?

  1. Any pointers to recent developments or papers (that go beyond HMMs, hopefully) are welcome!

  2. Is this problem maybe well-understood in a more specific application domain, like

    • motion capture
    • speech processing
    • natural language processing
    • game action sequences
    • stock market prediction?

  3. In addition, are some of these methods general enough to deal with
    • highly noisy data
    • hierarchical structure
    • irregular spacing on the time axis?

(I'm not interested in detecting known patterns, nor in classifying or segmenting the sequences.)

schaul
  • user1149913's answer is very useful, but I am still looking for alternative methods, maybe outside of HMMs... so keep posting! – schaul Aug 16 '12 at 21:52

2 Answers


There has been a lot of recent emphasis on non-parametric HMMs, on extensions to infinite state spaces, and on factorial models, which explain an observation using a set of factors rather than a single mixture component.

Here are some interesting papers to start with (just google the paper names):

  • "Beam Sampling for the Infinite Hidden Markov Model"
  • "The Infinite Factorial Hidden Markov Model"
  • "Bayesian Nonparametric Inference of Switching Dynamic Linear Models"
  • "Sharing features among dynamical systems with beta processes"

The experiments sections of these papers discuss applications in text modeling, speaker diarization, and motion capture, among other things.

user1149913
  • Thanks a lot! The last two of these references look particularly interesting. I'll report back after some more in-depth reading. – schaul Aug 09 '12 at 22:16
  • user1149913: I have a follow-up question: It appears to me that these methods only work if they can model the whole data. Do you know of related methods that find temporal patterns, even if they are embedded in hard-to-model noise? – schaul Aug 16 '12 at 21:49
  • You could try looking at "Profile HMMs" (www.cs.princeton.edu/~mona/Lecture/HMM1.pdf is pretty good). I have not had much success training these, though, unless the patterns are very clear. Some of the speech recognition literature ("Statistical methods for speech recognition" - Jelinek) discusses HMMs with similar restrictions on transitions. – user1149913 Aug 17 '12 at 17:52
  • Your best bet might be to manually restrict the HMM transition matrix to capture the type of structure you are looking for (see HMMER). – user1149913 Aug 17 '12 at 17:54
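
As an illustration of what "restricting the transition matrix" can look like, here is a minimal sketch, assuming a left-to-right topology in which each state may only repeat or advance to the next one. The function name `left_to_right_transitions` and the parameter `p_stay` are made up for this example; this is not how HMMER or any particular library is configured.

```python
import numpy as np

def left_to_right_transitions(n_states, p_stay=0.6):
    """Build a transition matrix where state i can only stay in i or move to i + 1.

    p_stay is an illustrative starting value, not a recommended setting.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay            # self-transition (pattern element repeats)
        A[i, i + 1] = 1.0 - p_stay  # advance to the next pattern element
    A[-1, -1] = 1.0                 # last state is absorbing
    return A

A = left_to_right_transitions(4)
print(A)
```

If a matrix like this is used to initialise Baum-Welch (EM) training, the entries that start at zero stay at zero, because the expected transition counts are proportional to the current transition probabilities. The imposed structure therefore survives training.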

I don't know what kind of data you are analysing, but I would suggest (from a dynamical systems analysis point of view) taking a look at:

  • Recurrence plots (easily found by googling)
  • Time-delay embedding (may unfold potential relationships between the different dimensions of the data) + distance matrix (study neighborhood patterns, maybe?)

Note that this is just another way to represent your data, and analyse it based on this new representation. Just a suggestion!
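
To make the two ideas concrete, here is a rough sketch, assuming a one-dimensional series and NumPy only. The helper names `delay_embed` and `recurrence_matrix`, the toy sine data, and the values of `dim`, `tau` and `eps` are all illustrative choices, not recommendations.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding: row t is (x[t], x[t + tau], ..., x[t + (dim - 1) * tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim/tau")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

def recurrence_matrix(points, eps):
    """Binary recurrence matrix: R[i, j] = 1 when points i and j are closer than eps."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    return (dist < eps).astype(int)

# Toy data: a noisy sine wave, purely for illustration.
rng = np.random.default_rng(0)
t = np.arange(200)
series = np.sin(2 * np.pi * t / 25) + 0.05 * rng.standard_normal(t.size)

embedded = delay_embed(series, dim=3, tau=6)
R = recurrence_matrix(embedded, eps=0.3)
print(embedded.shape, R.shape)
```

Plotting `R` as an image gives the recurrence plot. Line segments parallel to the main diagonal mark stretches where the trajectory revisits an earlier stretch, i.e. candidate repeated subsequences.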

lllllll