I need to know how to use HMM on top of Apache Spark. It's not present in MLlib. Are there any alternatives?
Thanks
Elsayed
The best I can find is a 2-year-old implementation on Spark.
You might want to investigate using something other than Spark or HMM, or just bite the bullet and implement it yourself. Implementing the Viterbi algorithm is not particularly hard; here is my implementation from many years ago.
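To give a sense of what "implement it yourself" involves, here is a minimal single-machine sketch of Viterbi decoding in plain Python. It is not the implementation referred to above. The function name `viterbi`, the dict-based model layout, and the toy Healthy/Fever numbers are assumptions made purely for illustration.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (log-probability, path) of the most likely hidden state sequence."""

    def log(p):
        # Work in log space so long sequences do not underflow.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][obs[t]])
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Toy model (illustrative numbers only).
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

log_p, path = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_p))
```

The recursion itself is sequential in time, so it does not parallelize within one sequence. If your data consists of many independent sequences, one option would be to call a function like this once per sequence inside an RDD `map`, but I have not tested that on a cluster.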
HMM algorithm, excerpts from https://en.wikipedia.org/wiki/Hidden_Markov_model:
A Hidden Markov Model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (i.e. hidden) states. The hidden Markov model can be represented as the simplest dynamic Bayesian network. A hidden Markov model can be considered a generalization of a mixture model where the hidden variables (or latent variables), which control the mixture component to be selected for each observation, are related through a Markov process rather than independent of each other. Applying the principle of dynamic programming, this problem [computing the probability of an observed sequence], too, can be handled efficiently using the forward algorithm.
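As a rough illustration of the forward algorithm mentioned in the excerpt, here is a minimal plain-Python sketch that computes the probability of an observation sequence by summing over all hidden state paths. The function name `forward`, the dict-based model layout, and the toy numbers are assumptions for illustration and do not come from any Spark or Mahout API.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) under the HMM, summed over all hidden state paths."""
    # alpha[s]: probability of the observations so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Toy model (illustrative numbers only).
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

prob = forward(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(prob)
```

Each step costs on the order of (number of states)^2, instead of enumerating every possible state path. For long sequences the raw probabilities underflow, so a real implementation would rescale `alpha` at each step or work in log space.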
I have not seen algorithms around the above concepts implemented on Spark. Spark can support "beyond map-reduce" algorithms, but the only thing with dynamic programming I could find was https://github.com/bbengfort/brisera
A Python implementation of a distributed seed and reduce algorithm (similar to BlastReduce and CloudBurst) that utilizes RDDs (resilient distributed datasets) to perform fast iterative analyses and dynamic programming without relying on "chained MapReduce jobs".
Mahout has an HMM implementation, but I am unsure whether it is distributed:
https://mahout.apache.org/users/classification/hidden-markov-models.html