Questions tagged [dirichlet]

The Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by a vector of positive reals. It is the multivariate generalization of the beta distribution. Dirichlet distributions are commonly used as prior distributions in Bayesian statistics because the Dirichlet distribution is the conjugate prior of the categorical distribution and the multinomial distribution.
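
To make the conjugacy property concrete, here is a minimal sketch in Python with NumPy. It assumes a three-category problem with a flat prior and made-up counts; the helper names `dirichlet_posterior` and `dirichlet_mean` are hypothetical and not part of any library. Under a Dirichlet prior with concentration vector alpha and observed multinomial counts n, the posterior is again Dirichlet, with parameters alpha + n.

```python
import numpy as np


def dirichlet_posterior(prior_alpha, counts):
    """Posterior concentration parameters: prior alpha plus observed counts."""
    prior_alpha = np.asarray(prior_alpha, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if prior_alpha.shape != counts.shape:
        raise ValueError("prior_alpha and counts must have the same shape")
    return prior_alpha + counts


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: alpha normalized to sum to 1."""
    alpha = np.asarray(alpha, dtype=float)
    return alpha / alpha.sum()


# Illustrative values only (assumed, not taken from any question on this page).
prior_alpha = [1.0, 1.0, 1.0]   # flat prior over three categories
counts = [5, 3, 2]              # observed category counts

posterior_alpha = dirichlet_posterior(prior_alpha, counts)
posterior_mean = dirichlet_mean(posterior_alpha)

# Draw probability vectors from the posterior; each row sums to 1.
rng = np.random.default_rng(0)
samples = rng.dirichlet(posterior_alpha, size=1000)
```

With these assumed inputs the posterior parameters are [6, 4, 3], so the posterior mean is [6/13, 4/13, 3/13], roughly [0.46, 0.31, 0.23].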

93 questions
29
votes
5 answers

Understanding LDA implementation using gensim

I am trying to understand how the gensim package in Python implements Latent Dirichlet Allocation. I am doing the following: Define the dataset documents = ["Apple is releasing a new product", "Amazon sells many things", …
visakh
  • 2,503
  • 8
  • 29
  • 55
14
votes
1 answer

R Supervised Latent Dirichlet Allocation Package

I'm using this LDA package for R. Specifically, I am trying to do supervised latent Dirichlet allocation (sLDA). In the linked package, there's an slda.em function. However, what confuses me is that it asks for alpha, eta and variance parameters. As…
Alex R.
  • 1,397
  • 3
  • 18
  • 33
6
votes
3 answers

How to get N random integer numbers whose sum is equal to M

I want to make a list of N random integers whose sum is equal to M. I have used numpy and its dirichlet function in Python, but this generates an array of random floating-point numbers; I would like to generate random integers. import numpy as np…
Julian Solarte
  • 555
  • 6
  • 29
5
votes
0 answers

LDA Gensim/Mallet documentation on alpha

I'm a little bit confused about the comments on alpha in the documentation of LDA (Gensim). In the "regular" Gensim LdaModel it says that if one sets alpha = 'asymmetric', Gensim uses a "fixed normalized asymmetric prior of 1.0 / topicno" (topicno…
Stockfish
  • 183
  • 1
  • 8
5
votes
1 answer

pymc3: Dirichlet with multidimensional concentration factor

I am struggling with implementing a model where the concentration factor of the Dirichlet variable is dependent on another variable. The situation is the following: A system fails due to faulty components (there are three components, only one fails…
Hugo
  • 53
  • 4
5
votes
1 answer

Python package: MLE for Dirichlet distribution

I was wondering if someone knows of a Python package that implements MLE to estimate the parameters of a Dirichlet distribution.
cryp
  • 2,285
  • 3
  • 26
  • 33
5
votes
1 answer

Is there an R package for learning a Dirichlet prior from counts data

I'm looking for an R package which can be used to train a Dirichlet prior from counts data. I'm asking for a colleague who's using R, and don't use it myself, so I'm not too sure how to look for packages. It's a bit hard to search for, because…
Alex Coventry
  • 68,681
  • 4
  • 36
  • 40
5
votes
4 answers

Document similarity

I used tf/idf to calculate cosine similarity between two documents. It has some limitations and does not perform very well. I looked into LDA (latent Dirichlet allocation) to calculate document similarity. I don't know much about this. I couldn't…
user238384
  • 2,396
  • 10
  • 35
  • 36
4
votes
2 answers

LDA and topic model

I have studied LDA and topic models for several weeks, but due to my poor mathematical ability, I cannot fully understand the inner algorithms. I used the GibbsLDA implementation, input a lot of documents, and set the number of topics to 100. I got a file…
ShenYi
  • 121
  • 1
  • 7
4
votes
1 answer

What does numpy.random.dirichlet do?

I need a Dirichlet distribution and I am using numpy.random.dirichlet. When I give alpha=[1,1,1,1], according to the Dirichlet PDF formula it should result in a uniform distribution, but it doesn't give me a uniform vector. Does anybody know why?
bbb
  • 111
  • 3
  • 8
4
votes
0 answers

3D Dirichlet ternary plot

I am trying to create a 3D (4D?) Dirichlet probability density function plot similar to one of these (from Wikipedia): My data consists of 3 columns and 100000 rows, where each row sums to 1, obtained from the rdirichlet function. I can create a…
flee
  • 1,253
  • 3
  • 17
  • 34
4
votes
1 answer

DP-GMM and online cluster assignment

I expected scikit-learn's DP-GMM to allow for online updates of cluster assignments given new data, but sklearn's implementation of DP-GMM only has a fit method. My understanding of variational inference is still unclear and I think that the inability…
rafaelvalle
  • 6,683
  • 3
  • 34
  • 36
4
votes
0 answers

JAGS - unable to find appropriate sampler

I am trying to develop a hierarchical Dirichlet-multinomial process hidden Markov model in JAGS to estimate multiparty, primary voting intention based on opinion poll results. I also use the primary vote estimate to calculate a two-party preferred…
Mark Graph
  • 4,969
  • 6
  • 25
  • 37
4
votes
1 answer

Dirichlet-Multinomial WinBUGS code

I'm trying to code a Dirichlet-multinomial model using BUGS. Basically I have 18 regions and 3 categories per region. For example, Region 1: 0.50 belongs to Low, 0.30 belongs to Middle, and 0.20 belongs to High. The list goes on to Region 18 of…
user3764358
  • 41
  • 1
  • 2
4
votes
0 answers

cdf for Dirichlet distribution

I want to run an estimation assuming that my variables are distributed according to the Dirichlet distribution. To do so, I need to use the cdf function. For all the distributions in R, there are the respective r, p and d functions that produce…
KGeor
  • 107
  • 6