I'm trying to build a clustering model with Edward or TensorFlow Probability for data whose features are grouped count values.
The data looks like this:

{'data_point_1': [[0, 1], [1, 2], [3, 1], [2, 1]],
'data_point_2': [[1, 3], [2, 8], [2, 2], [5, 1]],
...
}

In the example above, each data point has four groups. There are correlations both within each group and between groups.
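To make the layout concrete, here is a minimal sketch of how I stack the data into a single count array. It assumes every data point has the same number of groups and the same number of counts per group. The helper name `to_count_array` is hypothetical, only the two data points shown above are used, and I am assuming `x_train` is meant to hold the data in this form.

```python
import numpy as np

# Toy data: only the two data points shown above (the rest is elided).
data = {
    "data_point_1": [[0, 1], [1, 2], [3, 1], [2, 1]],
    "data_point_2": [[1, 3], [2, 8], [2, 2], [5, 1]],
}


def to_count_array(points):
    """Stack per-point group lists into an int array of shape [N, groups, counts].

    Hypothetical helper; assumes all data points share the same group layout.
    """
    names = sorted(points)
    counts = np.array([points[name] for name in names], dtype=np.int64)
    return names, counts


names, x_train = to_count_array(data)
print(x_train.shape)  # (number of points, groups per point, counts per group)

# Flattened view: one row of counts per data point, with the groups laid side by side.
flat = x_train.reshape(len(names), -1)
print(flat)
```

The three-dimensional form keeps the group structure explicit, while the flattened form is what a model that ignores the grouping would see.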

Please let me know how to write a clustering model in Edward for this case.

I have tried several models, and none of them works, for two reasons. First, I don't know what the proper model definition is. Second, I don't know how to express it in Edward. The following code is one simple attempt; to be honest, it doesn't make sense to me on either count.

import edward as ed
from edward.models import Dirichlet, Categorical, Mixture, Beta
import tensorflow as tf

cluster_number = 2
d_size = 10

r = Dirichlet(tf.ones(cluster_number))
z = Categorical(r)
for i in range(d_size):
    exec(f"p_{i} = Beta(0.5, 0.5, sample_shape=cluster_number)")

latent_variable = tf.Variable(tf.zeros([d_size, 1]))
latent_variable_1 = tf.Variable(tf.zeros([d_size, 2]))

layer_1 = tf.reshape(tf.concat([eval(f"p_{i}") for i in range(d_size)], axis=0),
                     shape=[cluster_number, d_size])
layer_2 = tf.matmul(tf.cast(layer_1, dtype=tf.float32), latent_variable)

components = [layer_2] * cluster_number
m = Mixture(z, components)
out = tf.matmul(m, latent_variable_1)


T = 500
qr = Dirichlet(tf.ones(cluster_number))
qz = Categorical(tf.ones(cluster_number))
qp = Beta(tf.ones(cluster_number), tf.ones(cluster_number))

inference = ed.Gibbs({r: qr, z: qz, p: qp},
                     data={out: x_train})
inference.run()
Shuhei Kishi