
I am trying to classify text data, where df['Addr'] is X and df['Reg'] is y:

                                                    Reg
Addr                                                   
640022, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, УЛ ГО...   45
624214, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ЛЕСНОЙ, РП ...   66
454018, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
624022, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, СЫСЕРТСКИЙ Р-...   66
454047, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ЧЕЛЯБИНСК, У...   74
456787, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ОЗЕРСК, УЛ Г...   74
450075, РОССИЯ, БАШКОРТОСТАН РЕСП, Г УФА, ПР-КТ...    3
623854, РОССИЯ, СВЕРДЛОВСКАЯ ОБЛ, Г ИРБИТ, УЛ С...   66
457101, РОССИЯ, ЧЕЛЯБИНСКАЯ ОБЛ, Г ТРОИЦК, УЛ С...   74
640008, РОССИЯ, КУРГАНСКАЯ ОБЛ, Г КУРГАН, ПР-КТ...   45

I am trying to use a single-layer TensorFlow network to classify the addresses, but it returns all 0s instead of the relevant regions.

I use this code:

import numpy as np
import tensorflow as tf
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

# df with the 'Addr' and 'Reg' columns is assumed to be loaded already
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['Addr'])
X = csr_matrix(X).todense()

X_train, X_test, y_train, y_test = train_test_split(X, df['Reg'].values.reshape(-1, 1), shuffle=True, test_size=0.2)

# tf
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

def random_batch(X_train, y_train, batch_size):
    rnd_indices = np.random.randint(0, X_train.shape[0], batch_size)
    X_batch = X_train[rnd_indices]
    y_batch = y_train[rnd_indices]
    return X_batch, y_batch

reset_graph()

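# placeholders for the bag-of-words inputs and the labels; y_cls is the label's class index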
X = tf.placeholder(tf.float32, shape=(None, X_train.shape[1]), name="input")
y = tf.placeholder(tf.float32, shape=(None, y_train.shape[1]), name="y")
y_cls = tf.argmax(y, axis=1)

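# parameters: a trainable weight matrix and a constant (non-trainable) bias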
weights = tf.Variable(tf.truncated_normal([X_train.shape[1], y_train.shape[1]], stddev=0.05), name="weights", trainable=True)
bias = tf.constant(1.0, shape=[y_train.shape[1]], name="bias")

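# single layer: ReLU(X·weights + bias), then softmax over its outputs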
layer_1 = tf.nn.relu_layer(X, weights, bias, name="relu_layer")
outs = tf.nn.softmax(layer_1, name="outs")
y_pred = tf.argmax(outs, axis=1)

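# mean cross-entropy loss; `predicted` counts how many predictions match the labels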
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=layer_1, labels=y)
cost = tf.reduce_mean(cross_entropy)
acc = tf.cast(tf.equal(y_pred, y_cls), tf.float16)
predicted = tf.reduce_sum(acc)

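# Adam optimizer minimizing the mean cross-entropy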
learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(cost)

init = tf.global_variables_initializer()

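# training: random batches each epoch, test loss printed every 10 epochs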
n_epochs = 100
batch_size = 500
n_batches = int(np.ceil(1000 / batch_size))

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = random_batch(X_train, y_train, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        loss_val = cost.eval({X: X_test, y: y_test})
        if epoch % 10 == 0:
            print("Epoch:", epoch, "\tLoss:", loss_val)

    y_proba_val = y_pred.eval(feed_dict={X: X_test, y: y_test})

print(y_test.reshape(1, -1))
print(y_proba_val.reshape(1, -1))

Result of this code:

Epoch: 0    Loss: 0.0
Epoch: 10   Loss: 0.0
Epoch: 20   Loss: 0.0
Epoch: 30   Loss: 0.0
...
Epoch: 90   Loss: 0.0
[[ 3 66 66 ... 66 66 66]]
[[0 0 0 ... 0 0 0]]

I can't find the error in my program. I've read that softmax is usually used in classification tasks, but I'm not confident in my approach. Why does it return all-zero predictions?
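My understanding is that softmax just rescales a vector of scores into a probability distribution, e.g. in NumPy:

scores = np.array([2.0, 1.0, 0.1])
probs = np.exp(scores) / np.exp(scores).sum()
print(probs)        # approx. [0.659 0.242 0.099]
print(probs.sum())  # 1.0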

Petr Petrov

1 Answer


I'm pretty sure that your network currently looks like this (excuse my paint skills): [image: Neural Network?]

If you're not going to come up with features for the different addresses on your own, I suggest you add at least one hidden layer so that the network can attempt to create its own features. Currently there's only one weight per connection to tweak, and that's going to result in a VERY weak classifier.
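For illustration, a minimal sketch of what that could look like in this TF 1.x setup; `n_hidden` and the variable names here are my own placeholders, not taken from your code:

n_hidden = 128  # hypothetical hidden-layer width; tune as needed

# hidden layer: lets the network learn its own intermediate features
W1 = tf.Variable(tf.truncated_normal([X_train.shape[1], n_hidden], stddev=0.05), name="W1")
b1 = tf.Variable(tf.zeros([n_hidden]), name="b1")
hidden = tf.nn.relu(tf.matmul(X, W1) + b1, name="hidden")

# output layer: one logit per class, to be fed to softmax / cross-entropy
W2 = tf.Variable(tf.truncated_normal([n_hidden, y_train.shape[1]], stddev=0.05), name="W2")
b2 = tf.Variable(tf.zeros([y_train.shape[1]]), name="b2")
logits = tf.add(tf.matmul(hidden, W2), b2, name="logits")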

I believe that's the root of the problem. I'm not entirely sure why your loss is always 0.0; I'll continue looking, but this is some food for thought.

EDIT: The logits argument is supposed to represent the predicted output of the network (a distribution of probabilities), so I'd set that to y_pred.

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=outs, labels=y)
cyniikal
  • You mean specify `cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=y_pred, labels=y)` instead of `logits=layer_1`? – Petr Petrov Aug 22 '18 at 09:57
  • Yes, that's exactly what I meant. – cyniikal Aug 22 '18 at 10:24
  • If I use `outs` there instead of `layer_1` it works, but returns zeros again, while `y_pred` gives the error `TypeError: Value passed to parameter 'features' has DataType int64 not in list of allowed values: float16, bfloat16, float32, float64` – Petr Petrov Aug 22 '18 at 10:32
  • Sorry, I mistyped, it should be `outs`, since those are the output activations for all of your output nodes. I'm going to try to figure out why your loss is always 0 now. However, I do think you should add at least one hidden layer with several nodes to the network to make it a more powerful classifier. – cyniikal Aug 22 '18 at 10:36
  • Can you help me with one more question? If I want to get predictions, what should I use? I've tried `y_pred.eval(feed_dict={X: X_test}, session=sess)` but I get `ValueError: setting an array element with a sequence.` – Petr Petrov Aug 22 '18 at 13:38
  • A couple things could be going wrong here. The list of possible answers is pretty long, so I'd suggest you look at https://stackoverflow.com/questions/4674473/valueerror-setting-an-array-element-with-a-sequence – cyniikal Aug 22 '18 at 14:11