I am using TensorFlow on the Handwritten A_Z dataset in a Kaggle Kernel.

I've used two conv layers, each followed by a max-pool layer, then reshaped the result into full_1 with shape (-1, 7*7*64), passed it to a fully connected layer (full_2, to which I applied dropout), and connected that to a layer named last of shape (None, 26), which gives the predicted output representing the 26 letters of the English alphabet.

CONV->MAXPOOL->CONV->MAXPOOL->reshaped(named full_1)->FULLY_CONNECTED(full_2)->OUTPUT( last )

In earlier runs, training reported numeric accuracy values, but later it started giving NaNs for some unknown reason.

Also, the accuracy never increased much during training and stayed very low, which makes me wonder whether I have applied the convolutional network correctly, since the network should learn and become more accurate as more batches are processed. Is the low accuracy due to having too few layers and a model that is not complex enough?

Also, I am doubtful about the tf.nn.softmax_cross_entropy_with_logits(labels=output,logits=last) statement in my code, because the relu function has already been applied to last, the output layer of my conv net, which is used above as logits.

The error says: FailedPreconditionError: Attempting to use uninitialized value W_4

The code is:

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn import preprocessing

import copy
import warnings
warnings.filterwarnings('ignore')

#dataset=pd.read_csv('/Users/ajay/Documents/IpyNote/A_Z Handwritten Data.csv')
dataset=pd.read_csv('../input/handwritten_data_785.csv')

#print(dataset.head(3))
#print(dataset.info())


dataset['0'].unique()
dataset=dataset.astype('float32')
X=copy.deepcopy(dataset)
X.head(1)
Y=X.loc[:,'0']

#print(Y.head(3))
Y=Y.astype('int64')
s=pd.get_dummies(Y)

list(s)

Y=s
Y=Y.astype('float32')
Y.head(2)

X.drop('0',axis=1,inplace=True)
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.25,stratify=Y)

input=tf.placeholder(dtype=tf.float32,shape=(None,28*28))
output=tf.placeholder(dtype=tf.float32,shape=(None,26))
W1=tf.Variable(tf.truncated_normal(shape=(5,5,1,32)),name='W')#28,28,32
b1=tf.Variable(tf.truncated_normal(shape=(1,32)),name='b')#14,14,32

W2=tf.Variable(tf.truncated_normal(shape=(5,5,32,64)),name='W')#14,14,64
b2=tf.Variable(tf.truncated_normal(shape=(1,64)),name='b')#7,7,64

W3=tf.Variable(tf.truncated_normal(shape=(7*7*64,1024)),name='W')
b3=tf.Variable(tf.truncated_normal(shape=(1,1024)),name='b')

W4=tf.Variable(tf.truncated_normal(shape=(1024,26)),name='W')
b4=tf.Variable(tf.truncated_normal(shape=(1,26)),name='b')

def conv(input,W,b):
    return tf.nn.relu(tf.nn.conv2d(input=input,filter=W,strides=(1,1,1,1),padding='SAME')+b)

def maxpool(x):
    return tf.nn.max_pool(value=x,ksize=(1,2,2,1),strides=(1,2,2,1),padding='SAME')

def full_connected(x,W,b):
    return tf.nn.relu(tf.matmul(x,W)+b)

p=tf.reshape(input,[-1,28,28,1])


conv_1=conv(p,W1,b1)
print('conv_1.shape',conv_1.shape)
maxpool_1=maxpool(conv_1)
print('maxpool_1.shape',maxpool_1.shape)
conv_2=conv(maxpool_1,W2,b2)
print('conv_2.shape',conv_2.shape)
maxpool_2=maxpool(conv_2)
print('maxpool_2.shape',maxpool_2.shape)

full_1=tf.reshape(maxpool_2,[-1,7*7*64])
full_2=full_connected(full_1,W3,b3)#full_1->full_2
print('full_2.shape',full_2.shape)

keep_prob=tf.placeholder(tf.float32)
full_2_dropout=tf.nn.dropout(full_2,keep_prob)

last=full_connected(full_2_dropout,W4,b4)
last = tf.clip_by_value(last, 1e-10, 0.9999999)

print('last.shape',last.shape)
loss=tf.nn.softmax_cross_entropy_with_logits(labels=output,logits=last)#loss=tf.nn.softmax(logits=last)

train_step=tf.train.AdamOptimizer(0.005).minimize(loss)
accuracy=tf.reduce_mean(tf.cast(tf.equal(tf.argmax(output,1), tf.argmax(last,1) ) , tf.float32))
init=tf.global_variables_initializer()


with tf.Session() as sess:
    epoch=1
    n_iterations=10
    sess.run(init)
    for i in range(n_iterations):
        j=i*50
        k=i*50+50
        print('j=',j,'k=',k)
        x = X_train.iloc[i*50:j,:]
        y = Y_train.iloc[i*50:j,:]
        #sess.run(accuracy,feed_dict={input:X_train,output:Y_train,keep_prob:1.0})
        print('Train_accuracy : ',sess.run(accuracy, feed_dict={input: x, output: y,keep_prob:1.0}))
        sess.run(train_step,feed_dict={input:x,output:y,keep_prob:1.0})

with tf.Session() as sess:
    n_iterations=20
    for i in range(n_iterations):
        j=i*50
        k=i*50+50
        print('j=',j,'k=',k)
        x = X_test.iloc[i*50:j,:]
        y = Y_test.iloc[i*50:j,:]
        print('Test_accuracy : ',sess.run(accuracy, feed_dict={input: x, output: y,keep_prob:1.0}))

The output and error look like this:

conv_1.shape (?, 28, 28, 32)
maxpool_1.shape (?, 14, 14, 32)
conv_2.shape (?, 14, 14, 64)
maxpool_2.shape (?, 7, 7, 64)
full_2.shape (?, 1024)
last.shape (?, 26)
j= 0 k= 50
Train_accuracy :  nan
j= 50 k= 100
Train_accuracy :  nan
j= 100 k= 150
Train_accuracy :  nan
j= 150 k= 200
Train_accuracy :  nan
j= 200 k= 250
Train_accuracy :  nan
j= 250 k= 300
Train_accuracy :  nan
j= 300 k= 350
Train_accuracy :  nan
j= 350 k= 400
Train_accuracy :  nan
j= 400 k= 450
Train_accuracy :  nan
j= 450 k= 500
Train_accuracy :  nan
j= 0 k= 50

---------------------------------------------------------------------------
FailedPreconditionError                   Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1329     try:
-> 1330       return fn(*args)
   1331     except errors.OpError as e:

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1314       return self._call_tf_sessionrun(
-> 1315           options, feed_dict, fetch_list, target_list, run_metadata)
   1316 

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1422             self._session, options, feed_dict, fetch_list, target_list,
-> 1423             status, run_metadata)
   1424 

/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    515             compat.as_text(c_api.TF_Message(self.status.status)),
--> 516             c_api.TF_GetCode(self.status.status))
    517     # Delete the underlying status object from memory otherwise it stays alive

FailedPreconditionError: Attempting to use uninitialized value W_4
     [[Node: W_4/read = Identity[T=DT_FLOAT, _class=["loc:@W_4"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](W_4)]]

During handling of the above exception, another exception occurred:

FailedPreconditionError                   Traceback (most recent call last)
<ipython-input-2-496ec024fd3b> in <module>()
    114         x = X_test.iloc[i*50:j,:]
    115         y = Y_test.iloc[i*50:j,:]
--> 116         print('Test_accuracy : ',sess.run(accuracy, feed_dict={input: x, output: y,keep_prob:1.0}))

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    906     try:
    907       result = self._run(None, fetches, feed_dict, options_ptr,
--> 908                          run_metadata_ptr)
    909       if run_metadata:
    910         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1141     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1142       results = self._do_run(handle, final_targets, final_fetches,
-> 1143                              feed_dict_tensor, options, run_metadata)
   1144     else:
   1145       results = []

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1322     if handle is None:
   1323       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1324                            run_metadata)
   1325     else:
   1326       return self._do_call(_prun_fn, handle, feeds, fetches)

/opt/conda/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1341         except KeyError:
   1342           pass
-> 1343       raise type(e)(node_def, op, message)
   1344 
   1345   def _extend_graph(self):

FailedPreconditionError: Attempting to use uninitialized value W_4
     [[Node: W_4/read = Identity[T=DT_FLOAT, _class=["loc:@W_4"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](W_4)]]

Caused by op 'W_4/read', defined at:
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/opt/conda/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/opt/conda/lib/python3.6/site-packages/zmq/eventloop/ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/opt/conda/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "/opt/conda/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "/opt/conda/lib/python3.6/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/opt/conda/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2698, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2802, in run_ast_nodes
    if self.run_code(code, result):
  File "/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-496ec024fd3b>", line 42, in <module>
    W1=tf.Variable(tf.truncated_normal(shape=(5,5,1,32)),name='W')#28,28,32
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 235, in __init__
    constraint=constraint)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 397, in _init_from_args
    self._snapshot = array_ops.identity(self._variable, name="read")
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 142, in identity
    return gen_array_ops.identity(input, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3052, in identity
    "Identity", input=input, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3306, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1669, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value W_4
     [[Node: W_4/read = Identity[T=DT_FLOAT, _class=["loc:@W_4"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](W_4)]]

asn

2 Answers

Reason for the accuracy being NaN: you split the data into X_train and X_test with train_test_split, which shuffles the rows, so the indices of the training set are no longer in order. When you then feed X_train batch by batch, the rows for [0:50] are not found, and you end up feeding nothing to your model.

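Before training the model, do this:

X_train = X_train.reset_index(drop=True)
Y_train = Y_train.reset_index(drop=True)
X_test = X_test.reset_index(drop=True)
Y_test = Y_test.reset_index(drop=True)

This resets the indices, and drop=True prevents the original indices from being added as another column in the resulting DataFrame. Note that reset_index returns a new DataFrame rather than modifying the existing one, so the result has to be assigned back.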
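
Separately from the index issue, it may be worth checking the slice bounds in the question's loop. The batches are taken with `iloc[i*50:j, :]` where `j = i*50`, so start and stop are the same position. `.iloc` slices by position and is half-open, like a Python list slice, so that slice would be expected to select zero rows, and the mean of an empty accuracy tensor is NaN. The sketch below illustrates this with a plain list standing in for the rows of X_train; `batch_bounds` and `rows` are hypothetical names, not part of the original code.

```python
def batch_bounds(i, batch_size=50):
    """Return the half-open (start, stop) positions of batch number i."""
    start = i * batch_size
    return start, start + batch_size


rows = list(range(500))  # stand-in for the rows of X_train

# Slice as written in the question: both bounds equal i*50.
i = 3
j = i * 50
as_written = rows[i * 50:j]

# Slice that uses a distinct stop position (k = i*50 + 50 in the question).
start, stop = batch_bounds(i)
intended = rows[start:stop]

print("rows selected as written:", len(as_written))
print("rows selected with start/stop:", len(intended))
```

In the question's loop, the equivalent change would be to slice with `iloc[j:k, :]`, since `j` and `k` are already computed there.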

As for the weights and biases: do not open a second session to test the model. The trained variable values live only in the session that trained them, so in a new session they are uninitialized, hence the error Attempting to use uninitialized value W_4.

You can also try saving your variables for the sake of convenience.
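
A minimal sketch of saving and restoring variables with `tf.train.Saver`, assuming a TensorFlow build that provides the `tensorflow.compat.v1` module (on the 1.x release used in the question, a plain `import tensorflow as tf` without the `disable_eager_execution` call would be the rough equivalent). The variable `w`, the op `bump`, and the checkpoint path are hypothetical stand-ins for the question's weights and training step.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph/session mode, as in the question

graph = tf.Graph()
with graph.as_default():
    w = tf.Variable(1.0, name="w")   # stand-in for a trained weight
    bump = tf.assign_add(w, 1.0)     # stand-in for a training step
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()

# Hypothetical checkpoint location in a temporary directory.
ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# "Training" session: initialize, update, then save the variable values.
with tf.Session(graph=graph) as sess:
    sess.run(init)
    sess.run(bump)
    trained_value = sess.run(w)
    saver.save(sess, ckpt_path)

# "Testing" session: restore instead of re-initializing.
with tf.Session(graph=graph) as sess:
    saver.restore(sess, ckpt_path)
    restored_value = sess.run(w)

print(trained_value, restored_value)
```

Restoring replaces the call to the initializer in the second session; running the initializer there instead would reset the variables to fresh random values rather than the trained ones.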

Also, for the logits part, refer to this: here
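
On the logits question, here is a pure-Python sketch (not TensorFlow code) of why the ReLU plus `tf.clip_by_value(last, 1e-10, 0.9999999)` in the question may hold accuracy down. `softmax_cross_entropy_with_logits` applies softmax itself, so it expects unbounded scores. If every logit is squeezed into roughly [0, 1], then even in the best case the softmax probability of the correct class among 26 stays near 0.098, and the loss cannot fall much below about 2.3. The helper `softmax` and the example logit values are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


NUM_CLASSES = 26
CLIP_MIN, CLIP_MAX = 1e-10, 0.9999999  # bounds used in the question

# Best case with clipped logits: correct class at the max, the rest at the min.
clipped = [CLIP_MAX] + [CLIP_MIN] * (NUM_CLASSES - 1)
p_clipped = softmax(clipped)[0]
loss_clipped = -math.log(p_clipped)

# Unbounded logits (no ReLU, no clipping) can separate the classes freely.
raw = [10.0] + [-10.0] * (NUM_CLASSES - 1)
p_raw = softmax(raw)[0]
loss_raw = -math.log(p_raw)

print("clipped: p=%.3f loss=%.3f" % (p_clipped, loss_clipped))
print("raw:     p=%.6f loss=%.6f" % (p_raw, loss_raw))
```

Under these assumptions, the usual arrangement would be to compute `last` as `tf.matmul(full_2_dropout, W4) + b4` with no activation and no clipping, and pass that directly as `logits`.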

asn

You can call: sess.run(tf.global_variables_initializer()) to initialize the variables. See this StackOverflow answer for more information about the initializer.

kww
  • Buddy, I've already used it as sess.run(init) while training. – asn Sep 08 '18 at 03:19
  • @Jacob He is right though, you know. You run `sess.run(init)` in your training, true. But then for some reason you open a second session, and in the second session you don't initialize anything, thus the error... Once `with tf.Session() as sess:` ends, the initializations are gone. – BlueSun Sep 10 '18 at 11:31
  • @BlueSun So, how should I save the initializations and why am I getting the accuracy as NaNs in the first `with tf.Session() as sess:` even when the model is getting trained and learning some weights. – asn Sep 10 '18 at 12:02
  • @kww You were right, buddy !! Do edit your answer so that I can upvote it. – asn Sep 10 '18 at 15:19
  • @Jacob no need to save the initializations, just remove the second `with tf.Session() as sess:`. The NaNs are not related to that, you can track where they occur by using `tf.Print` and `tf.is_nan` – BlueSun Sep 10 '18 at 15:28
  • Yeah, I did exactly this. But, `NaNs` are surely related to this because when I printed the `X_train` and saw the indices change after the `train_test_split` then I tried to reset the indices and it worked well. – asn Sep 10 '18 at 15:34