I searched for a correct answer to my problem for a long time (many hours) without success, so here I am. I think I'm missing something obvious, but I can't figure out what...
Problem: use a queue to read a CSV file and train an Estimator through its input_fn without reloading the graph every time (which is very slow).
I created a custom model that gives me a model_fn function for building my own estimator:

estimator = tf.estimator.Estimator(model_fn=model_fn, params=model_params)
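(For context: my real model_fn is more complicated, but a minimal placeholder like this linear-regression one would do for the question; the learning_rate key in model_params is just an assumption here.)

# Minimal placeholder model_fn (not my real model): plain linear
# regression, assuming model_params = {"learning_rate": ...}.
def model_fn(features, labels, mode, params):
    predictions = tf.squeeze(tf.layers.dense(features, 1), axis=-1)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    loss = tf.losses.mean_squared_error(tf.cast(labels, tf.float32), predictions)
    optimizer = tf.train.GradientSescentOptimizer(params["learning_rate"]) if False else tf.train.GradientDescentOptimizer(params["learning_rate"])
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)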
After that, I need to read a very large CSV file (it can't be loaded into memory), so I decided to use queues (they seemed like the best solution):
import tensorflow as tf
import numpy as np

nb_features = 10
queue = tf.train.string_input_producer(["test.csv"], shuffle=False)
reader = tf.TextLineReader()
key, value = reader.read(queue)
record_defaults = [[0] for _ in range(nb_features + 1)]
cols = tf.decode_csv(value, record_defaults=record_defaults)
features = tf.stack(cols[:-1])  # All columns except the last
label = cols[-1]                # The last column (a scalar, so no stacking needed)
I think this code is ok.
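For what it's worth, here is how the reader can be checked on its own (a quick standalone sketch, assuming test.csv really has nb_features + 1 integer columns):

# Standalone check of the reading pipeline, outside the Estimator.
with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for _ in range(3):
        print(sess.run([features, label]))  # one (features, label) row per run
    coord.request_stop()
    coord.join(threads)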
Then, the main code:
with tf.Session() as sess:
    tf.logging.set_verbosity(tf.logging.INFO)
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Return a Tensor of 1000 features/labels
    def get_inputs():
        print("input call !")
        xs = []
        ys = []
        for i in range(1000):
            x, y = sess.run([features, label])
            xs.append(x)
            ys.append(y)
        return tf.constant(np.asarray(xs), dtype=tf.float32), tf.constant(np.asarray(ys))

    estimator.train(input_fn=get_inputs, steps=100)

    coord.request_stop()
    coord.join(threads)
As you can see, there are a lot of ugly things here...
What I want: the train function should use a new batch of features at each step. But here it uses the same batch of 1000 features for all 100 steps, because the get_inputs function is only called once, when training starts. Is there an easy way to do this?
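From what I've read, I suspect the fix is to build the whole reading pipeline inside the input_fn itself, so that the Estimator's own session dequeues a fresh batch at every step instead of receiving frozen constants. Is something like this untested sketch the right direction (tf.train.batch to group rows, and no manual Coordinator because the Estimator starts the queue runners itself)?

def get_inputs():
    # Build the reading pipeline inside the input_fn, in the
    # Estimator's own graph, so every run of the train op
    # dequeues a fresh batch.
    queue = tf.train.string_input_producer(["test.csv"], shuffle=False)
    reader = tf.TextLineReader()
    key, value = reader.read(queue)
    record_defaults = [[0] for _ in range(nb_features + 1)]
    cols = tf.decode_csv(value, record_defaults=record_defaults)
    features = tf.stack(cols[:-1])
    label = cols[-1]
    # Group single rows into batches of 1000.
    feature_batch, label_batch = tf.train.batch([features, label],
                                                batch_size=1000)
    return tf.cast(feature_batch, tf.float32), label_batch

estimator.train(input_fn=get_inputs, steps=100)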
I tried looping estimator.train with steps=1, but that reloads the graph every time and becomes very slow.
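Concretely, that slow workaround looked roughly like this:

# One train() call per batch: works, but rebuilds the graph and
# restores the checkpoint on every iteration, which is very slow.
for _ in range(100):
    estimator.train(input_fn=get_inputs, steps=1)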
I don't know what to do now, and I don't know whether it's even possible...
Thanks for helping me!