I am interested in training and evaluating a convolutional neural net model on my own set of images. I want to use the tf.layers
module for my model definition, along with a tf.learn.Estimator
object to train and evaluate the model using the fit()
and evaluate()
methods, respectively.
Here is the tutorial that I have been following, which is helpful for showcasing the tf.layers
module and the tf.learn.Estimator
class. However, the dataset that it uses (MNIST) is simply imported and loaded (as NumPy arrays). See the following main function from the tutorial script:
def main(unused_argv):
# Load training and eval data
mnist = learn.datasets.load_dataset("mnist")
train_data = mnist.train.images # Returns np.array
train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
eval_data = mnist.test.images # Returns np.array
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
# Create the Estimator
mnist_classifier = learn.Estimator(
model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
# Set up logging for predictions
# Log the values in the "Softmax" tensor with label "probabilities"
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
tensors=tensors_to_log, every_n_iter=50)
# Train the model
mnist_classifier.fit(
x=train_data,
y=train_labels,
batch_size=100,
steps=20000,
monitors=[logging_hook])
# Configure the accuracy metric for evaluation
metrics = {
"accuracy":
learn.MetricSpec(
metric_fn=tf.metrics.accuracy, prediction_key="classes"),
}
# Evaluate the model and print results
eval_results = mnist_classifier.evaluate(
x=eval_data, y=eval_labels, metrics=metrics)
print(eval_results)
Full code here
I have my own images, which I have in both jpg
format within a certain directory structure:
data
train
classA
1.jpg
2.jpg
...
classB
3.jpg
4.jpg
...
...
validate
classA
5.jpg
6.jpg
...
classB
...
...
And I have also converted my image directories into TFRecord format, with one TFRecord file for train
and one for validation
. I followed this tutorial, which uses the build_image_data.py script from the Inception model that comes with TensorFlow as a blackbox that outputs these TFRecord files. I admit that I may have put the cart before the horse a bit by creating these, but I thought that perhaps there was a way to use these as inputs to the tf.learn.Estimator
's fit()
and evaluate()
methods.
Questions
How can I format my jpg
(or TFRecord) data so that I can use them as inputs to the Estimator
object's functions?
I'm assuming I have to convert my images and labels to NumPy arrays, as it shows in the code above, however, it is not clear how the mnist.train.images
and mnist.train.validation
are formatted.
Does anyone have any experience with converting jpg
files and labels to NumPy arrays that this Estimator
class expects as inputs?
Any help would be greatly appreciated.