
I have a large number of JPGs of vehicles. I want to create a dataset for TensorFlow in which every image is categorized by the side, angle or roof it shows, i.e. I want nine subsets of images (front, back, driver side, driver front angle, driver back angle, passenger side, passenger front angle, passenger back angle, roof). At the moment, the filename of each JPG encodes that viewpoint.

How can I turn this set into a dataset that TensorFlow can easily manipulate? Also, should I run a procedure that crops each JPG to extract only the vehicle portion? How could I do that using TensorFlow?

I apologize in advance for not providing details and examples, but I don't really know where to start with this problem. The tutorials I'm following all assume an already created, ready-to-use dataset.

Dree
  • If I understand you correctly, you have 9 images for each vehicle and want to make some prediction based on all 9 images at once? – FinleyGibson Sep 16 '19 at 15:40
  • Not really. What I need is a model that recognizes the vehicle side/angle of future submitted images based on the knowledge I’d like to build with categorization I explained above. Please tell me if it’s not clear. – Dree Sep 16 '19 at 18:31
  • I think I get it. So you want to put in an image and have the network tell you which of these 9 categories it belongs to: [front, back, driver side, driver front angle, driver back angle, passenger side, passenger front angle, passenger back angle, roof]? – FinleyGibson Sep 16 '19 at 18:35
  • Could you give me a few examples of the file names? – FinleyGibson Sep 16 '19 at 18:36
  • And are all the images the same size? (same number of pixels) – FinleyGibson Sep 16 '19 at 18:37
  • @FinleyGibson Correct, that's what I wish to achieve. Possible filenames are: *front_center*, *front_right*, *front_left*, *back_center*, *back_right*, *back_left*, *lat_right*, *lat_left*, *lat_top*. Images are approx. all of the same size, but it cannot be said with absolute certainty. – Dree Sep 17 '19 at 07:23
  • If I understand it correctly, given an image you want the ML model to classify it as one of the nine categories. If you have a good amount of data, you can create nine folders, one per category, put the images into them, and follow the transfer learning approach mentioned [here](https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android/#1) and [code here](https://colab.sandbox.google.com/github/tensorflow/examples/blob/master/community/en/flowers_tf_lite.ipynb) to develop a classification model. You will need to adapt that code to your data. – Vishnuvardhan Janapati Mar 30 '20 at 16:20
  • You don't need to resize images. The TensorFlow code mentioned in the above response will resize each image. One suggestion is to use good quality images for better model. Let me know if you have any questions. – Vishnuvardhan Janapati Mar 30 '20 at 16:21

1 Answer


Okay, I'm going to try to answer this as well as I can, but producing and pre-processing data for use in ML algorithms is laborious and often expensive (hence the repeated use of well known data sets for testing algorithm designs).

To address a few straightforward questions first:

should I run a procedure which crop the JPG to extract only the vehicle portion?

No, this isn't necessary. The neural network will learn to separate the relevant information in the images from the irrelevant by itself, and a diverse set of images helps to build a robust classifier. You would also likely make life a lot more difficult for yourself later on, when the images have to be resized (see point 1 below).

How could I do that using TensorFlow?

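You wouldn't. TensorFlow is designed to build and test ML models rather than to prepare raw data. It does include basic image operations in tf.image (resizing and cropping to a given box, for example), but cropping to the vehicle would first require knowing where the vehicle is, and none of this should be necessary here. (TensorFlow Extended may offer more, but it shouldn't be needed either.)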

Now a rough guideline for how you would go about creating a data set from the files described:

1) The first thing you will need to do is load your .jpg images into Python and resize them all to identical dimensions. A neural network needs the same number of inputs (pixels in this case) in every training example, so images of different sizes will not work.

  • There is a good answer detailing how to load images using the Python Imaging Library (PIL) on Stack Overflow here.
  • The PIL image instances (elements of the list loadedImages in the linked answer) can then be converted to NumPy arrays using data = np.asarray(image), which TensorFlow can work with.
  • In addition to the image arrays, you will need a second NumPy array of labels. A typical encoding is an array with one integer per image giving the class that image belongs to (0-8 for your 9 classes). You could enter these by hand, but that is labour intensive; I would suggest using the find method built into Python strings to look for keywords in the filenames and determine the class automatically. This could be done within the

    for image in imagesList:

    loop in the linked answer, as image should be a string containing the image filename.

  • As I mentioned above, resizing the images is necessary to make sure they are all identical in size. You could do this with NumPy, using indexing to crop a subsection of each image array, or with PIL's resize method before converting to NumPy. There is no single right answer here: padding, stretching and cropping have all been used for this purpose.
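As a rough illustration of the labelling bullet above, here is a minimal sketch that maps a filename to an integer class with the string find method. The keywords are the ones quoted in the question's comments; the order in which they are numbered, and the names CLASS_KEYWORDS and label_from_filename, are assumptions made for this example.

```python
from pathlib import Path

# Keywords as quoted in the comments; the 0-8 numbering is an arbitrary assumption.
CLASS_KEYWORDS = [
    "front_center", "front_right", "front_left",
    "back_center", "back_right", "back_left",
    "lat_right", "lat_left", "lat_top",
]


def label_from_filename(filename):
    """Return the class index (0-8) of the first keyword found in the file name."""
    name = Path(filename).name.lower()
    for index, keyword in enumerate(CLASS_KEYWORDS):
        # str.find returns -1 when the keyword is absent.
        if name.find(keyword) != -1:
            return index
    raise ValueError(f"no known keyword in {filename!r}")
```

Raising on an unknown filename, rather than guessing a class, makes mislabelled or stray files show up early.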

The end result should be two NumPy arrays. The first holds the image data and has shape [n,h,w,3], where n = the number of images, h = image height, w = image width and 3 = the three RGB channels (provided the images are in colour); putting n first is what np.asarray plus stacking produces and what TensorFlow expects. The second holds the labels for these images and has shape [n,], where every element is an integer from 0-8 specifying the class.
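One possible way to get to these two arrays is sketched below, assuming Pillow and NumPy are installed and all images sit in one folder with a lowercase .jpg extension. TARGET_SIZE, load_image_array and build_arrays are hypothetical names, and label_fn stands for whatever function turns a filename into an integer class. The sketch stretches every image to one size and stacks them with the image index first, so the result has shape (n, height, width, 3).

```python
from pathlib import Path

import numpy as np
from PIL import Image

TARGET_SIZE = (224, 224)  # (width, height); an assumed value, pick what suits your model


def load_image_array(path, size=TARGET_SIZE):
    """Open one JPG, force RGB, stretch it to `size`, return a uint8 array (h, w, 3)."""
    with Image.open(path) as image:
        resized = image.convert("RGB").resize(size)
        return np.asarray(resized, dtype=np.uint8)


def build_arrays(folder, label_fn, size=TARGET_SIZE):
    """Return (images, labels) for every .jpg file directly inside `folder`.

    `label_fn` is a hypothetical callable mapping a file name to an integer class.
    """
    paths = sorted(Path(folder).glob("*.jpg"))
    # np.stack adds a new leading axis, giving shape (n, h, w, 3).
    images = np.stack([load_image_array(p, size) for p in paths])
    labels = np.array([label_fn(p.name) for p in paths], dtype=np.int64)
    return images, labels
```

Stretching is only one of the resizing options mentioned above; padding or cropping would slot into load_image_array in the same place.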

At this point it would be a good idea to save the dataset in this format using numpy.save() so that you don't have to go through this process again.
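A small sketch of that saving step follows. Instead of calling numpy.save() once per array, it uses numpy.savez_compressed, a related NumPy function that stores both arrays in a single .npz file; save_dataset and load_dataset are hypothetical helper names.

```python
import numpy as np


def save_dataset(path, images, labels):
    """Write both arrays into a single compressed .npz archive."""
    np.savez_compressed(path, images=images, labels=labels)


def load_dataset(path):
    """Read the two arrays back from the archive written by save_dataset."""
    with np.load(path) as archive:
        return archive["images"], archive["labels"]
```

Keeping images and labels in one file makes it harder for the two to drift out of sync.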

2) Once you have your images in this format, TensorFlow has a class called tf.data.Dataset into which you can load the image and label arrays described above; it lets you shuffle and sample the data.
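A minimal sketch of this step, assuming TensorFlow 2.x and arrays laid out with the image index first, i.e. images of shape (n, h, w, 3) and labels of shape (n,). The function name make_dataset, the batch size and the scaling of pixels to [0, 1] are assumptions made for illustration, not requirements.

```python
import tensorflow as tf


def make_dataset(images, labels, batch_size=32, shuffle=True):
    """Wrap image and label arrays in a tf.data.Dataset that yields batches."""
    # from_tensor_slices splits both arrays along their first axis,
    # pairing image i with label i.
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    if shuffle:
        dataset = dataset.shuffle(buffer_size=len(images))
    # Scale pixels to [0, 1] floats; a common convention, not a requirement.
    dataset = dataset.map(lambda x, y: (tf.cast(x, tf.float32) / 255.0, y))
    return dataset.batch(batch_size)
```

The resulting dataset can be passed directly to a Keras model's fit method, which accepts a tf.data.Dataset of (inputs, targets) batches.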

I hope that was helpful, and I am sorry that there is no quick-fix solution to this (at least not one I am aware of). Good luck.

FinleyGibson
  • Thank you for the detailed answer, I really appreciate it. I managed to create a sample of 100 directories, each containing nine pictures of a vehicle, one for each of the nine perspectives. I'd like to ask you just a couple of questions. 1. Is it OK to return the result of each iteration as a tuple (images NumPy array and labels NumPy array)? 2. How could I aggregate the results? Is it OK to merge every single tuple? – Dree Sep 17 '19 at 15:03