0

In this code (courtesy of this answer):

from PIL import Image
import numpy as np


def load_image(infilename):
    img = Image.open(infilename)
    img.load()
    data = np.asarray(img, dtype="int32")
    return data


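def save_image(npdata, outfilename):
    # Clip to the valid 8-bit range and let Pillow infer the mode from the
    # array shape; forcing "L" (grayscale) does not match (rows, cols, 3) data.
    img = Image.fromarray(np.asarray(np.clip(npdata, 0, 255), dtype="uint8"))
    img.save(outfilename)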

data = load_image('cat.0.jpg')
print(data.shape)

print(data.shape) outputs a three-element tuple, (374, 500, 3). So I have these questions:

  1. What does this tuple represent?
  2. For machine learning classification, does it make sense to convert such data into a one-dimensional vector? If so, how?

Thank you very much.

Medo
  • Have you had a look at any of the many image classification examples online? – Nils Werner Jun 01 '18 at 22:17
  • @NilsWerner If you mean `convolutional neural network`, yes I have. I just want to know whether such a transformation to a single vector is logical and does not destroy the meaning of the image – Medo Jun 01 '18 at 22:19

2 Answers

1

The dimensions are (row, col, channel).

Yes, it often makes sense to feed a 1D array into a neural net, for example if you use a fully connected network. To reshape, you have multiple options:

  1. Use the reshape function

    data.reshape(-1)

  2. Use the flatten function

    data.flatten()
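
As a minimal sketch of the two options above, assuming a zero-filled stand-in array with the shape from the question rather than the real image (the names `flat_view` and `flat_copy` are only illustrative): both produce a vector of 374 * 500 * 3 = 561000 values, but `reshape(-1)` returns a view when it can, while `flatten()` always returns a copy.

```python
import numpy as np

# Stand-in for the loaded image: same shape and dtype as in the question.
data = np.zeros((374, 500, 3), dtype="int32")

flat_view = data.reshape(-1)   # view on the same memory when possible
flat_copy = data.flatten()     # always an independent copy

print(flat_view.shape)  # (561000,)

# Writing through the view changes the original; the copy is unaffected.
flat_view[0] = 7
print(data[0, 0, 0], flat_copy[0])  # 7 0
```

If you only need to read the values, either works; the difference matters mainly for memory use and when you modify the result in place.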

Jonathan R
0
  1. 374 rows of 500 columns of RGB (3) values (or pixels), or some permutation of these dimensions.

  2. Maybe. Though remember that any 1D encoding of this discards two-dimensional distance information between different pixels. If you're working with neural networks, look into convolutional neural networks to see how they deal with this problem.
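
To make both points concrete, here is a small sketch using a tiny stand-in array instead of the real image (the helper `flat_index` is hypothetical, and the array is assumed to be in NumPy's default C order). It shows where element (row, col, channel) lands after flattening, and that vertically adjacent pixels end up a whole row apart in the 1D vector.

```python
import numpy as np

# Tiny stand-in image: 4 rows, 5 columns, 3 channels (values are just 0..59).
rows, cols, channels = 4, 5, 3
data = np.arange(rows * cols * channels).reshape(rows, cols, channels)
flat = data.reshape(-1)


def flat_index(r, c, ch):
    """Position of data[r, c, ch] in the flattened vector (C order)."""
    return (r * cols + c) * channels + ch


# The flattened vector holds the same values, just re-indexed.
assert flat[flat_index(2, 3, 1)] == data[2, 3, 1]

# Horizontal neighbours are `channels` entries apart;
# vertical neighbours are a whole row (cols * channels) apart.
print(flat_index(1, 3, 0) - flat_index(1, 2, 0))  # 3
print(flat_index(2, 2, 0) - flat_index(1, 2, 0))  # 15
```

For the (374, 500, 3) image in the question, the same arithmetic puts vertically adjacent pixels 500 * 3 = 1500 entries apart, which is the neighbourhood structure a fully connected network has to learn on its own.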

itdoesntwork
  • Thank you for the clarification. I want to use a fully connected network rather than a convolutional neural network; that's why I was asking about the dimension transformation here – Medo Jun 01 '18 at 22:23