
I'm working on testing a model before I let it rip on a full dataset. My data is RGB images in an array, so my training dataset currently has the dimensions

> dim(ff_train)
[1]  10 500 500   3

So, 10 images, each 500x500 with 3 color layers (RGB).

My test data is the same

> dim(ff_test)
[1]  10 500 500   3

I've set up my model like so:

model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", 
              input_shape = c(10)) %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)

history <- model %>% fit(
  x = ff_train,
  y = ff_train_labels$fraction_yes,
  epochs = 20,
  validation_data = list(ff_test, ff_test_labels$fraction_yes))

where input shape is 10 as I have 10 images. I also have 10 labels, one per image: numbers in a numeric vector between 0 and 1 (the fraction of an event occurring in each sample). Both the train and test label vectors are of length 10.

However, when I run the model, I get the error

 Error in py_call_impl(callable, dots$args, dots$keywords) : 
  ValueError: in user code:

which, after some googling, led me to https://github.com/rstudio/keras/issues/1063, which says the problem is a mismatch in dimensions or structure between train and test. That seems incorrect here, since both arrays have identical dimensions.

What am I missing here? Where is the dimensional mismatch?

jebyrnes
    There are two points: 1. `input_shape` refers to shape of **each input sample**, i.e. in your example it is `(500, 500, 3)`; 2. the `Dense` layer [is applied on the last axis](https://stackoverflow.com/a/52092176/2099607) (i.e. dimension) of its input. Considering these two points, you should either modify your model architecture, or alternatively change the shape of your input data (e.g. flatten each image to a vector of size 500*500*3). – today Nov 14 '20 at 19:57
  • 1
    Ah, so the dense in this would only be three nodes? Interesting. Given that this is a stacked RGB image, whose structure we want to retain (so the cells of each layer match), would we want to go to a 3x500x500 instead? Note, we are basing this off of the MNIST example, so.... – jebyrnes Nov 15 '20 at 21:44
  • 1
    Yes, if you want to only use Dense layer and only predict one single value as the output of the model, then either the input should be flattened or instead you should use a `Flatten` layer somewhere in the model (even as the first layer). – today Nov 16 '20 at 06:54
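
Following the comments above, one way to keep a Dense-only architecture is to flatten each image into a vector before fitting. A minimal sketch, assuming the `keras` R package and the same `ff_train`/`ff_test` objects from the question:

```r
library(keras)

# Flatten each 500x500x3 image into one vector of length 750,000,
# so each sample is 1-D and the first Dense layer sees the whole image.
ff_train_flat <- array_reshape(ff_train, c(10, 500 * 500 * 3))
ff_test_flat  <- array_reshape(ff_test,  c(10, 500 * 500 * 3))

model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu",
              input_shape = c(500 * 500 * 3)) %>%  # shape of ONE sample, not the sample count
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")
```

Note that `input_shape` describes a single sample; the batch dimension (10 here) is inferred from the data passed to `fit()`.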

1 Answer


Your input is an image, so you need a convolution layer as the first layer, followed by a flatten layer before the dense layers.

Without a flatten layer, your output is 4-dimensional, but you need a 2-dimensional output:

  • your output's dimensions are 10, 500, 500, 1
  • but you need 10, 1

Since each image has a single 0–1 label, your final dense layer should have 1 neuron.
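
Putting that together in R, a minimal convolutional sketch (the layer functions are standard `keras` R package calls; the filter counts and sizes are illustrative, not tuned):

```r
library(keras)

model <- keras_model_sequential() %>%
  # input_shape is the shape of ONE sample: 500x500 pixels, 3 channels
  layer_conv_2d(filters = 16, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(500, 500, 3)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%                       # collapse to 2-D: (batch, features)
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")  # one fraction per image

model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)
```

With this architecture, `fit()` can be called on the original 4-D arrays unchanged, and the model outputs shape `(10, 1)` to match the 10 labels.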

faheem