I'm trying to fit a model in R using Keras. The model seems to be created and compiled without any problems. However, my fit function is giving me the following error:

Error: Data passed to Keras must be a vector, matrix, or array (you passed a data frame)

I've looked at various tutorials, and their preprocessing seems very similar to what I'm doing:

library(tensorflow)
library(caret)
library(keras)
library(tidyverse)

data <- read.table("wdbc.data", header = FALSE, sep=",", stringsAsFactors = FALSE)
data[data == "?"] <- 0
data[data == "M"] <- 0
data[data == "B"] <- 1
x <- floor(0.6*nrow(data))
set.seed(0)
train_ind <- sample(seq_len(nrow(data)), size = x)

train <- data[train_ind,]
test <- data[-train_ind,]

x_train <- subset(train, select=-c(ID, Class))
y_train <- train[-c(1, 3:32)]
x_test <- subset(test, select=-c(ID, Class))
y_test <- test[-c(1, 3:32)]

max_len <- 3
batch_size <- 64
total_epochs <- 5

set.seed(0)
model <- keras_model_sequential()
model %>%
    layer_embedding(5000, 64, name = "embedding1") %>%
    layer_simple_rnn(64, return_sequences = TRUE, name = "simpleRNN1") %>%
    layer_simple_rnn(64, name = "simpleRNN2") %>%
    layer_dense(1, activation = "sigmoid", name = "dense1")

model %>% compile(loss = 'binary_crossentropy', 
                  optimizer = 'adam', 
                  metrics = c('accuracy'))

x_train <- array_reshape(x_train, c(nrow(x_train), 30))
x_test <- array_reshape(x_test, c(nrow(x_test), 30))

trainModel <- model %>% fit(
    x = x_train,
    y = y_train,
    batch_size = batch_size,
    epochs = total_epochs,
    validation_split = 0.1)

I'm running this in a Jupyter notebook, so trainModel is in its own cell. When I run that cell, I get the error above. Here is a more detailed version of the error:

Error: Data passed to Keras must be a vector, matrix, or array (you passed a data frame)
Traceback:

1. model %>% fit(x = x_train, y = y_train, batch_size = batch_size, 
 .     epochs = total_epochs, validation_split = 0.1)
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. fit(., x = x_train, y = y_train, batch_size = batch_size, epochs = total_epochs, 
 .     validation_split = 0.1)
10. fit.keras.engine.training.Model(., x = x_train, y = y_train, 
  .     batch_size = batch_size, epochs = total_epochs, validation_split = 0.1)
11. keras_array(x)
12. stop("Data passed to Keras must be a vector, matrix, or array (you passed a ", 
  .     "data frame)", call. = FALSE)

The data can be found at https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/. I named the first and second columns 'ID' and 'Class'.

UseR10085
Stacey

1 Answer

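The error means that fit() was handed a data frame, which Keras does not accept. Convert the training and test sets to matrices with as.matrix() before fitting: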

library(tensorflow)
library(caret)
library(keras)
library(tidyverse)

data <- read.table("wdbc.data", header = FALSE, sep=",", stringsAsFactors = FALSE)
x <- floor(0.6*nrow(data))
set.seed(0)
train_ind <- sample(seq_len(nrow(data)), size = x)

train <- data[train_ind,]
test <- data[-train_ind,]

x_train <- subset(train, select=-c(ID, Class))
y_train <- train[-c(1, 3:32)]
x_test <- subset(test, select=-c(ID, Class))
y_test <- test[-c(1, 3:32)]

# Convert the data frames to matrices, because Keras rejects data frames
y_train <- as.matrix(y_train)
x_train <- as.matrix(x_train)

y_test <- as.matrix(y_test)
x_test <- as.matrix(x_test)

max_len <- 3
batch_size <- 64
total_epochs <- 5
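
After the conversion, it may also be worth checking that the matrices are numeric, since as.matrix() on a character column (such as a Class column holding "M" and "B") returns a character matrix. The sketch below uses a small made-up data frame named toy (hypothetical values, not the real wdbc file) and base R only. It shows what each step returns and one possible way to encode the labels. Whether this accounts for any later error depends on the actual data, so treat it as a diagnostic aid rather than a confirmed fix.

```r
# Toy stand-in for the wdbc data (hypothetical values, not the real file)
toy <- data.frame(
  ID = c(101, 102, 103, 104),
  Class = c("M", "B", "B", "M"),
  V3 = c(17.9, 13.5, 12.4, 20.6),
  V4 = c(10.4, 14.4, 15.7, 17.8),
  stringsAsFactors = FALSE
)

# Dropping columns by index still returns a data frame
y_df <- toy[-c(1, 3:4)]
class(y_df)

# as.matrix() on a character column gives a character matrix
y_chr <- as.matrix(y_df)
typeof(y_chr)

# Encode the labels as numbers first (M = 0, B = 1, as in the question)
y_num <- ifelse(toy$Class == "B", 1, 0)

# The feature columns are numeric, so as.matrix() gives a numeric matrix
x_num <- as.matrix(toy[, !(names(toy) %in% c("ID", "Class"))])

# Both checks should be TRUE before the data is passed on
is.matrix(x_num) && is.numeric(x_num)
is.numeric(y_num)
```

Running class(), typeof(), and is.numeric() on the real x_train and y_train in the same way should show quickly whether they are in a form that can be passed to fit().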
UseR10085
  • I tried adding those lines, but it gives me a different error: `Error in py_call_impl(callable, dots$args, dots$keywords): RuntimeError: Evaluation error: invalid argument type.` – Stacey Apr 14 '21 at 17:18
  • I think you have to ask a new question for that. I cannot reproduce your error as you have not provided any data. Please visit [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – UseR10085 Apr 14 '21 at 17:22
  • I've updated the original question to show where the data came from – Stacey Apr 14 '21 at 17:33