How do I make a model fit the dataset in Keras?

Question

The idea is to build a program that can detect whether an attack has happened or not.

I got stuck fitting the model.

Libraries imported:

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
import pandas as pd

Dataset Details:

https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/bot_iot.php

https://ieee-dataport.org/documents/bot-iot-dataset

files picture

As you can see, there is an attack column; I want the program to tell whether an attack happened or not.

This is the model:

model = Sequential()
model.add(Conv1D(128, 5, activation='relu'))
model.add(MaxPooling1D())
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(10,activation='relu'))
model.add(Dense(1,activation='sigmoid'))
model.add(Flatten())

And the model compile step:

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Model fitting part (here is my issue):

model.fit(train, test, epochs=50, batch_size=30)

Error:

ValueError: Data cardinality is ambiguous:
  x sizes: 2934817
  y sizes: 733705
Make sure all arrays contain the same number of samples.

From the error message it's clear that the two files do not have the same number of rows.

So I took only the test file and split it into two parts: the first part is columns 0 up to (but not including) 16, and the other is column 16.

x = test.iloc[:,0:16]
y = test.iloc[:,16]

model.fit(x, y, epochs=50, batch_size=30)

Error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type int).

I have tried casting everything to float, but it didn't work; I still get the same error.

Celius Stingher · Accepted Answer · 2021-11-14T14:36:29.690

The first problem is that .fit() expects the x and y values for the same samples, not the train and test sets, and that is why you are getting the first error. Keras treats the second argument as the labels for the first, so it is trying to use your full test dataset as the targets for the train dataset. That makes no sense, and it fails because the row counts differ (2934817 vs 733705).
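A minimal sketch of that split, assuming a small made-up DataFrame (the column names `pkts`, `bytes`, and `attack` are placeholders, not necessarily the ones in the actual files). Because x and y are taken from the same frame, they always have the same number of rows:

```python
import pandas as pd

# Toy stand-in for one of the CSV files; the column names are hypothetical.
df = pd.DataFrame({
    "pkts":   [10, 3, 250, 7],
    "bytes":  [840, 120, 15000, 420],
    "attack": [0, 0, 1, 0],
})

# Features and labels come from the SAME frame, so the row counts match.
x = df.drop(columns=["attack"])
y = df["attack"]

print(x.shape, y.shape)  # (4, 2) (4,)
```

These are the two arguments that model.fit(x, y, ...) expects. The separate test file would normally be split the same way and passed to model.evaluate() or to the validation_data argument of fit().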

For the second error, you appear to be passing the right variables to the model (the last column as the target y, and the remaining columns as the predictors x), but there seems to be an issue with how the data is formatted. Without access to the data it's hard to be sure. Are all columns numerical? If so, as addressed elsewhere, this might do the trick:

x = np.asarray(x).astype('float32')
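To check whether every column really is numerical before converting, something like the following may help. It is only a sketch on a made-up frame (`pkts` and `proto` are hypothetical column names) and it simply drops the text column to show that the conversion then succeeds:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one text column mixed in.
x = pd.DataFrame({
    "pkts":  [10, 3, 250],
    "proto": ["tcp", "udp", "tcp"],
})

# List the columns whose dtype is not numeric; these block the conversion.
non_numeric = x.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)  # ['proto']

# The conversion works once only numeric columns remain.
x_num = np.asarray(x.drop(columns=non_numeric)).astype("float32")
print(x_num.dtype)  # float32
```

If the list is not empty, those columns need encoding (or dropping) before the array is handed to Keras.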

If the data is not numeric in every column, then you will need to do some preprocessing to make it fully numerical. Some alternatives worth looking into:

One-hot encoding, which can be easily applied with sklearn.
Pandas' get_dummies, to which you can pass the non-numerical columns directly.
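As an illustration of the get_dummies route, here is a small sketch on made-up data (the `pkts` and `proto` columns are assumptions for the example, not taken from the real files):

```python
import pandas as pd

# Hypothetical frame: one numeric column and one text column.
x = pd.DataFrame({
    "pkts":  [10, 3, 250],
    "proto": ["tcp", "udp", "tcp"],
})

# Replace the text column with one indicator column per category.
x_encoded = pd.get_dummies(x, columns=["proto"], dtype="float32")
print(x_encoded.columns.tolist())  # ['pkts', 'proto_tcp', 'proto_udp']

# Everything is numeric now, so the float32 conversion succeeds.
x_ready = x_encoded.to_numpy(dtype="float32")
print(x_ready.shape)  # (3, 3)
```

Note that a text column with many distinct values (addresses, for example) would produce one new column per value, so it may be better to drop or otherwise encode such columns.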

Once your dataset is all of numerical types, you should be able to use it to train the model without issues.

The data is a mix of integers and text, as shown in the uploaded picture (under the dataset links; it is named "files picture"). I did try this trick, but it didn't work. — Abdulaziz Almaawy, Nov 14 '21 at 14:26
