I am working on an audio classification problem, using the UrbanSound8K dataset, which contains 8732 audio clips.
As I understand it, k-fold splits the data into k equal groups; each group is used once for testing while the remaining groups are used for training.
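
To make that splitting rule concrete, here is a rough sketch of the fold sizes it produces. The helper name `kfold_test_sizes` is made up for illustration; it mirrors the rule documented for scikit-learn's `KFold` (the first `n_samples % n_splits` folds get one extra sample) and does not call the library itself.

```python
def kfold_test_sizes(n_samples, n_splits):
    """Sizes of the k test folds under the KFold size rule (hypothetical helper)."""
    base, extra = divmod(n_samples, n_splits)
    # The first `extra` folds receive one additional sample each
    return [base + 1 if i < extra else base for i in range(n_splits)]

n_samples = 8732  # number of clips stated in the question
sizes = kfold_test_sizes(n_samples, 4)
train_sizes = [n_samples - s for s in sizes]

print(sizes)        # test-fold sizes
print(train_sizes)  # matching training-set sizes
```

Each iteration of `kfold.split(...)` therefore yields one test fold and a training set made of everything else.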

So with k=4, each group should contain 2,183 samples. However, this is far from what I actually get:

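from sklearn.model_selection import KFold
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# features, labels, no_classes, loss_function, opt, verbosity and
# validation_split are defined elsewhere in my script (not shown here)
batch_size = 1
num_folds = 4
no_epochs = 10

kfold = KFold(n_splits=num_folds, shuffle=False)

for train, test in kfold.split(features, labels):

  # Build a fresh model for every fold
  model = Sequential()
  model.add(Dense(1000, activation='relu'))
  model.add(Dense(no_classes, activation='softmax'))

  model.compile(loss=loss_function,
                optimizer=opt,
                metrics=['accuracy'])

  # Fit on this fold's training indices; validation_split holds out part of them
  history = model.fit(features[train], labels[train],
                      batch_size=batch_size,
                      epochs=no_epochs,
                      verbose=verbosity,
                      validation_split=validation_split,
                      shuffle=False)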


This code gives these results with k=4:
- 5239 per fold with batch size = 1
- 1048 per fold with batch size = 5
- 524 per fold with batch size = 10

I don't understand the relation between these two parameters: the batch size and the number of samples in a fold.

I can share my whole code if required.

Faref
  • Does this answer your question? [What does KFold in python exactly do?](https://stackoverflow.com/questions/36063014/what-does-kfold-in-python-exactly-do) – Chris Jul 14 '20 at 17:44

1 Answer

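Well, if you are interested in the relation: ignoring integer rounding, they are inversely proportional, i.e.

batch_size * number_of_data_in_fold = some_constant

Here number_of_data_in_fold is the figure you read off per fold. It is most likely the count of batches (steps) per epoch that Keras reports, not a count of samples. The constant is then the number of samples actually used for fitting, roughly 5239 in your runs (1 * 5239, 5 * 1048 and 10 * 524 all land within one sample of it).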
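
As a sketch of where that constant may come from, the arithmetic below reproduces the three reported numbers. It assumes `validation_split = 0.2` (the question never states the value; 0.2 is a guess that happens to fit), that Keras keeps the first `floor(n * (1 - validation_split))` training samples for fitting, and that the reported figure is the number of batches per epoch. The function name `steps_per_epoch` is made up for illustration.

```python
import math

def steps_per_epoch(n_samples, n_folds, validation_split, batch_size):
    """Batches per epoch for one fold (hypothetical helper, plain arithmetic)."""
    n_test = n_samples // n_folds        # held-out fold: 8732 // 4
    n_train = n_samples - n_test         # samples passed to model.fit
    # Assumed: the validation share is cut off before batching
    n_fit = math.floor(n_train * (1 - validation_split))
    # The last batch may be smaller, hence the ceiling
    return math.ceil(n_fit / batch_size)

for bs in (1, 5, 10):
    print(bs, steps_per_epoch(8732, 4, 0.2, bs))
```

Under these assumptions each fold still has 2183 test samples and 6549 training samples, as expected for k=4; only the number of batches changes with `batch_size`.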
Captain Trojan