I would like to use sklearn.metrics.classification_report, but I cannot call it directly because my testing data is provided by a Python generator. The generator is given below.
Let me give you the context. I am doing speech recognition, and I am using generators to deal with memory issues. Basically, the generator loads batches of NumPy arrays that correspond to spectrograms. Each batch has a unique id, so it is simple to pair each input batch with its output batch.
I am using tensorflow.keras to build my neural network. So far, I am able to use Model.fit to train my model (fit_generator has been deprecated) and Model.evaluate_generator to estimate the generalization error with common numerical metrics (a macro F1 score built with keras.backend, accuracy, ...). However, I have no idea how to implement a classification report with keras.backend and include it in my list of metrics when compiling the model.
Does anyone have an idea? Any help would be appreciated! Feel free to ask for more details and to comment on my code. I would love to read your tips, methodology, and so on.
Many thanks!
import glob
import random

import numpy as np


class Pipeline:
    # some code ...

    def generator(self, set_name):
        # set_name: 'training', 'validation', or 'testing'
        input_paths = glob.glob('{}/{}/batch_input__*.npz'.format(self.path, set_name))
        random.shuffle(input_paths)
        # Loop forever so that Keras can draw as many batches as it needs
        while True:
            for input_filepath in input_paths:
                # The batch id links each input file to its matching output file
                batch_id = input_filepath.split('__')[-1].split('.')[0]
                output_filepath = '{}/{}/batch_output__{}.npz'.format(self.path, set_name, batch_id)
                features = np.load(input_filepath)['arr_0']
                labels = np.load(output_filepath)['arr_0']
                yield features, labels
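For reference, one direction I have been considering is to skip the Keras metrics list entirely and build the report after prediction. The sketch below is only a rough idea under assumptions: `collect_labels` is a hypothetical helper name, the labels are assumed to be one-hot encoded (integer labels are passed through unchanged), and the model output is assumed to be one row of class scores per sample. Because the generator never terminates, the number of batches to draw has to be given explicitly.

```python
import numpy as np
from sklearn.metrics import classification_report


def collect_labels(predict_on_batch, batches, steps):
    """Draw `steps` batches and return (y_true, y_pred) as class indices.

    `predict_on_batch` is any callable mapping a feature batch to class
    scores, for example `model.predict_on_batch` of a tf.keras model.
    """
    y_true, y_pred = [], []
    # zip stops after `steps` items, so the endless generator is never exhausted
    for _, (features, labels) in zip(range(steps), batches):
        scores = np.asarray(predict_on_batch(features))
        y_pred.append(scores.argmax(axis=-1))
        labels = np.asarray(labels)
        # Assumption: one-hot labels; 1-D integer labels are kept as they are
        y_true.append(labels.argmax(axis=-1) if labels.ndim > 1 else labels)
    return np.concatenate(y_true), np.concatenate(y_pred)


# Hypothetical usage (`model`, `pipeline` and `num_test_batches` are placeholders):
# y_true, y_pred = collect_labels(model.predict_on_batch,
#                                 pipeline.generator('testing'),
#                                 steps=num_test_batches)
# print(classification_report(y_true, y_pred))
```

Predicting batch by batch keeps the true labels and the predictions aligned, which would not be guaranteed if the generator were consumed twice (once for prediction and once for the labels) with a different file order each time.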