I'm trying to use CSVLogger while training my network, but I get the following error at the end of an epoch:

2022-06-05 12:55:34 - ERROR - 'utf-8' codec can't decode byte 0x92 in position 144: invalid start byte
Traceback (most recent call last):
  File "D:\Users\Username\project\simulations\trainer.py", line 188, in train_network
    history = compile_and_fit(
  File "D:\Users\Username\project\neural\neural_utils.py", line 77, in compile_and_fit
    history = model.fit(
  File "D:\Users\Username\project\venv\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Python39\lib\csv.py", line 143, in writeheader
    return self.writerow(header)
  File "C:\Python39\lib\csv.py", line 154, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 144: invalid start byte

This is my system information:

  • System: Windows 11
  • TensorFlow version: 2.9.1
  • Python version: 3.9

This is my implementation:

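# Imports needed by this excerpt (cp_callback, window and nb_epochs are defined elsewhere)
from pathlib import Path
import tensorflow as tf

early_stopping = tf.keras.callbacks.EarlyStopping(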
    monitor='val_loss',
    patience=5,
    mode='min',
    restore_best_weights=True
)

csv_logger = tf.keras.callbacks.CSVLogger(
    Path(output_files_directory, 'training.csv'),
    separator=',',
    append=False
)

model.compile(
    loss=tf.losses.MeanSquaredError(),
    optimizer=tf.optimizers.SGD(nesterov=True),
    metrics=[tf.metrics.MeanSquaredError(), tf.metrics.Accuracy()]
)

# Print model summary
print('Model summary: ')
model.summary()

history = model.fit(
    window.train, epochs=nb_epochs,
    validation_data=window.val,
    callbacks=[
        early_stopping,
        cp_callback,
        csv_logger
    ]
)

Note that the output path contains spaces (I cannot change the output path; it is fixed by a previous script). For example, in the previous script output_files_directory is equal to D:\trained_data\output\16 bits - Original - 1.25MHz\LSTM_32

graille
  • Can it be that `training.csv` contains some non-Unicode characters? Your error seems to imply something similar – Yannis P. Jun 05 '22 at 14:41
  • The file actually doesn't exist when the script is launched – graille Jun 05 '22 at 19:58
  • I hope you have solved it. I am not familiar with the `TensorFlow API` but it seems that your epoch results or something contain some characters that need to be encoded. See [here](https://stackoverflow.com/questions/46000191/utf-8-codec-cant-decode-byte-0x92-in-position-18-invalid-start-byte) for a similar example – Yannis P. Jun 06 '22 at 17:15

1 Answer

OK, so I figured out what the issue was: the output directory didn't exist.

It seems that callbacks like tf.keras.callbacks.ModelCheckpoint create the output directory if it doesn't exist, but tf.keras.callbacks.CSVLogger does not.
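
A simple workaround is to create the directory yourself before building the CSVLogger. The snippet below is only a minimal sketch: prepare_log_path is a hypothetical helper name, not part of TensorFlow, and the temporary folder stands in for the real output_files_directory. It uses only the standard library, so the CSVLogger call is shown as a comment.

```python
import tempfile
from pathlib import Path


def prepare_log_path(output_dir, filename="training.csv"):
    """Create output_dir (and any missing parents) and return the log file path.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    output_dir = Path(output_dir)
    # parents=True creates intermediate folders; exist_ok=True makes the call
    # a no-op when the directory is already there.
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir / filename


# Demonstration in a temporary location. The folder names mirror the ones from
# the question (spaces included) to show that spaces are not the problem.
demo_root = Path(tempfile.mkdtemp())
log_path = prepare_log_path(demo_root / "16 bits - Original - 1.25MHz" / "LSTM_32")

# In the training script, the returned path would then go to the callback:
# csv_logger = tf.keras.callbacks.CSVLogger(str(log_path), separator=',', append=False)
```

Only the directory is created here; the CSV file itself is still written by CSVLogger once training starts.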

graille