When attempting to train a model using the test MovieLens dataset and the instructions here, I get the following errors:
./mltrain.sh local ../data u.data
Mon Jul 6 08:53:06 UTC 2020
WARNING:tensorflow:From /home/XXXXXXXX/tensorflow-recommendation-wals/wals_ml_engine/trainer/task.py:176: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From /home/XXXXXXXX/tensorflow-recommendation-wals/wals_ml_engine/trainer/task.py:176: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From trainer/model.py:267: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.
INFO:tensorflow:Train Start: 2020-07-06 08:53:14
trainer/wals.py:94: RuntimeWarning: divide by zero encountered in divide
frac = np.array(1.0/(data > 0.0).sum(axis))
WARNING:tensorflow:From trainer/wals.py:57: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-07-06 08:53:14.592099: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-07-06 08:53:14.592138: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2020-07-06 08:53:14.592170: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (sn-ga-recommender-model): /proc/driver/nvidia/version does not exist
2020-07-06 08:53:14.592473: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-07-06 08:53:14.599363: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2000160000 Hz
2020-07-06 08:53:14.599627: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x564d65170fd0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-06 08:53:14.599656: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
INFO:tensorflow:Train Finish: 2020-07-06 08:53:15
INFO:tensorflow:train RMSE = 0.89
INFO:tensorflow:test RMSE = 1.07
Mon Jul 6 08:53:15 UTC 2020
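
For reference, the RuntimeWarning at trainer/wals.py:94 can be reproduced outside the trainer. In the logged expression `1.0/(data > 0.0).sum(axis)`, any row or column with no positive entries gives a count of 0, and dividing by it triggers the warning. Below is a minimal sketch of one way to guard that division. It assumes `data` is a dense NumPy array (the real code may pass a sparse matrix), the function name `inverse_nonzero_counts` is hypothetical, and mapping empty rows or columns to 0.0 is my assumption, not necessarily what the repository intends.

```python
import numpy as np


def inverse_nonzero_counts(data, axis):
    """Return 1 / (number of positive entries) along `axis`.

    Hypothetical helper: entries whose count is 0 are set to 0.0
    instead of inf, which avoids the divide-by-zero warning.
    """
    counts = (data > 0.0).sum(axis).astype(np.float64)
    frac = np.zeros_like(counts)
    # Only divide where the count is non-zero; other slots keep 0.0.
    np.divide(1.0, counts, out=frac, where=counts > 0)
    return frac


# Toy ratings matrix: the middle column has no ratings at all.
ratings = np.array([[5.0, 0.0, 3.0],
                    [0.0, 0.0, 1.0]])
print(inverse_nonzero_counts(ratings, axis=0))
```

With the toy matrix, the column without ratings comes back as 0.0 rather than inf. Whether 0.0 is the right weight for such a column depends on how the result is used further down in wals.py, which I have not checked.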