YARN version: 3.1.1, HDP version: 3.1.5
Permissions on the /var/log/ directory itself look fine. (I even tried 777 to make sure it was writable, and the error still happens.)
Disk space is also fine. Maybe it is a connectivity issue to the disk? (Although it's the root volume, so I'm not sure how that would happen.)
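Since 777 on /var/log/ itself says nothing about the files further down the path named in the error, one thing that could be checked is ownership and mode of everything under the recovery-state directory. The following is only a rough sketch of such a check: the function name `find_suspect_paths` is made up for illustration, and it only looks at the owner and the owner write bit, so it would not catch other causes such as ACLs or SELinux.

```python
import os
import stat


def find_suspect_paths(root, expected_uid):
    """Walk `root` and return (path, reason) pairs for directories and files
    that are not owned by `expected_uid` or lack the owner write bit."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            try:
                st = os.lstat(path)
            except OSError as exc:
                suspects.append((path, f"stat failed: {exc.strerror}"))
                continue
            if st.st_uid != expected_uid:
                suspects.append((path, f"owned by uid {st.st_uid}"))
            elif not st.st_mode & stat.S_IWUSR:
                suspects.append((path, "owner write bit not set"))
    return suspects
```

It could be called with the directory from the log and the uid of whichever account runs the NodeManager, for example `find_suspect_paths("/var/log/hadoop-yarn/nodemanager/recovery-state", pwd.getpwnam("yarn").pw_uid)`. The `yarn` user name there is an assumption about the setup, not something taken from the log.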
I can restart the NodeManagers manually, and they then go on running jobs as usual without complaining. Is there a way to trigger the restart automatically when a NodeManager is found unhealthy? (That might be a good workaround.)
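To make the workaround idea concrete, here is a minimal sketch of what such a watchdog might look like. It assumes the affected nodes show up with state UNHEALTHY in the ResourceManager REST API (`/ws/v1/cluster/nodes`). The `restart` callback is hypothetical and left to the caller, since how a NodeManager gets restarted (Ambari, ssh, a service manager) depends on the cluster; the function names are made up for illustration as well.

```python
import json
import urllib.request


def unhealthy_hosts(nodes_payload):
    """Return host names of nodes whose state is UNHEALTHY, given the parsed
    JSON body of a /ws/v1/cluster/nodes response."""
    # The "nodes" value can be null or missing when no nodes are listed.
    nodes = (nodes_payload.get("nodes") or {}).get("node") or []
    return [n["nodeHostName"] for n in nodes if n.get("state") == "UNHEALTHY"]


def fetch_nodes(rm_url):
    """Fetch the node list from the ResourceManager web service."""
    with urllib.request.urlopen(f"{rm_url}/ws/v1/cluster/nodes", timeout=10) as resp:
        return json.load(resp)


def restart_unhealthy(rm_url, restart, fetch=fetch_nodes):
    """Call restart(host) for every UNHEALTHY node and return those hosts.

    `restart` is a caller-supplied (hypothetical) function; this sketch does
    not know how NodeManagers are restarted on a given cluster.
    """
    hosts = unhealthy_hosts(fetch(rm_url))
    for host in hosts:
        restart(host)
    return hosts
```

Something like `restart_unhealthy("http://rm-host:8088", my_restart)` could then be run periodically from cron, where `rm-host` and `my_restart` are placeholders. This would only hide the symptom; the Permission denied error below would still need an explanation.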
2023-08-30 20:51:57,534 ERROR recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:markStoreUnHealthy(206)) - Statestore exception:
org.iq80.leveldb.DBException: IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/035781.log: Permission denied
    at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:129)
    at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:106)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeMasterKey(NMLeveldbStateStoreService.java:1112)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.storeContainerTokenPreviousMasterKey(NMLeveldbStateStoreService.java:1178)
    at org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager.updatePreviousMasterKey(NMContainerTokenSecretManager.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager.setMasterKey(NMContainerTokenSecretManager.java:141)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$StatusUpdaterRunnable.updateMasterKeys(NodeStatusUpdaterImpl.java:1255)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$StatusUpdaterRunnable.run(NodeStatusUpdaterImpl.java:1099)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state/035781.log: Permission denied
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.put(NativeDB.java:259)
    at org.fusesource.leveldbjni.internal.NativeDB.put(NativeDB.java:254)
    at org.fusesource.leveldbjni.internal.NativeDB.put(NativeDB.java:244)
    at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:126)
    ... 8 more