ambari-hdp: YARN fails to start with "Corruption: checksum mismatch"
Error shown in the Ambari web UI
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/nodemanager.py", line 102, in <module>
    Nodemanager().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/nodemanager.py", line 53, in start
    service('nodemanager',action='start')
  File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/service.py", line 110, in service
    Execute(daemon_cmd, not_if=check_process, environment=hadoop_env_exports)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su yarn -l -s /bin/bash -c 'ulimit -c unlimited; /usr/hdp/3.1.5.0-152/hadoop-yarn/bin/yarn --config /usr/hdp/3.1.5.0-152/hadoop/conf --daemon start nodemanager'' returned 1.
Error in the NodeManager log on the backend
r:/usr/hdp/3.1.5.0-152/tez/lib/tez.tar.gz
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r e235cd6e7bfe0fe31ee6448b612862c53b45dda9; compiled by 'jenkins' on 2019-12-12T19:46Z
STARTUP_MSG: java = 1.8.0_322
************************************************************/
2024-09-02 15:15:03,986 INFO nodemanager.NodeManager (LogAdapter.java:info(51)) - registered UNIX signal handlers for [TERM, HUP, INT]
2024-09-02 15:15:04,722 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:openDatabase(1540)) - Using state database at /udata/var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery
2024-09-02 15:15:04,765 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService failed in state INITED
org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: checksum mismatch
    at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
    at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.openDatabase(NMLeveldbStateStoreService.java:1543)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1531)
    at org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:353)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:285)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:358)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
2024-09-02 15:15:04,769 INFO service.AbstractService (AbstractService.java:noteFailure(267)) - Service NodeManager failed in state INITED
org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: checksum mismatch
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:285)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:358)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:933)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1013)
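The Ambari traceback only reports that the start command returned 1; the actual cause has to be read out of the NodeManager log. As a rough sketch of that step, the snippet below scans log text for the state database path and checks whether a checksum corruption error follows it. The names `find_corrupt_state_db`, `STATE_DB_RE` and `CORRUPTION_MARKER` are hypothetical, and the regular expression is based only on the message wording seen in this log, so it may need adjusting for other Hadoop versions.

```python
import re
from typing import Optional

# Wording taken from the INFO line in the log above (assumption: other
# Hadoop versions may phrase this message differently).
STATE_DB_RE = re.compile(r"Using state database at (\S+) for recovery")
CORRUPTION_MARKER = "Corruption: checksum mismatch"


def find_corrupt_state_db(log_text: str) -> Optional[str]:
    """Return the state database path if a corruption error follows it in the log."""
    last_db_path = None
    for line in log_text.splitlines():
        match = STATE_DB_RE.search(line)
        if match:
            # Remember the most recently announced state database path.
            last_db_path = match.group(1)
        elif CORRUPTION_MARKER in line and last_db_path is not None:
            # The corruption error refers to the database opened just before it.
            return last_db_path
    return None


if __name__ == "__main__":
    # Two lines copied from the log above, shortened to the relevant part.
    sample = (
        "2024-09-02 15:15:04,722 INFO recovery.NMLeveldbStateStoreService "
        "(NMLeveldbStateStoreService.java:openDatabase(1540)) - Using state database at "
        "/udata/var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state for recovery\n"
        "org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: checksum mismatch\n"
    )
    print(find_corrupt_state_db(sample))
```

In practice the text of the NodeManager log file would be passed in instead of the inline sample; the function returns `None` when no corruption error is found.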
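**Solution:** The log shows that a file failed checksum verification, which kept the NodeManager from starting and left YARN stopped in the Ambari UI. In the log above, the database that fails to open is the NodeManager recovery state store at /udata/var/log/hadoop-yarn/nodemanager/recovery-state/yarn-nm-state. The fix was to delete the files on the machine that reported the error (the hadoop-yarn path depends on your configuration), for example under /var/log/hadoop-yarn/, and then restart YARN. After that the NodeManager started normally and the problem was resolved. Note that deleting these files discards the NodeManager's saved recovery state, and removing the whole log directory also removes the logs in it.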
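A more cautious variant of this fix is to rename the recovery directory instead of deleting it, so the corrupted files can still be inspected later. The snippet below is a minimal sketch of that idea, not the procedure used in this post: `move_aside` is a hypothetical helper, the `.corrupt-<timestamp>` suffix is an arbitrary choice, and the example path is the one that appears in the log here and will differ on other clusters. It assumes the NodeManager is stopped and that the script runs as a user with write access to the parent directory.

```python
import shutil
import time
from pathlib import Path


def move_aside(state_dir: str) -> Path:
    """Rename state_dir to a timestamped backup next to it and return the backup path."""
    src = Path(state_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")

    # Hypothetical naming scheme: <name>.corrupt-YYYYmmdd-HHMMSS
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = src.with_name(f"{src.name}.corrupt-{stamp}")
    if backup.exists():
        # Avoid shutil.move nesting the source inside an existing directory.
        raise FileExistsError(f"backup already exists: {backup}")

    shutil.move(str(src), str(backup))
    return backup


if __name__ == "__main__":
    # Path taken from the log in this post; adjust to your own configuration.
    recovery_dir = "/udata/var/log/hadoop-yarn/nodemanager/recovery-state"
    print("moved to", move_aside(recovery_dir))
```

Once YARN has been restarted and the NodeManager is confirmed healthy, the backup directory can be removed by hand.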