SAP HANA Spark Controller (SHSC) Kerberos Token Expiration Issue


Problem Description:

SAP HANA Spark Controller (2.4.4) fails to connect to the HDFS cluster. The hana_controller.log shows the following error:
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1814)

Analysis and Recommendations

This is a Kerberos problem, not an SHSC problem.
We recommend upgrading to the latest SHSC release (currently version 2.6.0).

The explanation from SAP development is as follows:

Kerberos (SHSC)

The SHSC uses the SparkContext as the entry point to leverage Spark and subsequently connect to Hive. The related Kerberos configuration in a Hadoop
cluster for a long-running Spark application is fairly simple. Please see the references below for further details:

Long-Running Applications
Configuring Spark on YARN for Long-Running Applications

SHSC only has to provide spark.yarn.keytab and spark.yarn.principal for the process.
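For illustration only, a minimal sketch of handing these two properties to a programmatically created SparkContext follows; the principal and keytab path are placeholders, not values shipped with SHSC:

    import org.apache.spark.{SparkConf, SparkContext}

    // Placeholder principal and keytab path; substitute the values
    // configured for your own SHSC service user.
    val conf = new SparkConf()
      .setAppName("SAPHanaSpark")
      .set("spark.yarn.principal", "hanaes/host.example.com@EXAMPLE.COM")
      .set("spark.yarn.keytab", "/etc/security/keytabs/hanaes.service.keytab")

    val sc = new SparkContext(conf)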

However, since SHSC doesn't use the spark-submit entry point but rather the SparkContext as an entry point (also a valid method), the tokens aren't renewed
automatically. Unlike spark-submit jobs, SHSC has to renew them manually within the existing SparkContext, which SHSC does periodically.
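The note does not show how SHSC actually implements this renewal. Purely as an illustration, a manual renewal built on Hadoop's public UserGroupInformation API could look like the sketch below; the renewer name "hanaes" and the 8-hour interval are assumptions, not SHSC defaults:

    import java.security.PrivilegedExceptionAction
    import java.util.concurrent.{Executors, TimeUnit}

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem
    import org.apache.hadoop.security.{Credentials, UserGroupInformation}

    object DelegationTokenRefresher {
      // Refresh the Kerberos TGT from the keytab and fetch fresh HDFS
      // delegation tokens for the login user.
      def renewTokens(hadoopConf: Configuration): Unit = {
        val ugi = UserGroupInformation.getLoginUser
        ugi.checkTGTAndReloginFromKeytab() // re-login if the TGT is near expiry

        val creds = new Credentials()
        ugi.doAs(new PrivilegedExceptionAction[Unit] {
          override def run(): Unit = {
            val fs = FileSystem.get(hadoopConf)
            fs.addDelegationTokens("hanaes", creds) // renewer name is illustrative
          }
        })
        ugi.addCredentials(creds) // make the new tokens visible to this UGI
      }

      // Schedule renewal well inside the default 24h token lifetime;
      // the 8-hour period is an assumption for this sketch.
      def start(hadoopConf: Configuration): Unit = {
        val executor = Executors.newSingleThreadScheduledExecutor()
        executor.scheduleAtFixedRate(new Runnable {
          override def run(): Unit = renewTokens(hadoopConf)
        }, 0L, 8L, TimeUnit.HOURS)
      }
    }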

As explained in SAP Note 3004055, SHSC doesn't perform any sort of authentication or authorization; it only passes on the required Kerberos information
such as the keytab, principal, and tokens.

In addition, you may also want to read Hadoop Delegation Tokens Explained.

Error Message:

Regarding the error messages found in the SHSC logs: these are written and reported very clearly by Hadoop, not SHSC, as can be seen from call-stack entries in the org.apache.hadoop.* namespace, such as:

org.apache.hadoop.security.authentication.client.AuthenticationException
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1814)

Please refer to the attachment "SHSC_KerberosSequence_Error.png", which illustrates the calling sequence and the occurrence of the Kerberos issue. As one can see, authentication had already been performed successfully from SHSC throughout the involved Hadoop components.
During query execution from Spark to Hadoop/HDFS, the authentication issue occurs with the call stack below.

22/11/08 11:53:23 DEBUG SAPHanaSpark-akka.actor.default-dispatcher-32 security.UserGroupInformation: PrivilegedActionException as:hanaes (auth:PROXY) via hanaes/sunpaphmp03.sun.nec.co.jp@SUN.NEC.CO.JP (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
22/11/08 11:53:23 ERROR SAPHanaSpark-akka.actor.default-dispatcher-32 network.AsyncExecutor: Exception happened in Async Executor. Forwarding to Handler
java.io.IOException: DestHost:destPort sunpaphmp02.sun.nec.co.jp:8020 , LocalHost:localPort SUNPAPHMP03.sun.nec.co.jp/10.179.8.32:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1501)
    at org.apache.hadoop.ipc.Client.call(Client.java:1443)
    at org.apache.hadoop.ipc.Client.call(Client.java:1353)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
    at com.sun.proxy.$Proxy9.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1071)
    at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy10.getDelegationToken(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:700)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1814)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer.java:95)
    at org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(DelegationTokenIssuer.java:76)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:216)
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.listStatus(AvroContainerInputFormat.java:42)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.sql.hana.DistributedDataSetImpl.coalesceToMaxPartitions(DistributedDataSetFactoryImpl.scala:178)
    at org.apache.spark.sql.hana.DistributedDataSetImpl.transferDatafromPartitions(DistributedDataSetFactoryImpl.scala:189)
    at com.sap.hana.spark.SparkFacade.dispatchQueryResultFn(SparkFacade.scala:302)
    at com.sap.hana.spark.SparkFacade.executeQueryAsync(SparkFacade.scala:395)
    at com.sap.hana.network.AsyncExecutor.executeJob(RequestRouter.scala:97)
    at com.sap.hana.network.AsyncExecutor$$anonfun$receive$1$$anon$2.run(RequestRouter.scala:53)
    at com.sap.hana.network.AsyncExecutor$$anonfun$receive$1$$anon$2.run(RequestRouter.scala:50)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
    at com.sap.hana.network.AsyncExecutor$$anonfun$receive$1.applyOrElse(RequestRouter.scala:50)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at com.sap.hana.network.AsyncExecutor.aroundReceive(RequestRouter.scala:40)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:757)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:720)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:813)
    at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:410)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1389)
    ... 81 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:173)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:410)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:800)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:796)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:796)
    ... 84 more
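The "as:hanaes (auth:PROXY) via hanaes/... (auth:KERBEROS)" fragment of the DEBUG line above reflects Hadoop's standard proxy-user pattern, which the UserGroupInformation.doAs frame in the trace also points to: a Kerberos-authenticated service user executes HDFS calls on behalf of another identity. A minimal sketch of that pattern follows; the names and paths are illustrative, not SHSC's actual code:

    import java.security.PrivilegedExceptionAction

    import org.apache.hadoop.security.UserGroupInformation

    // The Kerberos-authenticated service identity (auth:KERBEROS);
    // principal and keytab path are placeholders.
    val realUser = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
      "hanaes/host.example.com@EXAMPLE.COM",
      "/etc/security/keytabs/hanaes.service.keytab")

    // The impersonated identity (auth:PROXY), backed by the real user.
    val proxyUser = UserGroupInformation.createProxyUser("hanaes", realUser)

    proxyUser.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // HDFS/Spark calls placed here run as the proxy user. If the TGT
        // or delegation token backing realUser has expired, they fail with
        // "Client cannot authenticate via:[TOKEN, KERBEROS]".
      }
    })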

If there were an issue with missing tokens in a cache (which SHSC also doesn't have), there should be related error messages
such as:

HDFS_DELEGATION_TOKEN can’t be found in cache
And the call stack actually shows:

    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:143)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:102)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
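These TokenCache frames correspond to Hadoop's input-format code collecting NameNode delegation tokens before computing input splits, which is why the failure surfaces at query time rather than at startup. For context, a minimal sketch of the equivalent call, with a made-up input path:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.mapreduce.security.TokenCache
    import org.apache.hadoop.security.Credentials

    // Roughly what FileInputFormat.listStatus triggers internally: collect
    // delegation tokens for every NameNode serving the input paths.
    val conf = new Configuration()
    val credentials = new Credentials()
    TokenCache.obtainTokensForNamenodes(
      credentials, Array(new Path("/warehouse/some_hive_table")), conf)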
