KeeperErrorCode = NoAuth for /hbase/tokenauth/keys

news/2024/11/29 13:29:51/

Problems configuring HBase with Kerberos


The environment is as follows (two screenshots in the original post):

Problem description

I wanted to configure HBase HA with Kerberos on top of a Hadoop HA setup, and ran into the following error:

org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/running
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1212)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:340)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:661)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:637)
    at org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.nodeCreated(ZKNodeTracker.java:199)
    at org.apache.hadoop.hbase.zookeeper.ZKWatcher.process(ZKWatcher.java:460)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
2023-06-23 16:19:56,035 ERROR [main-EventThread] zookeeper.ZKWatcher: regionserver:16020-0x3029dc0d4ec0021, quorum=hadoop102:2181,hadoop103:2181,hadoop104:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/running
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1212)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:340)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:661)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:637)
    at org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.nodeCreated(ZKNodeTracker.java:199)
    at org.apache.hadoop.hbase.zookeeper.ZKWatcher.process(ZKWatcher.java:460)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
2023-06-23 16:19:56,038 ERROR [main-EventThread] regionserver.HRegionServer: ***** ABORTING region server hadoop102,16020,1687508213386: Unexpected exception handling nodeCreated event *****
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/running
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1212)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:340)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:661)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:637)
    at org.apache.hadoop.hbase.zookeeper.ZKNodeTracker.nodeCreated(ZKNodeTracker.java:199)
    at org.apache.hadoop.hbase.zookeeper.ZKWatcher.process(ZKWatcher.java:460)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
2023-06-23 16:19:56,041 ERROR [main-EventThread] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2023-06-23 16:19:56,060 INFO  [main-EventThread] regionserver.HRegionServer:"exceptions.ScannerResetException" : 0,

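Cause analysis:

This problem had me stuck for a whole day yesterday. I found that the master and regionserver had started only on the hadoop102 node. Opening the web UI at hadoop102:16010 showed the regionserver as dead, and I stayed stuck at that point.

The error log shows that this is a Kerberos authentication and permission problem, so I looked at the current configuration files: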

hadoop102 : vim hbase-jaas.conf

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytab/hbase.service.keytab"
  useTicketCache=false
  principal="hbase/hadoop102@EXAMPLE.COM";
};

hadoop103 : vim hbase-jaas.conf

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytab/hbase.service.keytab"
  useTicketCache=false
  principal="hbase/hadoop103@EXAMPLE.COM";
};

hadoop104 : vim hbase-jaas.conf

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytab/hbase.service.keytab"
  useTicketCache=false
  principal="hbase/hadoop104@EXAMPLE.COM";
};
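
The three files above differ only in the host part of the principal. As a minimal sketch of that structure, the Python snippet below renders the Client section for a given host. The helper name render_jaas and the template constant are hypothetical and only for illustration; the keytab path and realm are the ones shown in this post.

```python
# Hypothetical helper (not part of HBase): renders the Client section
# of hbase-jaas.conf for one host, using the values shown in this post.
JAAS_TEMPLATE = """Client {{
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="{keytab}"
  useTicketCache=false
  principal="hbase/{host}@{realm}";
}};
"""


def render_jaas(host, realm="EXAMPLE.COM",
                keytab="/etc/security/keytab/hbase.service.keytab"):
    """Return the JAAS Client section text for the given host name."""
    return JAAS_TEMPLATE.format(host=host, realm=realm, keytab=keytab)
```

Calling render_jaas("hadoop103") yields the same text as the hadoop103 file above, which makes it easy to see that the principal is the only per-node difference.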

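I noticed that only the master and regionserver on hadoop102 had started, while those on hadoop103 and hadoop104 had not. So I instinctively changed hbase-jaas.conf on hadoop103 and hadoop104 to match the one on hadoop102 and restarted HBase. All services then started, but HBase create and insert statements still failed: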

hbase(main):002:0> create 'student','info'

ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
    at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2946)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1942)
    at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:603)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
  hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}

Table configuration options can be put at the end.
Examples:
  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDem
  hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:
  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.

Took 8.8778 seconds
hbase(main):003:0>  put 'student','1001','info:sex','male'

ERROR: org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop102,16020,1687510685378
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3272)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3249)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2429)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

Put a cell 'value' at specified table/row/column and optionally
timestamp coordinates.  To put a cell value into table 'ns1:t1' or 't1'
at row 'r1' under column 'c1' marked with the time 'ts1', do:
  hbase> put 'ns1:t1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value', ts1
  hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding command would be:
  hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}

Solution:


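At this point I saw that no regionserver on any node had started properly; all of them were in the dead state. I guessed that the HBase data stored in ZooKeeper was corrupted, so I decided to delete the HBase information from ZooKeeper: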

[zk: hadoop102:2181(CONNECTED) 0] ls
ls [-s] [-w] [-R] path
[zk: hadoop102:2181(CONNECTED) 1] ls /
[dolphinscheduler, hadoop-ha, hbase, rmstore, yarn-leader-election, zookeeper]
[zk: hadoop102:2181(CONNECTED) 2] deleteall /hbase
Authentication is not valid : /hbase/replication
[zk: hadoop102:2181(CONNECTED) 3] getAcl /hbase
'sasl,'hbase/hadoop102@EXAMPLE.COM
: cdrwa
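
One way to read the getAcl output above is that /hbase is restricted to a single SASL identity, hbase/hadoop102@EXAMPLE.COM, with all five permissions (create, delete, read, write, admin). The Python sketch below parses that output and checks which principals would be granted a permission. It assumes a simplified exact-match model of SASL ACLs, since server settings can change how principals are mapped, and the function names parse_getacl and allowed are hypothetical.

```python
def parse_getacl(output):
    """Parse zkCli getAcl output into (scheme, id, perms) tuples.

    zkCli prints each ACL entry as two lines:
      'scheme,'id
      : perms
    """
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    acls = []
    for head, perms in zip(lines[0::2], lines[1::2]):
        scheme, ident = head.split(",", 1)
        acls.append((scheme.lstrip("'"), ident.lstrip("'"),
                     perms.lstrip(": ").strip()))
    return acls


def allowed(acls, principal, perm):
    """Simplified model (an assumption): a sasl ACL grants perm only
    when its id equals the authenticated principal exactly."""
    return any(scheme == "sasl" and ident == principal and perm in perms
               for scheme, ident, perms in acls)
```

With the output shown above, allowed(acls, "hbase/hadoop102@EXAMPLE.COM", "r") is True, while the same check for hbase/hadoop103@EXAMPLE.COM is False. Under this model, that is consistent with services starting once every node logs in as the hadoop102 principal.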

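The delete failed and kept reporting "Authentication is not valid : /hbase/replication". This happens because ACLs are enabled in ZooKeeper. The fix I ended up with was to add the line skipACL=yes to ZooKeeper's configuration file zoo.cfg:

# Kerberos authentication configuration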
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
sessionRequireClientSASLAuth=true
skipACL=yes
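
skipACL=yes makes ZooKeeper skip ACL checks entirely, which opens the whole data tree to every client, so it is only added temporarily here and removed again later in this post. As a hedged sketch of that toggle, the Python helpers below add or remove one key in the text of a zoo.cfg file. The function names set_option and remove_option are hypothetical, and the sketch only transforms text; it does not distribute the file or restart ZooKeeper.

```python
def set_option(cfg_text, key, value):
    """Return cfg_text with key=value set, replacing an existing entry."""
    out, found = [], False
    for line in cfg_text.splitlines():
        # Comment lines never equal the key, so they pass through untouched.
        if line.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


def remove_option(cfg_text, key):
    """Return cfg_text without any line that sets key."""
    kept = [ln for ln in cfg_text.splitlines()
            if ln.split("=", 1)[0].strip() != key]
    return "\n".join(kept) + "\n"
```

For example, set_option(text, "skipACL", "yes") produces the configuration shown above, and remove_option(text, "skipACL") restores the original once /hbase has been deleted.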

Distribute zoo.cfg to all ZooKeeper nodes, restart ZooKeeper, and then delete the /hbase znode data again:

[zk: hadoop102:2181(CONNECTED) 0] ls /
[dolphinscheduler, hadoop-ha, hbase, rmstore, yarn-leader-election, zookeeper]
[zk: hadoop102:2181(CONNECTED) 1] deleteall /hbase
[zk: hadoop102:2181(CONNECTED) 2] ls /
[dolphinscheduler, hadoop-ha, rmstore, yarn-leader-election, zookeeper]
[zk: hadoop102:2181(CONNECTED) 3] quit;
ZooKeeper -server host:port cmd args

Deleted successfully!
At this point the problem was basically solved.
To be safe, I also deleted all of the HBase files on HDFS:

hadoop fs -rm -r -f /hbase/*

Remove skipACL=yes from zoo.cfg, restart ZooKeeper, restart HBase, and open the web UI at hadoop102:16010.
There are no dead regionservers any more.
Then run the HBase create and insert statements again:

hbase(main):001:0> create 'student','info'
Created table student
Took 2.6728 seconds                                                                                                                                                
=> Hbase::Table - student
hbase(main):002:0> put 'student','1001','info:sex','male'
Took 0.1907 seconds                                                                                                                                                
hbase(main):003:0> put 'student','1001','info:age','18'
Took 0.0055 seconds                                                                                                                                                
hbase(main):004:0>  scan 'student'
ROW                                       COLUMN+CELL
 1001                                     column=info:age, timestamp=1687568561569, value=18
 1001                                     column=info:sex, timestamp=1687568556688, value=male
1 row(s)
Took 0.0611 seconds                                                                                                                                                
hbase(main):005:0> scan 'student',{STARTROW => '1001', STOPROW  => '1001'}
ROW                                       COLUMN+CELL
 1001                                     column=info:age, timestamp=1687568561569, value=18
 1001                                     column=info:sex, timestamp=1687568556688, value=male
1 row(s)
Took 0.0131 seconds                                                                                                                                                
hbase(main):006:0> describe 'student'
Table student is ENABLED                                                                                                                                           
student                                                                                                                                                            
COLUMN FAMILIES DESCRIPTION                                                                                                                                        
{NAME => 'info', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.0590 seconds                                                                                                                                                
hbase(main):007:0> quit

With that, the bug is resolved.

Summary:

The fix came down to the following steps:

1. Change hbase-jaas.conf on all nodes so that it matches the one on hadoop102

hadoop102 : vim hbase-jaas.conf

Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytab/hbase.service.keytab"
  useTicketCache=false
  principal="hbase/hadoop102@EXAMPLE.COM";
};

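2. Delete the /hbase data in ZooKeeper

Add skipACL=yes to zoo.cfg, restart ZooKeeper, and then delete /hbase. Remove skipACL=yes and restart ZooKeeper again afterwards.

3. Delete the old HBase data on HDFS

hadoop fs -rm -r -f /hbase/*

4. Restart HBase and run the create-table and insert statements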

Reference: https://zhuanlan.zhihu.com/p/396007109

