Based on Hadoop 3.1.4
I. Prepare the required files
1. Pre-built hadoop-3.1.4 package
Link: https://pan.baidu.com/s/1tKLDTRcwSnAptjhKZiwAKg  extraction code: ekvc
2. JDK
Link: https://pan.baidu.com/s/18JtAWbVcamd2J_oIeSVzKw  extraction code: bmny
3. VMware installer
Link: https://pan.baidu.com/s/1YxDntBWSCEnN9mTYlH0FUA  extraction code: uhsj
4. VMware license
Link: https://pan.baidu.com/s/10CsLc-nJXnH5V9IMP-KZeg  extraction code: r5y5
5. Linux download
ISO image download address
II. Preparation
1. Install the virtual machine
Search for a tutorial on your own!
2. Configure a static IP
cd /etc/sysconfig/network-scripts/
Check your own machine for its IP address, gateway, and subnet mask.
### Static IP configuration
IPADDR=192.168.109.103 ## the IP we want to assign
NETMASK=255.255.255.0
GATEWAY=192.168.109.2
DNS1=8.8.8.8
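The lines above go into the NIC's ifcfg file under /etc/sysconfig/network-scripts/. A minimal sketch of the full file, assuming a CentOS 7-style system whose interface is named ens33 (adjust the file name and interface to your machine):
vim ifcfg-ens33
TYPE=Ethernet
BOOTPROTO=static       ## switch from dhcp to a static address
NAME=ens33
DEVICE=ens33
ONBOOT=yes             ## bring the interface up at boot
IPADDR=192.168.109.103
NETMASK=255.255.255.0
GATEWAY=192.168.109.2
DNS1=8.8.8.8
Then restart networking so the address takes effect (on CentOS 7: systemctl restart network, or simply reboot).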
3. Install the JDK on Linux
Uninstall the OpenJDK that ships with the distribution:
rpm -qa|grep jdk
Do not remove the entries with the .noarch suffix.
rpm -e --nodeps XXX
tar -zxvf xxx.tar.gz
vim /etc/profile
export JAVA_HOME=/usr/local/jdk1.8.0_361
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
source /etc/profile
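To confirm the JDK is picked up after reloading the profile (this assumes the archive was extracted under /usr/local so that it matches JAVA_HOME above):
java -version
echo $JAVA_HOME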
4. Disable the firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
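Optionally, confirm the firewall is really off (assuming firewalld is the only firewall in play):
systemctl status firewalld
firewall-cmd --state   # should report "not running"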
Cluster plan
hadoop0 namenode datanode resourcemanager nodemanager
hadoop1 secondarynamenode datanode nodemanager
hadoop2 datanode nodemanager
5. Configure the hostname and the hosts file
hostnamectl set-hostname hadoop0
Run the same command on hadoop1 and hadoop2 with their respective hostnames.
vim /etc/hosts
192.168.109.101 hadoop0
192.168.109.102 hadoop1
192.168.109.103 hadoop2
All three machines must have the complete hosts file, otherwise starting the secondarynamenode will fail later!
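A quick way to check that host-name resolution works before moving on (run from each node):
ping -c 1 hadoop0
ping -c 1 hadoop1
ping -c 1 hadoop2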
6. Passwordless SSH login
cd /root/.ssh
If the .ssh directory does not exist, create it first with mkdir -p /root/.ssh
Generate the key pair:
ssh-keygen -t dsa
cd /root/.ssh
cat id_dsa.pub >> authorized_keys
To explain: we first generate a key pair and append the public key to the machine's own authorized_keys. Repeat the steps above on hadoop1 and hadoop2, copy their id_dsa.pub contents into hadoop0's authorized_keys as well, and then push the aggregated authorized_keys from hadoop0 back out to hadoop1 and hadoop2. After that hadoop0 can log in to hadoop0, hadoop1, and hadoop2 without a password (see the sketch below).
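A minimal sketch of the key aggregation and distribution described above, assuming root logins and the hostnames from the cluster plan (ssh-copy-id and scp are standard OpenSSH tools):
# On hadoop1 and hadoop2: append each node's public key to hadoop0's authorized_keys
ssh-copy-id -i /root/.ssh/id_dsa.pub root@hadoop0
# On hadoop0: push the aggregated authorized_keys out to the other nodes
scp /root/.ssh/authorized_keys root@hadoop1:/root/.ssh/
scp /root/.ssh/authorized_keys root@hadoop2:/root/.ssh/
Then verify from hadoop0: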
ssh hadoop1
7. Create a unified set of working directories (on every machine)
mkdir -p /export/server/
mkdir -p /export/data/
mkdir -p /export/software/
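The pre-built hadoop-3.1.4 package from Part I has to end up under /export/server before the next step. A minimal sketch, assuming the archive was uploaded to /export/software and is named hadoop-3.1.4.tar.gz (the actual file name may differ):
tar -zxvf /export/software/hadoop-3.1.4.tar.gz -C /export/server/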
8. Hadoop environment variables
vim /etc/profile
export HADOOP_HOME=/export/server/hadoop-3.1.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
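To confirm the Hadoop binaries are now on the PATH:
hadoop version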
9. Hadoop configuration files (edit everything on hadoop0; the whole directory is copied to the other machines afterwards)
vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_361
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
vim etc/hadoop/core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop0:8020</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/export/data/hadoop-3.1.4</value>
</property>
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>root</value>
</property>
vim etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>hadoop1:9868</value>
</property>
vim etc/hadoop/mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
vim etc/hadoop/workers
hadoop0
hadoop1
hadoop2
vim etc/hadoop/yarn-site.xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop0</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
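As noted in the step 9 heading, the fully configured directory now has to reach the other nodes. A minimal sketch, assuming the same /export/server layout on every machine:
scp -r /export/server/hadoop-3.1.4 root@hadoop1:/export/server/
scp -r /export/server/hadoop-3.1.4 root@hadoop2:/export/server/
# The /etc/profile entries (JAVA_HOME and HADOOP_HOME exports) must also be present on hadoop1 and hadoop2.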
10. Format HDFS (never run the format more than once)
hdfs namenode -format
2023-03-26 00:12:47,011 INFO common.Storage: Storage directory /export/data/hadoop-3.1.4/dfs/name has been successfully formatted.
Listing the newly created name directory (its current subdirectory) should show:
total 16
-rw-r--r-- 1 root root 391 Mar 26 00:12 fsimage_0000000000000000000
-rw-r--r-- 1 root root 62 Mar 26 00:12 fsimage_0000000000000000000.md5
-rw-r--r-- 1 root root 2 Mar 26 00:12 seen_txid
-rw-r--r-- 1 root root 220 Mar 26 00:12 VERSION
If the files above appear, the format succeeded.
11. Start the cluster
Start the corresponding daemons on each machine according to the plan above; the per-node commands are sketched after the generic forms below.
hadoop0 namenode datanode resourcemanager nodemanager
hadoop1 secondarynamenode datanode nodemanager
hadoop2 datanode nodemanager

hdfs --daemon start namenode|datanode|secondarynamenode
hdfs --daemon stop namenode|datanode|secondarynamenode
yarn --daemon start resourcemanager|nodemanager
yarn --daemon stop resourcemanager|nodemanager
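Putting the plan and the generic commands together, a minimal sketch of starting every daemon by hand (run each group on the node named in the comment, then check with jps):
# on hadoop0
hdfs --daemon start namenode
hdfs --daemon start datanode
yarn --daemon start resourcemanager
yarn --daemon start nodemanager

# on hadoop1
hdfs --daemon start secondarynamenode
hdfs --daemon start datanode
yarn --daemon start nodemanager

# on hadoop2
hdfs --daemon start datanode
yarn --daemon start nodemanager

# on every node: list the running Java processes to confirm the expected daemons are up
jps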