MySQL5.7.42高可用MHA搭建及故障切换演示

ops/2024/9/25 21:02:35/

系列文章目录

rpmbuild构建mysql5.7RPM安装包
MySQL基于GTID同步模式搭建主从复制


文章目录

  • 系列文章目录
  • 前言
  • 一、MHA架构介绍
    • 1.MHA的功能
    • 2.MHA组成
    • 3.MHA故障转移过程
    • 4.MHA架构优缺点
  • 二、环境准备
    • 1.服务器免密
    • 2.基于GTID主从复制搭建
    • 3.下载mha组件
  • 三、MHA组件安装
    • 1.安装依赖
    • 2.所有节点安装MHA node
    • 3.管理节点安装MHA manager
    • 4.daemontools工具安装
  • 四、MHA配置
    • 1.主库配置mha用户
    • 2.生成MHA配置文件
    • 3.manager节点拷贝master_ip_failover,master_ip_online_change文件到app1目录下并修改
    • 4.检测主机ssh与数据库复制
    • 5.主库启动VIP
  • 五、手动切换测试
    • 1.online切换测试
      • 1.1.原主库信息查看
      • 1.2.原从库信息查看
      • 1.3.在线切换前确保 mha节点中的 manager是关闭状态
      • 1.4.开始执行在线切换命令
      • 1.5.切换结果检查
    • 2.failover切换测试
      • 2.1.手动启动masterha_manager进程
      • 2.2.执行failover切换测试,通过手动kill主库192.168.56.130的mysql进程来模拟主库挂掉
      • 2.3.此时从库192.168.56.131自动提升为新主库,主库变为192.168.56.131,原主库192.168.56.130起来后,需要手动配置为从库
  • 六、配置MHA守护进程
  • 七、启动mha守护进程
  • 八、启动守护进程后Failover切换测试
    • 1.manager节点上启动守护进程
    • 2.192.168.56.130主库手动kill数据库进程
    • 3.验证
  • 总结


前言

在当今竞争激烈的商业环境中,企业依赖于高效、可靠的数据管理系统来支持其运营和决策。然而,数据库故障或性能瓶颈往往会对业务造成严重影响,导致服务中断、数据丢失,甚至客户流失。为了应对这些挑战,MHA(Master High Availability)架构逐渐成为企业保障数据可用性的关键解决方案。
MHA架构通过实现主数据库的高可用性,确保在发生故障时能够快速自动切换到备份实例,最大限度地减少停机时间。此外,MHA不仅提供故障转移功能,还支持在线备份和数据恢复,帮助企业在突发情况下迅速恢复正常运营。这种灵活性和可靠性使得MHA成为大型分布式系统和高负载应用的理想选择。


提示:以下是本篇文章正文内容,下面案例可供参考

MHA_20">一、MHA架构介绍

MHA是Master High Availability的缩写,它是目前MySQL高可用方面的一个相对成熟的解决方案,其核心是使用perl语言编写的一组脚本,是一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件。在MySQL故障切换过程中,MHA能做到在0~ 30秒之内自动完成数据库的故障切换操作,并且能在最大程度上保证数据的一致性,以达到真正意义上的高可用。

基于MHA的架构不像MMM那样需要搭建主主复制,只需要搭建基本的主从复制架构即可。因为MHA在主库挂掉时,是在多个从库中选取出一个从库作为新的主库。MHA集群中的各节点彼此之间均需要基于ssh 互信通信,以实现远程控制及数据管理功能
在这里插入图片描述

MHA_26">1.MHA的功能

1、监控Master节点是否可用,当Master不可用时,能在多个Slave中选举出新的Master     
2、提供了主从切换和故障转移功能,MHA会尝试在宕机的Master上保存binlog,在最大程度上保证事务不丢失。但如果是Master所在的服务器已经无法访问,或硬件层面出现了问题,则无法成功保存binlog     
3、MHA可以与半同步复制结合,避免从库之间出现数据不一致的情况     
4、支持MySQL基于GTID和基于日志点的两种复制方式     

MHA_34">2.MHA组成

Manager节点:MHA Manager 可以单独部署在一台独立的机器上,管理多个 master-slave 集群;也可以部署在一台 slave 节点上。MHA Manager 会定时探测集群中的 master 节点。当 master 出现故障时,它可以自动将最新数据的 slave提升为新的 master, 然后将所有其他的 slave 重新指向新的 master。整个故障转移过程对应用程序完全透明
Master节点:主数据库。负责处理读写操作并将更改同步到备数据库。
Slave节点:从数据库。与主数据库保持同步,并在主数据库故障时接管其角色。

MHA_42">3.MHA故障转移过程

 1、尝试使用ssh登录到宕机崩溃的Master节点上保存二进制日志事件(binlog events);     2、从多个Slave中识别含有最新更新的Slave,将其作为备选的Master;     3、然后基于该Slave同步差异的中继日志(relaylog)到其他的Slave上;     4、接着同步从原Master上保存的二进制日志事件(binlog events);     5、将备选的Master提升为新的Master;     6、使其他的Slave连接新的Master进行复制;     7、在新的Master启动vip地址,保证前端请求可以发送到新的Master。 

MHA_52">4.MHA架构优缺点

优点缺点
使用Perl脚本语言开发并且完全开源,开发者可以根据自己的需求进行二次开发MHA默认不提供虚拟IP功能,需要自行准备虚拟ip
能够支持基于GTID和基于日志点的复制模式MHA启动后只会对Master进行监控,不会对Slave进行监控
MHA在进行故障转移时更不易产生数据丢失集群环境需要能够通过 ssh协议免密登录,存在一定的安全隐患
在一个监控节点上可以监控多个Replication集群MHA没有提供对Slave的读负载均衡功能,需要通过第三方工具来实现

二、环境准备

mysql版本5.7.42
mha版本mha4mysql-node-0.58.tar.gz、mha4mysql-manager-0.58.tar.gz、daemontools-0.76
默认以下服务器端口通信正常、防火墙关闭、selinux关闭

ip作用部署
192.168.56.130主库mysql5.7.42、mha-node
192.168.56.131从库mysl5.7.42、mha-node
192.168.56.132mha管理节点mha-node、mha manager、daemontools
192.168.56.133vip/

1.服务器免密

因为从第一部分就已经明确了MHA集群中的各节点彼此之间均需要基于ssh 互信通信,以实现远程控制及数据管理功能,所以先进行免密操作

[root@python1 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -t rsa -N ''    #生成私钥
[root@python1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.131 #将生成的公私钥传给131机器
[root@python1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.132 #将生成的公私钥传给131机器

131机器如法炮制

[root@python2 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -t rsa -N ''    #生成私钥
[root@python2 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.130 #将生成的公私钥传给130机器
[root@python2 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.132 #将生成的公私钥传给131机器
[root@python3 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -t rsa -N ''    #生成私钥
[root@python3 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.130 #将生成的公私钥传给130机器
[root@python3 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.56.131 #将生成的公私钥传给130机器

免密结果验证
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

至此,第一步免密操作完成

2.基于GTID主从复制搭建

在此不做过多描述,搭建过程可以参考顶部两篇文章链接地址。搭建完成如下图所示

在这里插入图片描述
至此,主从复制搭建完成

3.下载mha组件

mha-manager

https://github.com/yoshinorim/mha4mysql-manager/releases

在这里插入图片描述

mha-node

https://github.com/yoshinorim/mha4mysql-manager/releases

在这里插入图片描述
daemontools工具

https://cr.yp.to/daemontools/install.html

在这里插入图片描述
至此,组件准备完成

MHA_130">三、MHA组件安装

1.安装依赖

三个节点依次安装依赖包

[root@python1 ~]# yum install -y apr bc bison dos2unix expect gcc gcc-c++ iftop libaio libaio-devel libarchive libev lrzsz make MySQL-python nc ncurses-devel numactl numactl-devel perl-Config-IniFiles perl-Config-Tiny perl-DBD-mysql perl-DBI perl-ExtUtils-CBuilder perl-Log-Dispatch perl-Parallel-ForkManager perl-CPAN perl-Time-HiRes perl-ExtUtils-MakeMaker perl-Digest-MD5 perl-TermReadKey pigz python python-devel python-paramiko python-requests python-setuptools python-simplejson rsync sysstat trickle vim zlib zlib-devel mysql-devel

MHA_node_136">2.所有节点安装MHA node

所有mysql主从节点与manager节点均安装MHA node,将软件包mha4mysql-node-0.58.tar.gz上传至/root,并解压缩安装:此处只演示一个节点的安装过程,其余两个节点安装如法炮制即可

[root@python1 ~]# tar xf mha4mysql-node-0.56.tar.gz
[root@python1 ~]# cd mha4mysql-node
[root@python1 mha4mysql-node]# perl Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI        ...loaded. (1.627)
- DBD::mysql ...loaded. (4.023)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::node
[root@python1 mha4mysql-node]# make
cp lib/MHA/BinlogManager.pm blib/lib/MHA/BinlogManager.pm
cp lib/MHA/BinlogPosFindManager.pm blib/lib/MHA/BinlogPosFindManager.pm
cp lib/MHA/BinlogPosFinderXid.pm blib/lib/MHA/BinlogPosFinderXid.pm
cp lib/MHA/BinlogHeaderParser.pm blib/lib/MHA/BinlogHeaderParser.pm
cp lib/MHA/BinlogPosFinder.pm blib/lib/MHA/BinlogPosFinder.pm
cp lib/MHA/BinlogPosFinderElp.pm blib/lib/MHA/BinlogPosFinderElp.pm
cp lib/MHA/NodeUtil.pm blib/lib/MHA/NodeUtil.pm
cp lib/MHA/SlaveUtil.pm blib/lib/MHA/SlaveUtil.pm
cp lib/MHA/NodeConst.pm blib/lib/MHA/NodeConst.pm
cp bin/filter_mysqlbinlog blib/script/filter_mysqlbinlog
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/filter_mysqlbinlog
cp bin/apply_diff_relay_logs blib/script/apply_diff_relay_logs
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/apply_diff_relay_logs
cp bin/purge_relay_logs blib/script/purge_relay_logs
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/purge_relay_logs
cp bin/save_binary_logs blib/script/save_binary_logs
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/save_binary_logs
Manifying blib/man1/filter_mysqlbinlog.1
Manifying blib/man1/apply_diff_relay_logs.1
Manifying blib/man1/purge_relay_logs.1
Manifying blib/man1/save_binary_logs.1
[root@python1 mha4mysql-node-0.58]# make install
Installing /usr/local/share/perl5/MHA/NodeUtil.pm
Installing /usr/local/share/perl5/MHA/SlaveUtil.pm
Installing /usr/local/share/perl5/MHA/NodeConst.pm
Installing /usr/local/share/man/man1/filter_mysqlbinlog.1
Installing /usr/local/share/man/man1/apply_diff_relay_logs.1
Installing /usr/local/share/man/man1/purge_relay_logs.1
Installing /usr/local/share/man/man1/save_binary_logs.1
Installing /usr/local/bin/filter_mysqlbinlog
Installing /usr/local/bin/apply_diff_relay_logs
Installing /usr/local/bin/purge_relay_logs
Installing /usr/local/bin/save_binary_logs
Appending installation info to /usr/lib64/perl5/perllocal.pod

在这里插入图片描述

命令解释
node组件安装后也会在/usr/local/bin 下面会生成几个脚本(这些工具通常由 MHAManager 的脚本触发,无需人为操作)主要如下:
save_binary_logs    保存和复制 master 的二进制日志
apply_diff_relay_logs   识别差异的中继日志事件并将其差异的事件应用于其他的 slave
filter_mysqlbinlog  去除不必要的 ROLLBACK 事件
purge_relay_logs    清除中继日志(不会阻塞sql线程)

至此,三个节点的mha-node组件安装完成

MHA_manager_204">3.管理节点安装MHA manager

只在132机器上安装

[root@python3 ~]# tar xf mha4mysql-manager-0.58.tar.gz 
[root@python3 ~]# cd mha4mysql-manager-0.58
[root@python3 mha4mysql-manager-0.58]# ls
AUTHORS  bin  COPYING  debian  inc  lib  Makefile.PL  MANIFEST  META.yml  README  rpm  samples  t  tests
[root@python3 mha4mysql-manager-0.56]# perl Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI                   ...loaded. (1.627)
- DBD::mysql            ...loaded. (4.023)
- Time::HiRes           ...loaded. (1.9725)
- Config::Tiny          ...loaded. (2.14)
- Log::Dispatch         ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (1.18)
- MHA::NodeConst        ...loaded. (0.58)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
[root@python3 mha4mysql-manager-0.56]#  make && make install
cp lib/MHA/ManagerUtil.pm blib/lib/MHA/ManagerUtil.pm
cp lib/MHA/Config.pm blib/lib/MHA/Config.pm
cp lib/MHA/HealthCheck.pm blib/lib/MHA/HealthCheck.pm
cp lib/MHA/ManagerConst.pm blib/lib/MHA/ManagerConst.pm
cp lib/MHA/ServerManager.pm blib/lib/MHA/ServerManager.pm
cp lib/MHA/ManagerAdmin.pm blib/lib/MHA/ManagerAdmin.pm
cp lib/MHA/FileStatus.pm blib/lib/MHA/FileStatus.pm
cp lib/MHA/ManagerAdminWrapper.pm blib/lib/MHA/ManagerAdminWrapper.pm
cp lib/MHA/MasterFailover.pm blib/lib/MHA/MasterFailover.pm
cp lib/MHA/MasterRotate.pm blib/lib/MHA/MasterRotate.pm
cp lib/MHA/MasterMonitor.pm blib/lib/MHA/MasterMonitor.pm
cp lib/MHA/SSHCheck.pm blib/lib/MHA/SSHCheck.pm
cp lib/MHA/Server.pm blib/lib/MHA/Server.pm
cp lib/MHA/DBHelper.pm blib/lib/MHA/DBHelper.pm
cp bin/masterha_stop blib/script/masterha_stop
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_stop
cp bin/masterha_conf_host blib/script/masterha_conf_host
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_conf_host
cp bin/masterha_check_repl blib/script/masterha_check_repl
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_check_repl
cp bin/masterha_check_status blib/script/masterha_check_status
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_check_status
cp bin/masterha_master_monitor blib/script/masterha_master_monitor
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_master_monitor
cp bin/masterha_check_ssh blib/script/masterha_check_ssh
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_check_ssh
cp bin/masterha_master_switch blib/script/masterha_master_switch
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_master_switch
cp bin/masterha_secondary_check blib/script/masterha_secondary_check
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_secondary_check
cp bin/masterha_manager blib/script/masterha_manager
/usr/bin/perl "-Iinc" -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/masterha_manager
Manifying blib/man1/masterha_stop.1
Manifying blib/man1/masterha_conf_host.1
Manifying blib/man1/masterha_check_repl.1
Manifying blib/man1/masterha_check_status.1
Manifying blib/man1/masterha_master_monitor.1
Manifying blib/man1/masterha_check_ssh.1
Manifying blib/man1/masterha_master_switch.1
Manifying blib/man1/masterha_secondary_check.1
Manifying blib/man1/masterha_manager.1
Installing /usr/local/share/perl5/MHA/Config.pm
Installing /usr/local/share/perl5/MHA/HealthCheck.pm
Installing /usr/local/share/perl5/MHA/ManagerConst.pm
Installing /usr/local/share/perl5/MHA/ServerManager.pm
Installing /usr/local/share/perl5/MHA/MasterFailover.pm
Installing /usr/local/share/perl5/MHA/MasterRotate.pm
Installing /usr/local/share/perl5/MHA/MasterMonitor.pm
Installing /usr/local/share/perl5/MHA/Server.pm
Installing /usr/local/share/perl5/MHA/DBHelper.pm
Installing /usr/local/share/man/man1/masterha_stop.1
Installing /usr/local/share/man/man1/masterha_conf_host.1
Installing /usr/local/share/man/man1/masterha_check_repl.1
Installing /usr/local/share/man/man1/masterha_check_status.1
Installing /usr/local/share/man/man1/masterha_master_monitor.1
Installing /usr/local/share/man/man1/masterha_check_ssh.1
Installing /usr/local/share/man/man1/masterha_master_switch.1
Installing /usr/local/share/man/man1/masterha_secondary_check.1
Installing /usr/local/share/man/man1/masterha_manager.1
Installing /usr/local/bin/masterha_stop
Installing /usr/local/bin/masterha_conf_host
Installing /usr/local/bin/masterha_check_repl
Installing /usr/local/bin/masterha_check_status
Installing /usr/local/bin/masterha_master_monitor
Installing /usr/local/bin/masterha_check_ssh
Installing /usr/local/bin/masterha_master_switch
Installing /usr/local/bin/masterha_secondary_check
Installing /usr/local/bin/masterha_manager
Appending installation info to /usr/lib64/perl5/perllocal.pod
[root@python3 scripts]# pwd
/root/mha4mysql-manager-0.56/samples/scripts
[root@python3 scripts]# ls
master_ip_failover  master_ip_online_change  power_manager  send_report
[root@python3 scripts]# cp -rp ./* /usr/local/bin/
[root@python3 bin]# cd /usr/local/bin/
[root@python3 bin]# chown root.root master_ip_*
[root@python3 mha4mysql-manager-0.58]# chmod +x /usr/local/bin/*

在这里插入图片描述

命令解释
masterha_check_ssh  检查 MHA 的 SSH 配置状况
masterha_check_repl 检查 MySQL 主从复制状况
masterha_manger 启动 manager的脚本
masterha_check_status   检测当前 MHA 运行状态
masterha_master_monitor 检测 master 是否宕机
masterha_master_switch  控制故障转移(自动或者手动)
masterha_conf_host  添加或删除配置的 server 信息
masterha_stop   关闭manager
masterha_secondary_check   从远程服务器建立tcp连接
master_ip_failover 自动切换时vip管理的脚本,不是必须
master_ip_online_change 在线切换时vip的管理,不是必须
power_manager 故障发生后关闭主机的脚本,不是必须
send_report 因故障切换后发送报警的脚本,不是必须

4.daemontools工具安装

MHA管理节点上安装deamontools来将mha作为守护进程启动

[root@python3 ~]# tar xf daemontools-0.76.tar.gz
[root@python3 ~]# cd admin/
[root@python3 admin]# ls
daemontools-0.76
[root@python3 admin]# cd daemontools-0.76/
[root@python3 daemontools-0.76]# ls
package  src
[root@python3 daemontools-0.76]# cd package/
[root@python3 package]# ls
boot.inittab  boot.rclocal  commands  compile  files  install  README  run  run.inittab  run.rclocal  sharing  upgrade
[root@python3 package]# pwd
/root/admin/daemontools-0.76/package
[root@python3 package]# cd ..
[root@python3 daemontools-0.76]# ls
package  src
[root@python3 daemontools-0.76]# package/install #执行安装命令

在这里插入图片描述
报错及解决方法

[root@python3 daemontools-0.76]# package/install #执行安装命令报以下错误
./load envdir unix.a byte.a 
/usr/bin/ld: errno: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in envdir.o
/lib64/libc.so.6: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make: *** [envdir] Error 1
解决方法:
vim /root/admin/daemontools-0.76/src/conf-cc
# 在gcc 一行末尾增加
-include /usr/include/errno.h
# 更改后
gcc -O2 -Wimplicit -Wunused -Wcomment -Wchar-subscripts -Wuninitialized -Wshadow -Wcast-qual -Wcast-align -Wwrite-strings -include /usr/include/errno.h#再次执行即可安装成功

至此,MHA组件安装完毕

MHA_371">四、MHA配置

1.主库配置mha用户

mysql> grant all on *.* to 'mha'@'192.168.56.132' identified by 'jQHB31K18IBKgPsQps1Q';
Query OK, 0 rows affected, 1 warning (0.44 sec)mysql> flush privileges;
Query OK, 0 rows affected (0.18 sec)

MHA_380">2.生成MHA配置文件

在manager管理节点操作

#创建mha实例目录,名为app1
[root@python3 ~]# mkdir -p  /export/servers/mha/manager/app1/

添加以下app.cnf文件至app1目录下

[server default]
manager_workdir=/export/servers/app/mha/manager/app1 #mha实例目录
manager_log=/export/servers/app/mha/manager/app1/app.log
remote_workdir = /export/servers/app/mha/node
user=mha
password=jQHB31K18IBKgPsQps1Q #创建mha用户时设置的密码
repl_user=repl
repl_password=123456 #主从复制用户密码
ssh_user=root
master_binlog_dir=/export/servers/data/my3306/binlog #主库binlog日志位置
# 指定用于二次检查节点状态的脚本
secondary_check_script=/usr/bin/masterha_secondary_check -s 192.168.56.130 -s 192.168.190.131
ping_interval=3
master_ip_failover_script= /export/servers/app/mha/manager/app1/master_ip_failover
master_ip_online_change_script= /export/servers/app/mha/manager/app1/master_ip_online_change#主从信息
[server1]
hostname=192.168.56.130
port=3306
check_repl_delay = 0[server2]
hostname=192.168.56.131
port=3306
candidate_master=1
check_repl_delay = 0

3.manager节点拷贝master_ip_failover,master_ip_online_change文件到app1目录下并修改

因为自带的这两个perl脚本不能直接使用
master_ip_failover

[root@python3 scripts]# cp -rp /usr/local/bin/master_ip_failover /export/servers/mha/manager/app1/
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';use Getopt::Long;
use MHA::DBHelper;my $vip = "192.168.56.133";
my $net_mask = "255.255.255.0";
my $gateway = "192.168.56.2";
my $device = "ens33";
my $key = "0";
my $ssh_start_vip = "/sbin/ifconfig $device:$key $vip netmask $net_mask && /sbin/arping -c 3 -s $vip -I $device $gateway";
my $ssh_stop_vip = "/sbin/ifconfig $device:$key down";
my $exit_code = 0;my ($command,        $ssh_user,         $orig_master_host,$orig_master_ip, $orig_master_port, $new_master_host,$new_master_ip,  $new_master_port,  $new_master_user,$new_master_password
);
GetOptions('command=s'             => \$command,'ssh_user=s'            => \$ssh_user,'orig_master_host=s'    => \$orig_master_host,'orig_master_ip=s'      => \$orig_master_ip,'orig_master_port=i'    => \$orig_master_port,'new_master_host=s'     => \$new_master_host,'new_master_ip=s'       => \$new_master_ip,'new_master_port=i'     => \$new_master_port,'new_master_user=s'     => \$new_master_user,'new_master_password=s' => \$new_master_password,
);exit &main();sub main {if ( $command eq "stop" || $command eq "stopssh" ) {# $orig_master_host, $orig_master_ip, $orig_master_port are passed.# If you manage master ip address at global catalog database,# invalidate orig_master_ip here.my $exit_code = 1;eval {print "\n\n\n***************************************************************\n";print "Disabling the VIP - $vip on old master: $orig_master_host\n";print "***************************************************************\n\n\n\n";&stop_vip();$exit_code = 0;};if ($@) {warn "Got Error: $@\n";exit $exit_code;}exit $exit_code;}elsif ( $command eq "start" ) {# all arguments are passed.# If you manage master ip address at global catalog database,# activate new_master_ip here.# You can also grant write access (create user, set read_only=0, etc) here.my $exit_code = 10;eval {my $new_master_handler = new MHA::DBHelper();# args: hostname, port, user, password, raise_error_or_not$new_master_handler->connect( $new_master_ip, $new_master_port,$new_master_user, $new_master_password, 1 );## Set read_only=0 on the new master$new_master_handler->disable_log_bin_local();print "Set read_only=0 on the new master.\n";$new_master_handler->disable_read_only();## Creating an app user on the new masterprint "Creating app user on the new master..\n";$new_master_handler->enable_log_bin_local();$new_master_handler->disconnect();## Update master ip on the catalog database, etc#FIXME_xxx;print "\n\n\n***************************************************************\n";print "Enabling the VIP - $vip on new master: $new_master_host \n";print "***************************************************************\n\n\n\n";&start_vip();$exit_code = 0;};if ($@) {warn $@;# If you want to continue failover, exit 10.exit $exit_code;}exit $exit_code;}elsif ( $command eq "status" ) {# do nothingexit 0;}else {&usage();exit 1;}
}# A simple system call that enable the VIP on the new master
sub start_vip() {`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}sub usage {print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

master_ip_online_change

[root@python3 app1]# cp -rp /usr/local/bin/master_ip_online_change .
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;my $vip = "192.168.56.133";
my $net_mask = "255.255.255.0";
my $gateway = "192.168.56.2";
my $device = "ens33";
my $key = "0";
my $ssh_start_vip = "/sbin/ifconfig $device:$key $vip netmask $net_mask && /sbin/arping -c 3 -s $vip -I $device $gateway";
my $ssh_stop_vip = "/sbin/ifconfig $device:$key down";
my $exit_code = 0;my $_tstart;
my $_running_interval = 0.1;
my ($command,              $orig_master_is_new_slave, $orig_master_host,$orig_master_ip,       $orig_master_port,         $orig_master_user,$orig_master_password, $orig_master_ssh_user,     $new_master_host,$new_master_ip,        $new_master_port,          $new_master_user,$new_master_password,  $new_master_ssh_user,
);
GetOptions('command=s'                => \$command,'orig_master_is_new_slave' => \$orig_master_is_new_slave,'orig_master_host=s'       => \$orig_master_host,'orig_master_ip=s'         => \$orig_master_ip,'orig_master_port=i'       => \$orig_master_port,'orig_master_user=s'       => \$orig_master_user,'orig_master_password=s'   => \$orig_master_password,'orig_master_ssh_user=s'   => \$orig_master_ssh_user,'new_master_host=s'        => \$new_master_host,'new_master_ip=s'          => \$new_master_ip,'new_master_port=i'        => \$new_master_port,'new_master_user=s'        => \$new_master_user,'new_master_password=s'    => \$new_master_password,'new_master_ssh_user=s'    => \$new_master_ssh_user,
);exit &main();sub current_time_us {my ( $sec, $microsec ) = gettimeofday();my $curdate = localtime($sec);return $curdate . " " . sprintf( "%06d", $microsec );
}sub sleep_until {my $elapsed = tv_interval($_tstart);if ( $_running_interval > $elapsed ) {sleep( $_running_interval - $elapsed );}
}sub get_threads_util {my $dbh                    = shift;my $my_connection_id       = shift;my $running_time_threshold = shift;my $type                   = shift;$running_time_threshold = 0 unless ($running_time_threshold);$type                   = 0 unless ($type);my @threads;my $sth = $dbh->prepare("SHOW PROCESSLIST");$sth->execute();while ( my $ref = $sth->fetchrow_hashref() ) {my $id         = $ref->{Id};my $user       = $ref->{User};my $host       = $ref->{Host};my $command    = $ref->{Command};my $state      = $ref->{State};my $query_time = $ref->{Time};my $info       = $ref->{Info};$info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);next if ( $my_connection_id == $id );next if ( defined($query_time) && $query_time < $running_time_threshold );next if ( defined($command)    && $command eq "Binlog Dump" );next if ( defined($user)       && $user eq "system user" );nextif ( defined($command)&& $command eq "Sleep"&& defined($query_time)&& $query_time >= 1 );if ( $type >= 1 ) {next if ( defined($command) && $command eq "Sleep" );next if ( defined($command) && $command eq "Connect" );}if ( $type >= 2 ) {next if ( defined($info) && $info =~ m/^select/i );next if ( defined($info) && $info =~ m/^show/i );}push @threads, $ref;}return @threads;
}sub main {if ( $command eq "stop" ) {## Gracefully killing connections on the current master# 1. Set read_only= 1 on the new master# 2. DROP USER so that no app user can establish new connections# 3. Set read_only= 1 on the current master# 4. Kill current queries# * Any database access failure will result in script die.my $exit_code = 1;eval {## Setting read_only=1 on the new master (to avoid accident)my $new_master_handler = new MHA::DBHelper();# args: hostname, port, user, password, raise_error(die_on_error)_or_not$new_master_handler->connect( $new_master_ip, $new_master_port,$new_master_user, $new_master_password, 1 );print current_time_us() . " Set read_only on the new master.. ";$new_master_handler->enable_read_only();if ( $new_master_handler->is_read_only() ) {print "ok.\n";}else {die "Failed!\n";}$new_master_handler->disconnect();# Connecting to the orig master, die if any database error happensmy $orig_master_handler = new MHA::DBHelper();$orig_master_handler->connect( $orig_master_ip, $orig_master_port,$orig_master_user, $orig_master_password, 1 );## Drop application user so that nobody can connect. Disabling per-session binlog beforehand$orig_master_handler->disable_log_bin_local();print current_time_us() . " Drpping app user on the orig master..\n";#FIXME_xxx_drop_app_user($orig_master_handler);## Waiting for N * 100 milliseconds so that current connections can exitmy $time_until_read_only = 15;$_tstart = [gettimeofday];my @threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_id} );while ( $time_until_read_only > 0 && $#threads >= 0 ) {if ( $time_until_read_only % 5 == 0 ) {printf"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",current_time_us(), $#threads + 1, $time_until_read_only * 100;if ( $#threads < 5 ) {print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"foreach (@threads);}}sleep_until();$_tstart = [gettimeofday];$time_until_read_only--;@threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_id} );}## Setting read_only=1 on the current master so that nobody(except SUPER) can writeprint current_time_us() . " Set read_only=1 on the orig master.. ";$orig_master_handler->enable_read_only();if ( $orig_master_handler->is_read_only() ) {print "ok.\n";}else {die "Failed!\n";}## Waiting for M * 100 milliseconds so that current update queries can completemy $time_until_kill_threads = 5;@threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_id} );while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {if ( $time_until_kill_threads % 5 == 0 ) {printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",current_time_us(), $#threads + 1, $time_until_kill_threads * 100;if ( $#threads < 5 ) {print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"foreach (@threads);}}sleep_until();$_tstart = [gettimeofday];$time_until_kill_threads--;@threads = get_threads_util( $orig_master_handler->{dbh},$orig_master_handler->{connection_id} );}## Terminating all threadsprint current_time_us() . " Killing all application threads..\n";$orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );print current_time_us() . " done.\n";$orig_master_handler->enable_log_bin_local();$orig_master_handler->disconnect();## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCKprint "\n\n\n***************************************************************\n";print "Disabling the VIP - $vip on old master: $orig_master_host\n";print "***************************************************************\n\n\n\n";&stop_vip();$exit_code = 0;};if ($@) {warn "Got Error: $@\n";exit $exit_code;}exit $exit_code;}elsif ( $command eq "start" ) {my $exit_code = 10;eval {my $new_master_handler = new MHA::DBHelper();# args: hostname, port, user, password, raise_error_or_not$new_master_handler->connect( $new_master_ip, $new_master_port,$new_master_user, $new_master_password, 1 );## Set read_only=0 on the new master$new_master_handler->disable_log_bin_local();print current_time_us() . " Set read_only=0 on the new master.\n";$new_master_handler->disable_read_only();## Creating an app user on the new masterprint current_time_us() . " Creating app user on the new master..\n";#FIXME_xxx_create_app_user($new_master_handler);$new_master_handler->enable_log_bin_local();$new_master_handler->disconnect();## Update master ip on the catalog database, etcprint "\n\n\n***************************************************************\n";print "Enabling the VIP - $vip on new master: $new_master_host \n";print "***************************************************************\n\n\n\n";&start_vip();$exit_code = 0;};if ($@) {warn "Got Error: $@\n";exit $exit_code;}exit $exit_code;}elsif ( $command eq "status" ) {# do nothingexit 0;}else {&usage();exit 1;}
}# A simple system call that enable the VIP on the new master
sub start_vip() {`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}sub usage {print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";die;
}

以上两个文件主要修改以下配置内容即可

总结:修改/export/servers/app/mha/manager/app1/master_ip_failover与/export/servers/app/mha/manager/app1/master_ip_online_change中的vip,net_mask子网掩码, gateway网关, device网卡设备, key序号,记下面几个配置为实际值:
my $vip = "192.168.56.133";  
my $net_mask = "255.255.255.0";
my $gateway = "192.168.56.1";
my $device = "ens33";
my $key = "0";

4.检测主机ssh与数据库复制

在manager节点执行。检查ssh是否成功

[root@python3 app1]# masterha_check_ssh --conf=/export/servers/app/mha/manager/app1/app.cnf
Wed Sep 18 23:01:41 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Sep 18 23:01:41 2024 - [info] Reading application default configuration from /export/servers/mha/manager/app1/app.cnf..
Wed Sep 18 23:01:41 2024 - [info] Reading server configuration from /export/servers/mha/manager/app1/app.cnf..
Wed Sep 18 23:01:41 2024 - [info] Starting SSH connection tests..
Wed Sep 18 23:01:43 2024 - [debug] 
Wed Sep 18 23:01:41 2024 - [debug]  Connecting via SSH from root@192.168.56.131(192.168.56.131:22) to root@192.168.56.130(192.168.56.130:22)..
/usr/bin/python: No module named virtualenvwrapper
virtualenvwrapper.sh: There was a problem running the initialization hooks.If Python could not import the module virtualenvwrapper.hook_loader,
check that virtualenvwrapper has been installed for
VIRTUALENVWRAPPER_PYTHON=/usr/bin/python and that PATH is
set properly.
Wed Sep 18 23:01:42 2024 - [debug]   ok.
Wed Sep 18 23:01:43 2024 - [debug] 
Wed Sep 18 23:01:41 2024 - [debug]  Connecting via SSH from root@192.168.56.130(192.168.56.130:22) to root@192.168.56.131(192.168.56.131:22)..
/usr/bin/python: No module named virtualenvwrapper
virtualenvwrapper.sh: There was a problem running the initialization hooks.If Python could not import the module virtualenvwrapper.hook_loader,
check that virtualenvwrapper has been installed for
VIRTUALENVWRAPPER_PYTHON=/usr/bin/python and that PATH is
set properly.
Wed Sep 18 23:01:42 2024 - [debug]   ok.
Wed Sep 18 23:01:43 2024 - [info] All SSH connection tests passed successfully.

在manager节点执行。检查复制是否成功

[root@python3 app1]# masterha_check_repl --conf=/export/servers/app/mha/manager/app1/app.cnf
Wed Sep 18 21:29:52 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Sep 18 21:29:52 2024 - [info] Reading application default configuration from /export/servers/app/mha/manager/app1/app.cnf..
Wed Sep 18 21:29:52 2024 - [info] Reading server configuration from /export/servers/app/mha/manager/app1/app.cnf..
Wed Sep 18 21:29:52 2024 - [info] MHA::MasterMonitor version 0.58.
Wed Sep 18 21:29:56 2024 - [info] GTID failover mode = 1
Wed Sep 18 21:29:56 2024 - [info] Dead Servers:
Wed Sep 18 21:29:56 2024 - [info] Alive Servers:
Wed Sep 18 21:29:56 2024 - [info]   192.168.56.130(192.168.56.130:3306)
Wed Sep 18 21:29:56 2024 - [info]   192.168.56.131(192.168.56.131:3306)
Wed Sep 18 21:29:56 2024 - [info] Alive Slaves:
Wed Sep 18 21:29:56 2024 - [info]   192.168.56.131(192.168.56.131:3306)  Version=5.7.42-log (oldest major version between slaves) log-bin:enabled
Wed Sep 18 21:29:56 2024 - [info]     GTID ON
Wed Sep 18 21:29:56 2024 - [info]     Replicating from 192.168.56.130(192.168.56.130:3306)
Wed Sep 18 21:29:56 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Sep 18 21:29:56 2024 - [info] Current Alive Master: 192.168.56.130(192.168.56.130:3306)
Wed Sep 18 21:29:56 2024 - [info] Checking slave configurations..
Wed Sep 18 21:29:56 2024 - [info] Checking replication filtering settings..
Wed Sep 18 21:29:56 2024 - [info]  binlog_do_db= , binlog_ignore_db= 
Wed Sep 18 21:29:56 2024 - [info]  Replication filtering check ok.
Wed Sep 18 21:29:56 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Wed Sep 18 21:29:56 2024 - [info] Checking SSH publickey authentication settings on the current master..
/usr/bin/python: No module named virtualenvwrapper
virtualenvwrapper.sh: There was a problem running the initialization hooks.If Python could not import the module virtualenvwrapper.hook_loader,
check that virtualenvwrapper has been installed for
VIRTUALENVWRAPPER_PYTHON=/usr/bin/python and that PATH is
set properly.
Wed Sep 18 21:29:56 2024 - [info] HealthCheck: SSH to 192.168.56.130 is reachable.
Wed Sep 18 21:29:56 2024 - [info] 
192.168.56.130(192.168.56.130:3306) (current master)+--192.168.56.131(192.168.56.131:3306)Wed Sep 18 21:29:56 2024 - [info] Checking replication health on 192.168.56.131..
Wed Sep 18 21:29:56 2024 - [info]  ok.
Wed Sep 18 21:29:56 2024 - [info] Checking master_ip_failover_script status:
Wed Sep 18 21:29:56 2024 - [info]   /export/servers/app/mha/manager/app1/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.56.130 --orig_master_ip=192.168.56.130 --orig_master_port=3306 
Wed Sep 18 21:29:56 2024 - [info]  OK.
Wed Sep 18 21:29:56 2024 - [warning] shutdown_script is not defined.
Wed Sep 18 21:29:56 2024 - [info] Got exit code 0 (Not master dead).MySQL Replication Health is OK.

5.主库启动VIP

主库服务器操作

在主库上手动启vip:192.168.56.133,注意将vip换成实际vip,网关换成实际网关:
[root@python1 ~]# /sbin/ifconfig ens33:0 192.168.56.133&&/sbin/arping -I ens33 -c 3 -s 192.168.56.133 192.168.56.255
ARPING 192.168.56.255 from 192.168.56.133 ens33
Sent 3 probes (3 broadcast(s))
Received 0 response(s)执行ifconfig查看vip是否启动
[root@python1 run]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 192.168.56.130  netmask 255.255.255.0  broadcast 192.168.56.255ether 00:50:56:31:e6:79  txqueuelen 1000  (Ethernet)RX packets 177145  bytes 13064094 (12.4 MiB)RX errors 0  dropped 0  overruns 0  frame 0TX packets 879570  bytes 1920161943 (1.7 GiB)TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0ens33:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500inet 192.168.56.133  netmask 255.255.255.0  broadcast 192.168.56.255ether 00:50:56:31:e6:79  txqueuelen 1000  (Ethernet)

在这里插入图片描述
至此,vip开启成功,mha配置完成

五、手动切换测试

1.online切换测试

原主库为192.168.56.130,设置new_master_host=192.168.56.131进行在线切换

1.1.原主库信息查看

mysql> show variables like '%read_only%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_read_only      | OFF   |
| read_only             | OFF   |
| super_read_only       | OFF   |
| transaction_read_only | OFF   |
| tx_read_only          | OFF   |
+-----------------------+-------+
5 rows in set (0.00 sec)mysql> show master status\G;
*************************** 1. row ***************************File: mysql-bin.000010Position: 194Binlog_Do_DB: Binlog_Ignore_DB: 
Executed_Gtid_Set: eb9c1fa5-7421-11ef-a597-00505631e679:1-2005
1 row in set (0.00 sec)ERROR: 
No query specified#vip此时绑定在130主库机器上
[root@python1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00inet 127.0.0.1/8 scope host lovalid_lft forever preferred_lft foreverinet6 ::1/128 scope host valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000link/ether 00:50:56:31:e6:79 brd ff:ff:ff:ff:ff:ffinet 192.168.56.130/24 brd 192.168.56.255 scope global noprefixroute ens33valid_lft forever preferred_lft foreverinet 192.168.56.133/24 brd 192.168.56.255 scope global secondary ens33:0valid_lft forever preferred_lft foreverinet6 fe80::250:56ff:fe31:e679/64 scope link valid_lft forever preferred_lft forever

1.2.原从库信息查看

mysql> show variables like '%read_only%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_read_only      | OFF   |
| read_only             | ON    |
| super_read_only       | ON    |
| transaction_read_only | OFF   |
| tx_read_only          | OFF   |
+-----------------------+-------+
5 rows in set (0.00 sec)mysql> show slave status\G;
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.56.130Master_User: replMaster_Port: 3306Connect_Retry: 60Master_Log_File: mysql-bin.000010Read_Master_Log_Pos: 194Relay_Log_File: mysqld-relay-bin.000014Relay_Log_Pos: 367Relay_Master_Log_File: mysql-bin.000010Slave_IO_Running: YesSlave_SQL_Running: Yes........

在这里插入图片描述

1.3.在线切换前确保 mha节点中的 manager是关闭状态

[root@python3 ~]# ps -ef |grep mha
root       9233   9197  0 23:28 pts/0    00:00:00 grep --color=auto mha[root@python3 ~]#  masterha_check_status --conf=/export/servers/app/mha/manager/app1/app.cnf
app1 is stopped(2:NOT_RUNNING).

1.4.开始执行在线切换命令

[root@python3 ~]# masterha_master_switch --conf=/export/servers/app/mha/manager/app1/app.cnf \--master_state=alive --new_master_host=192.168.56.131 --new_master_port=3306 \--orig_master_is_new_slave --interactive=0
Thu Sep 19 23:21:58 2024 - [info] MHA::MasterRotate version 0.56.
Thu Sep 19 23:21:58 2024 - [info] Starting online master switch..
Thu Sep 19 23:21:58 2024 - [info] 
Thu Sep 19 23:21:58 2024 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 19 23:21:58 2024 - [info] 
Thu Sep 19 23:21:58 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Sep 19 23:21:58 2024 - [info] Reading application default configuration from /export/servers/mha/manager/app1/app.cnf..
Thu Sep 19 23:21:58 2024 - [info] Reading server configuration from /export/servers/mha/manager/app1/app.cnf..
Thu Sep 19 23:21:59 2024 - [warning] SQL Thread is stopped(no error) on 192.168.56.130(192.168.56.130:3306)
Thu Sep 19 23:21:59 2024 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.56.130(192.168.56.130:3306)
Thu Sep 19 23:21:59 2024 - [info] Master configurations are as below: 
Master 192.168.56.130(192.168.56.130:3306), replicating from 192.168.56.131(192.168.56.131:3306)
Master 192.168.56.131(192.168.56.131:3306), replicating from 192.168.56.130(192.168.56.130:3306), read-onlyThu Sep 19 23:21:59 2024 - [info] GTID failover mode = 1
Thu Sep 19 23:21:59 2024 - [info] Current Alive Master: 192.168.56.130(192.168.56.130:3306)
Thu Sep 19 23:21:59 2024 - [info] Alive Slaves:
Thu Sep 19 23:21:59 2024 - [info]   192.168.56.131(192.168.56.131:3306)  Version=5.7.42-log (oldest major version between slaves) log-bin:enabled
Thu Sep 19 23:21:59 2024 - [info]     GTID ON
Thu Sep 19 23:21:59 2024 - [info]     Replicating from 192.168.56.130(192.168.56.130:3306)
Thu Sep 19 23:21:59 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 19 23:21:59 2024 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Sep 19 23:21:59 2024 - [info]  ok.
Thu Sep 19 23:21:59 2024 - [info] Checking MHA is not monitoring or doing failover..
Thu Sep 19 23:21:59 2024 - [info] Checking replication health on 192.168.56.131..
Thu Sep 19 23:21:59 2024 - [info]  ok.
Thu Sep 19 23:21:59 2024 - [info] 192.168.56.131 can be new master.
Thu Sep 19 23:21:59 2024 - [info] 
From:
192.168.56.130(192.168.56.130:3306) (current master)+--192.168.56.131(192.168.56.131:3306)To:
192.168.56.131(192.168.56.131:3306) (new master)+--192.168.56.130(192.168.56.130:3306)
Thu Sep 19 23:21:59 2024 - [info] Checking whether 192.168.56.131(192.168.56.131:3306) is ok for the new master..
Thu Sep 19 23:21:59 2024 - [info]  ok.
Thu Sep 19 23:21:59 2024 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Sep 19 23:21:59 2024 - [info] 
Thu Sep 19 23:21:59 2024 - [info] * Phase 2: Rejecting updates Phase..
Thu Sep 19 23:21:59 2024 - [info] 
Thu Sep 19 23:21:59 2024 - [info] Executing master ip online change script to disable write on the current master:
Thu Sep 19 23:21:59 2024 - [info]   /export/servers/app/mha/manager/app1/master_ip_online_change --command=stop --orig_master_host=192.168.56.130 --orig_master_ip=192.168.56.130 --orig_master_port=3306 --orig_master_user='mha' --orig_master_password='jQHB31K18IBKgPsQps1Q' --new_master_host=192.168.56.131 --new_master_ip=192.168.56.131 --new_master_port=3306 --new_master_user='mha' --new_master_password='jQHB31K18IBKgPsQps1Q' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave
Thu Sep 19 23:22:00 2024 032251 Set read_only on the new master.. ok.
Thu Sep 19 23:22:00 2024 037554 Drpping app user on the orig master..
Thu Sep 19 23:22:00 2024 038610 Waiting all running 1 threads are disconnected.. (max 1500 milliseconds)
{'Time' => '3004','db' => undef,'Id' => '97','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.56.131:42612'}
Thu Sep 19 23:22:00 2024 543946 Waiting all running 1 threads are disconnected.. (max 1000 milliseconds)
{'Time' => '3004','db' => undef,'Id' => '97','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.56.131:42612'}
Thu Sep 19 23:22:01 2024 048719 Waiting all running 1 threads are disconnected.. (max 500 milliseconds)
{'Time' => '3005','db' => undef,'Id' => '97','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.56.131:42612'}
Thu Sep 19 23:22:01 2024 553285 Set read_only=1 on the orig master.. ok.
Thu Sep 19 23:22:01 2024 557665 Waiting all running 1 queries are disconnected.. (max 500 milliseconds)
{'Time' => '3005','db' => undef,'Id' => '97','User' => 'repl','State' => 'Master has sent all binlog to slave; waiting for more updates','Command' => 'Binlog Dump GTID','Info' => undef,'Host' => '192.168.56.131:42612'}
Thu Sep 19 23:22:02 2024 058575 Killing all application threads..
Thu Sep 19 23:22:02 2024 060537 done.***************************************************************
Disabling the VIP - 192.168.56.133 on old master: 192.168.56.130
***************************************************************/usr/bin/python: No module named virtualenvwrapper
virtualenvwrapper.sh: There was a problem running the initialization hooks.If Python could not import the module virtualenvwrapper.hook_loader,
check that virtualenvwrapper has been installed for
VIRTUALENVWRAPPER_PYTHON=/usr/bin/python and that PATH is
set properly.
Thu Sep 19 23:22:02 2024 - [info]  ok.
Thu Sep 19 23:22:02 2024 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Sep 19 23:22:02 2024 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Sep 19 23:22:02 2024 - [info]  ok.
Thu Sep 19 23:22:02 2024 - [info] Orig master binlog:pos is mysql-bin.000010:194.
Thu Sep 19 23:22:02 2024 - [info]  Waiting to execute all relay logs on 192.168.56.131(192.168.56.131:3306)..
Thu Sep 19 23:22:02 2024 - [info]  master_pos_wait(mysql-bin.000010:194) completed on 192.168.56.131(192.168.56.131:3306). Executed 0 events.
Thu Sep 19 23:22:02 2024 - [info]   done.
Thu Sep 19 23:22:02 2024 - [info] Getting new master's binlog name and position..
Thu Sep 19 23:22:02 2024 - [info]  mysql-bin.000005:194
Thu Sep 19 23:22:02 2024 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.56.131', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Sep 19 23:22:02 2024 - [info] Executing master ip online change script to allow write on the new master:
Thu Sep 19 23:22:02 2024 - [info]   /export/servers/app/mha/manager/app1/master_ip_online_change --command=start --orig_master_host=192.168.56.130 --orig_master_ip=192.168.56.130 --orig_master_port=3306 --orig_master_user='mha' --orig_master_password='jQHB31K18IBKgPsQps1Q' --new_master_host=192.168.56.131 --new_master_ip=192.168.56.131 --new_master_port=3306 --new_master_user='mha' --new_master_password='jQHB31K18IBKgPsQps1Q' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave
Thu Sep 19 23:22:02 2024 621597 Set read_only=0 on the new master.
Thu Sep 19 23:22:02 2024 623264 Creating app user on the new master..
***************************************************************
Enabling the VIP - 192.168.56.133 on new master: 192.168.56.131 
***************************************************************
Thu Sep 19 23:22:05 2024 - [info]  ok.
Thu Sep 19 23:22:05 2024 - [info] 
Thu Sep 19 23:22:05 2024 - [info] * Switching slaves in parallel..
Thu Sep 19 23:22:05 2024 - [info] 
Thu Sep 19 23:22:05 2024 - [info] Unlocking all tables on the orig master:
Thu Sep 19 23:22:05 2024 - [info] Executing UNLOCK TABLES..
Thu Sep 19 23:22:05 2024 - [info]  ok.
Thu Sep 19 23:22:05 2024 - [info] Starting orig master as a new slave..
Thu Sep 19 23:22:05 2024 - [info]  Resetting slave 192.168.56.130(192.168.56.130:3306) and starting replication from the new master 192.168.56.131(192.168.56.131:3306)..
Thu Sep 19 23:22:05 2024 - [info]  Executed CHANGE MASTER.
Thu Sep 19 23:22:05 2024 - [info]  Slave started.
Thu Sep 19 23:22:05 2024 - [info] All new slave servers switched successfully.
Thu Sep 19 23:22:05 2024 - [info] 
Thu Sep 19 23:22:05 2024 - [info] * Phase 5: New master cleanup phase..
Thu Sep 19 23:22:05 2024 - [info] 
Thu Sep 19 23:22:06 2024 - [info]  192.168.56.131: Resetting slave info succeeded.
Thu Sep 19 23:22:06 2024 - [info] Switching master to 192.168.56.131(192.168.56.131:3306) completed successfully.

当出现Switching master to 192.168.56.131(192.168.56.131:3306) completed successfully.字样代表在线切换成功,接着可以检查新主从库的相关信息

1.5.切换结果检查

检查192.168.56.131是否为新主库

mysql> show slave status\G; #当在192.168.56.131执行查看slave状态信息时,提示为空
Empty set (0.00 sec)ERROR: 
No query specifiedmysql> show master status\G;
*************************** 1. row ***************************File: mysql-bin.000005Position: 194Binlog_Do_DB: Binlog_Ignore_DB: 
Executed_Gtid_Set: eb9c1fa5-7421-11ef-a597-00505631e679:1-2005
1 row in set (0.00 sec)ERROR: 
No query specifiedmysql> show variables like '%read_only%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_read_only      | OFF   |
| read_only             | OFF   |
| super_read_only       | OFF   |
| transaction_read_only | OFF   |
| tx_read_only          | OFF   |
+-----------------------+-------+
5 rows in set (0.00 sec)

同时vip192.168.56.133也漂移到了131机器上
在这里插入图片描述
检查192.168.56.130是否为新从库

mysql> show variables like '%read_only%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_read_only      | OFF   |
| read_only             | ON    |
| super_read_only       | ON    |
| transaction_read_only | OFF   |
| tx_read_only          | OFF   |
+-----------------------+-------+
5 rows in set (0.00 sec)mysql> show slave status\G;
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.56.131Master_User: replMaster_Port: 3306Connect_Retry: 60Master_Log_File: mysql-bin.000005Read_Master_Log_Pos: 194Relay_Log_File: mysqld-relay-bin.000002Relay_Log_Pos: 367Relay_Master_Log_File: mysql-bin.000005Slave_IO_Running: YesSlave_SQL_Running: Yes......................

在这里插入图片描述
此时,在线切换演示完成,同理,也可以将新主库192.168.56.131切换为192.168.56.130,只需要替换--new_master_host=192.168.56.130这个即可,检查切换是否成功也用如上方法即可

2.failover切换测试

2.1.手动启动masterha_manager进程

[root@python3 app1]# masterha_manager --conf=/export/servers/app/mha/manager/app1/app.cnf
Thu Sep 19 11:26:41 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Sep 19 11:26:41 2024 - [info] Reading application default configuration from /export/servers/app/mha/manager/app1/app.cnf..
Thu Sep 19 11:26:41 2024 - [info] Reading server configuration from /export/servers/app/mha/manager/app1/app.cnf..
Thu Sep 19 11:27:25 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Sep 19 11:27:25 2024 - [info] Reading application default configuration from /export/servers/app/mha/manager/app1/app.cnf..
Thu Sep 19 11:27:25 2024 - [info] Reading server configuration from /export/servers/app/mha/manager/app1/app.cnf..

mysql_1221">2.2.执行failover切换测试,通过手动kill主库192.168.56.130的mysql进程来模拟主库挂掉

[root@python1 ~]# kill -9 主库进程pid

2.3.此时从库192.168.56.131自动提升为新主库,主库变为192.168.56.131,原主库192.168.56.130起来后,需要手动配置为从库

#在192.168.56.130机器上操作
mysql> change master to master_host='192.168.56.131', master_port=3306, master_user='repl' , master_password='123456' ,master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.05 sec)mysql> show slave status\G;
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.56.131Master_User: replMaster_Port: 3306Connect_Retry: 60Master_Log_File: mysql-bin.000002Read_Master_Log_Pos: 194Relay_Log_File: mysqld-relay-bin.000002Relay_Log_Pos: 367Relay_Master_Log_File: mysql-bin.000002Slave_IO_Running: YesSlave_SQL_Running: Yes

注意: 发生一次切换后masterha_manager进程自动停止。此时去manager节点查看manager进程,发现不存在

[root@python3 app1]# ps -ef |grep manager
root       9821   3952  0 11:29 pts/0    00:00:00 grep --color=auto manager

MHA_1251">六、配置MHA守护进程

创建实例app1的守护进程文件,即/export/servers/mha/manager/app1/run文件

[root@python3 ~]# cat >>/export/servers/app/mha/manager/app1/run <<EOF
> #!/bin/sh
> exec masterha_manager --conf=/export/servers/app/mha/manager/app1/app.cnf --wait_on_monitor_error=60 --wait_on_failover_error=60 >> /export/servers/app/mha/manager/app1/app.log 2>&1
> EOF

给run文件授权

[root@python3 ~]# chmod 755 /export/servers/app/mha/manager/app1/run

七、启动mha守护进程

配置完成后即可使用daemontool来启动/停止mha监控进程

启动mha监控:
[root@python3 ~]# svc -u /export/servers/app/mha/manager/app1/
svc: warning: unable to control /export/servers/app/mha/manager/app1/: file does not exist
启动的时候如果报下面的错误:
svc: warning: unable to control /export/servers/app/mha/manager/app1/: file does not exist按照下面解决:
[root@python3 app1]# nohup supervise /export/servers/app/mha/manager/app1/ > svsrun.log 2>&1 &
直接使用上面的supervise启动,然后会自动生成supervise目录,我们就可以使用svc命令了停止mha监控
[root@python3 ~]# svc -d /export/servers/mha/manager/app1/
[root@python3 app1]# ps -ef |grep mha
root      10525  10419  0 11:51 pts/0    00:00:00 supervise /export/servers/app/mha/manager/app1/
root      10526  10525  4 11:51 pts/0    00:00:00 perl /usr/local/bin/masterha_manager --conf=/export/servers/app/mha/manager/app1/app.cnf --wait_on_monitor_error=60 --wait_on_failover_error=60[root@python3 app1]# svc -d /export/servers/app/m
mha/          mysql-5.7.42/ 
[root@python3 app1]# svc -d /export/servers/mha/manager/app1/ #停止守护进程
[root@python3 app1]# ps -ef |grep mha
root      10525  10419  0 11:51 pts/0    00:00:00 supervise /export/servers/mha/manager/app1/
root      10629  10419  0 11:53 pts/0    00:00:00 grep --color=auto mha

八、启动守护进程后Failover切换测试

mha监控进程作为守护进程启动后,我们需要测试mysql故障后是否能够自动切换,这里我们还是手动杀掉192.168.56.131主库上的mysql进程,此时从库192.168.56.130自动提升为新主库,主库变为192.168.56.130,自动切换成功,待原主库192.168.56.131起来后,需要手动重新配置为从库

1.manager节点上启动守护进程

[root@python3 ~]# svc -u /export/servers/app/mha/manager/app1/
[root@python3 ~]# ps -ef |grep mha
root       9692   9691  1 23:44 pts/0    00:00:00 perl /usr/local/bin/masterha_manager --conf=/export/servers/app/mha/manager/app1/app.cnf --wait_on_monitor_error=60 --wait_on_failover_error=60[root@python3 app1]# masterha_check_status --conf=/export/servers/app/mha/manager/app1/app.cnf
app (pid:10712) is running(0:PING_OK), master:192.168.56.130

2.192.168.56.130主库手动kill数据库进程

[root@python1 ~]# ps -ef  |grep mysql
root       4028      1  0 22:09 pts/0    00:00:00 /bin/sh /export/servers/app/mysql-5.7.42/bin/mysqld_safe --defaults-file=/export/servers/data/my3306/my.cnf --user=mysql
mysql      5483   4028  0 22:10 pts/0    00:00:14 /export/servers/app/mysql-5.7.42/bin/mysqld --defaults-file=/export/servers/data/my3306/my.cnf --basedir=/export/servers/app/mysql-5.7.42 --datadir=/export/servers/data/my3306/data --plugin-dir=/export/servers/app/mysql-5.7.42/lib/plugin --user=mysql --log-error=/export/servers/data/my3306/log/mysqld-err.log --open-files-limit=655340 --pid-file=/export/servers/data/my3306/run/mysqld.pid --socket=/export/servers/data/my3306/run/mysqld.sock --port=3306
[root@python1 mha4mysql-node-0.56]# kill -9 4028  5483  

3.验证

kill原主库192.168.56.130mysql进程后,vip漂移到了192.168.56.131机器上,如下所示
在这里插入图片描述
在这里插入图片描述
原主库192.168.56.130起来后,需要手动重新配置为从库

[root@python1 ~]# mysql -uroot -p -S /export/servers/data/my3306/run/mysqld.sock
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.42-log Source distributionCopyright (c) 2000, 2023, Oracle and/or its affiliates.Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.mysql> change master to master_host='192.168.56.131', master_port=3306,master_user='repl',master_password='123456' ,master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.05 sec)mysql> start slave;
Query OK, 0 rows affected (0.05 sec)mysql> show slave status\G;
*************************** 1. row ***************************Slave_IO_State: Master_Host: 192.168.56.131Master_User: replMaster_Port: 3306Connect_Retry: 60Master_Log_File: Read_Master_Log_Pos: 4Relay_Log_File: mysqld-relay-bin.000001Relay_Log_Pos: 4Relay_Master_Log_File: Slave_IO_Running: YesSlave_SQL_Running: Yes
也可以观察mha日志/export/servers/app/mha/manager/app1/app.log是否有错误信息。
因为这个app.log日志在上述第六部分配置MHA守护进程进行了配置,也可以查看日志

在这里插入图片描述


总结

以上就是要分享的内容,实现了一主一从一MHA的搭建过程,并实现了在线主从切换测试、主库宕机离线测试等功能,它还有具体的命令使用方法,等后续完善!!!


http://www.ppmy.cn/ops/115946.html

相关文章

TFTP协议

目录 一、TFTP协议概述 1.1 TFTP协议简介 1.2 TFTP协议特点 二、TFTP协议原理 2.1 TFTP协议工作流程 2.2 TFTP协议数据包格式 三、TFTP协议应用场景 3.1 网络设备配置文件传输 3.2 虚拟机镜像文件传输 3.3 IoT设备固件升级 四、TFTP协议优化方法 4.1 增加超时重传机…

大佬,简单解释下“嵌入式软件开发”和“嵌入式硬件开发”的区别

在开始前刚好我有一些资料&#xff0c;是我根据网友给的问题精心整理了一份「嵌入式的资料从专业入门到高级教程」&#xff0c; 点个关注在评论区回复“888”之后私信回复“888”&#xff0c;全部无偿共享给大家&#xff01;&#xff01;&#xff01;首先&#xff0c;嵌入式硬…

Vue 中 watch 的使用方法及注意事项

前言 Vue 的 Watch 是一个非常有用的功能&#xff0c;它能够监听 Vue 实例数据的变化并执行相应的操作。本篇文章将详细介绍 Vue Watch 的使用方法和注意事项&#xff0c;让你能够充分利用 Watch 来解决 Vue 开发中的各种问题。 1. Watch 是什么&#xff1f; 1.1 Watch 的作…

VScode配置连接远程服务器configure ssh Hosts

VScode配置连接远程服务器&#xff0c;具体步骤 一、点击VScode左下脚这两个∟的按钮 二、点击完上面的按钮后&#xff0c;出现如下的下拉选项&#xff0c;选择“Connect to Host” 三、选择“Connect to Host”后&#xff0c;下拉选项会更新&#xff0c;选择“Configure SSH …

nginx_单机平滑升级

#!/bin/bash# 定义要下载的 Nginx 源码包的 URL 和保存路径 nginx_tar"http://nginx.org/download/nginx-1.19.0.tar.gz" nginx_tar_file"/tmp/nginx-1.19.0.tar.gz" nginx_version"nginx-1.19.0" nginx_path$(which nginx) # 获取 Nginx 的路径…

Android Manifest权限清单

Android权限部分可分为安装权限、运行时权限、特殊权限。 其中安装权限分普通权限和签名权限&#xff1a;普通权限安装后就有&#xff0c;无需重新授权&#xff1b;签名权限则需要系统签名才有的权限&#xff1b; 特殊权限则需要打开指定的系统页面进行授权&#xff0c;当然使用…

毕业设计选题:基于ssm+vue+uniapp的面向企事业单位的项目申报小程序

开发语言&#xff1a;Java框架&#xff1a;ssmuniappJDK版本&#xff1a;JDK1.8服务器&#xff1a;tomcat7数据库&#xff1a;mysql 5.7&#xff08;一定要5.7版本&#xff09;数据库工具&#xff1a;Navicat11开发软件&#xff1a;eclipse/myeclipse/ideaMaven包&#xff1a;M…

python爬虫:从12306网站获取火车站信息

代码逻辑 初始化 (init 方法)&#xff1a; 设置请求头信息。设置车站版本号。 同步车站信息 (synchronization 方法)&#xff1a; 发送GET请求获取车站信息。返回服务器响应的文本。 提取信息 (extract 方法)&#xff1a; 从服务器响应中提取车站信息字符串。去掉字符串末尾的…