【PG12.9-repmgr5.1.0-vip-3节点部署】

news/2024/11/23 9:36:03/

PG12.9-Repmgr5.1.0-vip-3节点部署

Centos 7.6
PG 12.9
repmgr 5.1.0


1.系统环境准备

   主机名  	  ip地址    	 角色 	端口 enmo-01	172.20.10.1	 主库 	6000enmo-02	172.20.10.2	 备库 	6000enmo-03	172.20.10.3	 备库 	6000准备将172.20.10.6设置为vip

修改网卡(刷新一下 mac 地址)

需要所有的机器在同一个网段,防止 vip 无法绑定生效
vi /etc/sysconfig/network-scripts/ifcfg-ens33ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPV6INIT=no
IPADDR=172.20.10.1
NETMASK=255.255.255.0
GATEWAY=172.20.10.10
DNS=8.8.8.8

配置 hosts

 vi /etc/ssh/sshd_configUseDNS 设置为 no  修改主机名 hostnamectl set-hostname enmo-01 hostnamectl set-hostname enmo-02hostnamectl set-hostname enmo-03 修改 hosts 文件 echo "172.20.10.1 enmo-01  172.20.10.2 enmo-02172.20.10.3 enmo-03" >>/etc/hosts 

同步时间

 所有服务器的时区及时间需要保持同步 timedatectl set-timezone Asia/Shanghai 

关闭防火墙和 selinux

systemctl stop firewalld 
systemctl disable firewalld vi /etc/selinux/config 
SELINUX=disabled 

用户配置

创建安装用户
groupadd dba -g 1000 
useradd postgres -g 1000 -u 1000 添加 sudo 权限
启动服务及管理 VIP 使用,权限可细化,忽略此步骤 vip 漂移会失败
visudo -f /etc/sudoers 
postgres ALL=(ALL) NOPASSWD:ALL

reboot now

重启使sudo生效


2.安装主数据库(源码安装)

此处仅主库装数据库,备库需要有数据库软件。

安装依赖

yum install -y  openssl openssl-devel readline readline-devel zlib-devel gcc flex

切换到postgres用户

su - postgres

获取并解压包

wget https://ftp.postgresql.org/pub/source/v12.9/postgresql-12.9.tar.gz  
tar -zxf postgresql-12.9.tar.gz

–进入目录

cd postgresql-12.9

–编译环境检查

./configure --prefix=/home/postgres/soft --with-openssl --with-pgport=6000(最后出现makefile)

–编译安装

make world -j24
make install-world -j24

–初始化数据库数据目录到/home/postgres/data并启用checksum检验

/home/postgres/soft/bin/initdb -D /home/postgres/data -k

–启动数据库

/home/postgres/soft/bin/pg_ctl start -D /home/postgres/data -l /tmp/logfile

–配置环境变量.bashrc

vi .bashrc
export PGPORT=6000
export PGUSER=postgres
export PGHOME=/home/postgres/soft
export PGDATA=/home/postgres/data
export PATH=${PGHOME}/bin:${PATH}
LD_LIBRARY_PATH=$PGHOME/lib:/usr/local/lib:/usr/local/lib64:/usr/lib64:$LD_LIBRARY_PATH

–读取环境变量

source .bashrc

–登录到数据库

psql -d postgres

–创建超级用户 postgres

CREATE USER postgres WITH SUPERUSER PASSWORD '1QAZ@WSX';

–修改端口(如果./configure没指定–with-pgport)

alter system set port=6000;select pg_reload_conf();


3.数据库配置

su - postgres

mkdir -p /home/postgres/archive/wal

cluster_name='enmo_6001'
listen_addresses='0.0.0.0'
port=6000
wal_log_hints=on
logging_collector=on
logging_collector=on
log_filename='pg_log_%u.log'
log_file_mode=0600
log_truncate_on_rotation=on
log_rotation_age=1d
wal_keep_segments=100
archive_mode = on
archive_command = 'cp %p /home/postgres/archive/wal/%f'
wal_level = 'logical'
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on

修改参数,其余参数根据环境进行调整。

重启数据库生效

pg_ctl restart -D $PGDATA -l /tmp/logfile


4.节点间配置互信

–配置集群内postgres用户之间互信

–enmo-01

su - postgres
ssh-keygen -t rsa
cd ~/.ssh
ssh-copy-id -i postgres@enmo-02

–enmo-02

su - postgres
ssh-keygen -t rsa
cd ~/.ssh
cat *.pub >>  authorized_keys
scp authorized_keys postgres@enmo-01:/home/postgres/.ssh

上述过程为配置两节点间的postgres的用户的互信。两两节点之间需要做一次。用date 检验,确保集群内节点能通过postgres用户互相访问。

ssh enmo-01 date;
ssh enmo-02 date;
ssh enmo-03 date;


5.安装repmgr(源码安装)并配置白名单

所有节点都需要安装repmgr工具

–解压压缩包

su - postgres
tar -zxf repmgr-5.1.0.tar.gz
cd repmgr-5.1.0
#环境变量
pg_config验证

–编译安装repmgr

su - postgres
cd /home/postgres/repmgr-5.1.0
./configure
make -j24
make install -j24

–查看版本

/home/postgres/repmgr-5.1.0/repmgr --version 

–安装repmgr扩展

$ psql -d postgres

#主库创建 repmgr 库存储元数据
create user repmgr superuser password '123456';
create database repmgr owner repmgr;#在postgresql.conf的shared_preload_libraries加上repmgr
vim $PGDATA/postgresql.conf
shared_preload_libraries = 'repmgr'\c repmgr  
create extension repmgr; 

–pg_hba配置

node11 节点配置 pg_hba.conf 文件
vi $PGDATA/pg_hba.conflocal repmgr repmgr trust
host repmgr repmgr 172.20.10.1/32 trust
host repmgr repmgr 172.20.10.2/32 trust
host repmgr repmgr 172.20.10.3/32 trust
host repmgr repmgr 172.20.10.6/32 trust
host all all 172.20.10.1/24 scram-sha-256host replication repmgr 172.20.10.1/32 trust
host replication repmgr 172.20.10.2/32 trust
host replication repmgr 172.20.10.3/32 trust
host replication repmgr 172.20.10.6/32 trust

–这边将repmgr之间用trust,其他ip其他用户登录需要用密码。如果想让repmgr在此处也为密码校验,则需要将trust改为md5或者scram-sha-256并在本地创建.pgpass文件,操作如下:

su - postgres
touch ~/.pgpass  
chmod 0600 ~/.pgpass echo "172.20.10.1:6000:repmgr:repmgr:123456 
172.20.10.2:6000:repmgr:repmgr:123456
172.20.10.3:6000:repmgr:repmgr:123456 
172.20.10.4:6000:repmgr:repmgr:123456 ">> ~/.pgpass 

使用密码校验需要注意密码的加密格式,留意pg_authid系统表和password_encryption参数。

–实例检查

只保留主节点运行 pg 实例,如果其他节点有实例运行,停止后删除 pgdata 目录

其他所有节点执行如下操作
su - postgres 
pg_ctl stop -D $PGDATA
rm -rf $PGDARA/*


6.主库repmgr配置及注册

(1)主节点创建repmgr.conf

touch /home/postgres/repmgr.conf 
vi /home/postgres/repmgr.conf node_id=1
node_name='enmo_6001'
conninfo='host=172.20.10.1 port=6000 user=repmgr dbname=repmgr 
connect_timeout=2'
data_directory='/data/pgdata'
replication_user='repmgr'
replication_type='physical'
pg_bindir='/opt/postgresql-12.9/bin'
monitoring_history=yes
monitor_interval_secs=5
log_level='debug'
log_file='/home/postgres/repmgr.log'
failover='automatic'
connection_check_type=ping
reconnect_attempts=3
reconnect_interval=10 
promote_command='/home/postgres/repmgr_promote.sh'
follow_command='/home/postgres/repmgr_follow.sh %n'

上面主要修改的部分为:

node-id  三个节点为1,2,3,用于切换的id
node_name  主要为标识,设置与数据库cluster_name相同
conninfo  设置本地用于连接的repmgr数据库等信息

(2)配置提升为主库需要的 repmgr_promote.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.2 "/bin/sudo /usr/sbin/ip addr del 172.10.20.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.3 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote start" >> /home/postgres/repmgr.log
/home/postgres/repmgr-5.1.0/repmgr standby promote -f /home/postgres/repmgr.conf --log-to-file
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP start" >> /home/postgres/repmgr.log 
/bin/sudo /usr/sbin/ifconfig ens33:0 172.20.10.6 netmask 255.255.255.0

上边几部分分别为

下掉2,3节点的vip
提升本地数据库角色为主库
在本地挂上vip

(3)配置赋予新主跟随的 repmgr_follow.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.1 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: follow %n" >> /home/postgres/repmgr.log 
/home/postgres/repmgr-5.1.0/repmgr standby follow -f /home/postgres/repmgr.conf --upstream-node-id=%n --log-to-file

上边两部分为

下掉本地的vip
让本地备机跟随新主

给两个脚本权限

chmod 755 repmgr_promote.sh 
chmod 755 repmgr_follow.sh 

(4)主库注册

/home/postgres/repmgr-5.1.0/repmgr -f /home/postgres/repmgr.conf primary register 

INFO: connecting to primary database…

DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr

host=172.20.10.1 port=6000 fallback_application_name=repmgr"

INFO: “repmgr” extension is already installed

NOTICE: primary node record (ID: 11) registered

查看集群状态,此时集群里只有一个主库

/home/postgres/repmgr-5.1.0/repmgr -f /home/postgres/repmgr.conf cluster show


7.备库repmgr配置及从库注册

上面已经说到所有的节点都需要安装repmgr

如下为2,3节点的配置文件及脚本。

节点2:

(1)主节点创建repmgr.conf

touch /home/postgres/repmgr.conf 
vi /home/postgres/repmgr.conf node_id=2
node_name='enmo_6002'
conninfo='host=172.20.10.2 port=6000 user=repmgr dbname=repmgr 
connect_timeout=2'
data_directory='/data/pgdata'
replication_user='repmgr'
replication_type='physical'
repmgr_bindir='/home/postgres/repmgr-5.1.0'
pg_bindir='/opt/postgresql-12.9/bin'
monitoring_history=yes
monitor_interval_secs=5
log_level='debug'
log_file='/home/postgres/repmgr.log'
failover='automatic'
connection_check_type=ping
reconnect_attempts=3
reconnect_interval=10 
promote_command='/home/postgres/repmgr_promote.sh'
follow_command='/home/postgres/repmgr_follow.sh %n'

(2)配置提升为主库需要的 repmgr_promote.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.1 "/bin/sudo /usr/sbin/ip addr del 172.10.20.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.3 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote start" >> /home/postgres/repmgr.log
/home/postgres/repmgr-5.1.0/repmgr standby promote -f /home/postgres/repmgr.conf --log-to-file
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP start" >> /home/postgres/repmgr.log 
/bin/sudo /usr/sbin/ifconfig ens33:0 172.20.10.6 netmask 255.255.255.0

(3)配置赋予新主跟随的 repmgr_follow.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.2 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: follow %n" >> /home/postgres/repmgr.log 
/home/postgres/repmgr-5.1.0/repmgr standby follow -f /home/postgres/repmgr.conf --upstream-node-id=%n --log-to-file

节点3:

(1)主节点创建repmgr.conf

touch /home/postgres/repmgr.conf 
vi /home/postgres/repmgr.conf node_id=3
node_name='enmo_6003'
conninfo='host=172.20.10.3 port=6000 user=repmgr dbname=repmgr 
connect_timeout=2'
data_directory='/data/pgdata'
replication_user='repmgr'
replication_type='physical'
repmgr_bindir='/home/postgres/repmgr-5.1.0'
pg_bindir='/opt/postgresql-12.9/bin'
monitoring_history=yes
monitor_interval_secs=5
log_level='debug'
log_file='/home/postgres/repmgr.log'
failover='automatic'
connection_check_type=ping
reconnect_attempts=3
reconnect_interval=10 
promote_command='/home/postgres/repmgr_promote.sh'
follow_command='/home/postgres/repmgr_follow.sh %n'

(2)配置提升为主库需要的 repmgr_promote.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.1 "/bin/sudo /usr/sbin/ip addr del 172.10.20.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 1 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.2 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 2 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote start" >> /home/postgres/repmgr.log
/home/postgres/repmgr-5.1.0/repmgr standby promote -f /home/postgres/repmgr.conf --log-to-file
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP start" >> /home/postgres/repmgr.log 
/bin/sudo /usr/sbin/ifconfig ens33:0 172.20.10.6 netmask 255.255.255.0

(3)配置赋予新主跟随的 repmgr_follow.sh 脚本

#! /bin/bashecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 start" >> /home/postgres/repmgr.log
/bin/ssh -t postgres@172.20.10.3 "/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0"
echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: del VIP in 3 finish" >> /home/postgres/repmgr.logecho "["`date "+%Y-%m-%d %H:%M:%S"`"]: follow %n" >> /home/postgres/repmgr.log 
/home/postgres/repmgr-5.1.0/repmgr standby follow -f /home/postgres/repmgr.conf --upstream-node-id=%n --log-to-file

给两个脚本权限

chmod 755 repmgr_promote.sh 
chmod 755 repmgr_follow.sh 

(4)两个备库分别从主库clone数据,相当于搭建流复制(要加入集群的备机节点执行,确保本地有数据目录,且数据目录下为空)

/home/postgres/repmgr-5.1.0/repmgr -h 172.10.10.1 -p6000 -U repmgr -d repmgr -f /home/postgres/repmgr.conf standby clone 

(5)修改两个备库的 配置参数

修改 postgresql.conf 
vi $PGDATA/postgresql.conf  
cluster_name='enmo_6002' vi $PGDATA/postgresql.conf  
cluster_name='enmo_6003' 

(6)启动及注册

两个备库分别拉起

pg_ctl start -D $PGDATA

两个备库分别执行,进行注册

/home/postgres/repmgr-5.1.0/repmgr -f /home/postgres/repmgr.conf standby register 


8.所有节点启动守护进程repmgrd

/home/postgres/repmgr-5.1.0/repmgrd -f /home/postgres/repmgr.conf -d

repmgrd守护进程主要做对集群节点的监控


9.流复制状态和集群状态检查

主备库分别查看 wal 进程 主库有 walsender 备库有 walreceiver

ps -ef | grep wal

查看集群状态,所有节点都可以通过repmgr查看集群状态

/home/postgres/repmgr-5.1.0/repmgr -f /home/postgres/repmgr.conf cluster showDEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.2 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.3 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"ID | Name        | Role    | Status    | Upstream    | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+-------------+----------+----------+----------+------------------------------------------------------------------------1  | enmo_6001 | primary | * running |             | default  | 100      | 1        | host=172.20.10.1 port=6000 user=repmgr dbname=repmgr connect_timeout=22  | enmo_6002 | standby |   running | enmo_6001 | default  | 100      | 1        | host=172.20.10.2 port=6000 user=repmgr dbname=repmgr connect_timeout=23  | enmo_6003 | standby |   running | enmo_6001 | default  | 100      | 1        | host=172.20.10.3 port=6000 user=repmgr dbname=repmgr connect_timeout=2/home/postgres/repmgr-5.1.0/repmgr -f /home/postgres/repmgr.conf service statusDEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.2 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.3 port=6000 fallback_application_name=repmgr"
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=172.20.10.1 port=6000 fallback_application_name=repmgr"ID | Name        | Role    | Status    | Upstream    | repmgrd | PID   | Paused? | Upstream last seen
----+-------------+---------+-----------+-------------+---------+-------+---------+--------------------1  | enmo_6001 | primary | * running |             | running | 15001 | no      | n/a2  | enmo_6002 | standby |   running | enmo_6001 | running | 15311 | no      | 2 second(s) ago3  | enmo_6003 | standby |   running | enmo_6001 | running | 16068 | no      | 4 second(s) ago

主库查看流复制状态

postgres=# select * from pg_stat_replication;pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         | backend_xmin |   state   |  sent_lsn  | write_lsn  |flush_lsn  | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------+-----------+------------+---------------+------------+-------------------------------21781 |    16385 | repmgr  | enmo_6002      | 172.20.10.2 |                 |       37700 | 2021-11-30 11:26:55.584587+08 |              | streaming | 0/25003338 | 0/25003338 |0/25003338 | 0/25003338 |           |           |            |             1 | sync       | 2021-11-30 14:40:23.214424+0821880 |    16385 | repmgr  | enmo_6003      | 172.20.10.3 |                 |       63746 | 2021-11-30 11:27:53.53565+08  |              | streaming | 0/25003338 | 0/25003338 |0/25003338 | 0/25003338 |           |           |            |             0 | async      | 2021-11-30 14:40:20.465075+08(2 rows)


10.手动给主库挂载vip

/bin/sudo /usr/sbin/ifconfig ens33:0 172.20.10.6 netmask 255.255.255.0

搭建完毕!


附录:相关命令

上vip

/bin/sudo /usr/sbin/ifconfig ens33:0 172.20.10.6 netmask 255.255.255.0

下vip

/bin/sudo /usr/sbin/ip addr del 172.20.10.6/24 dev ens33:0

主库注册

repmgr -f /home/postgres/repmgr.conf primary register

备库注册

repmgr -f /home/postgres/repmgr.conf standby register 

备库从主库clone(相当于全量build)

repmgr -h 172.20.10.1 -p6000 -U repmgr -d repmgr -f /home/postgres/repmgr.conf standby clone 

相当于增量build,如果先加上–dry-run,表示先调试,但不执行

repmgr node rejoin -d 'host=172.20.10.1 user=repmgr dbname=repmgr connect_timeout=2' --force-rewind  --verbose

查看集群状态

repmgr -f /home/postgres/repmgr.conf cluster show
repmgr -f /home/postgres/repmgr.conf service status

启动守护进程

repmgrd -f /home/postgres/repmgr.conf -d

switchover(备库执行)

repmgr -f /home/postgres/repmgr.conf standby switchover -U repmgr  --verbose 

http://www.ppmy.cn/news/172741.html

相关文章

Windows+tensorflow+CUDA搭建GPU加速环境

写在前面 在搭建GPU加速环境时遇到了各种各样的坑,在此记录仅供他人参考和自己回顾。虽然网上有各种各样的教程,但是个人感觉讲的不是很明白,在这里想再理理思路。 坦白说,起初连什么是GPU加速都不太清楚,经过一番查阅…

数据中心白皮书 2022东数西算下数据中心高性能计算的六大趋势八大技术

在数字经济时代下,算力正在逐渐成为新的生产力,广泛应用在社会的方方面面,为各行各业的数字化转型提供动力基础。数据中心是算力的物理承载,是数字化发展的关键基础设施。 目前,我国数据中心产业正从高速发展向高质量…

tensorflow-gpu源码安装

Tensorflow安装版本:Tensorflow-gpu 1.6(其他版本类似) Tensorflow安装方法: 源码编译 系统:ubuntu 16.04.4 安装步骤: 一、确定系统、GPU Compute Capability、cuda、cudnn、gcc、python、Bazel 0.9.0等版本要求,参考文献2&a…

Ubuntu18安装nvidia驱动QuadroP600

Ubuntu18.04 安装nvidia驱动QuadroP600 写在前面:第一次安装的时候还算顺利,按照网上的教程,先在bios里面禁用Security boot,而后添加ppa源,然后直接点击software&Update进行安装,直接就成功了。所以也…

ThinkSystem操作系统安装指导 – RedHat Enterprise linux 7.3(RHEL8可参考)

ThinkSystem操作系统安装指导 – RedHat Enterprise linux 7.3 目的步骤1、介质准备,根据用户环境选择2、RAID驱动准备3、UEFI/BIOS配置4、RAID配置5、操作系统安装6、驱动下载 目的 为了帮助大家使用U盘或者光盘进行手动安装Red Hat Enterprise Linux 7.3。 步骤…

Ubuntu18.04 安装NVIDIA驱动+CUDA10+cuDNN7.6

最正确的方法安装NVIDIA驱动CUDA10cuDNN7.6 话不多说直接开始 平台: Ubuntu18.04 显卡 GTX1080TI**2Quadro P6000*2 如果以前已经装有NVIDIA驱动,先卸载 sudo apt-get remove --purge nvidia*一、禁用nouveau sudo vim /etc/modprobe.d/blacklist.c…

TensorFlow-GPU-CUDA无痛安装教程Windows篇

TensorFlow-GPU-CUDA无痛安装教程Windows篇 前言准备工作英伟达官网显卡支持查看自己的电脑显卡 安装TensorFlow查看自己的TensorFlow版本 安装CUDACUDA的下载和安装 安装cuDNN下载并安装cuDNN 配置CUDA系统变量验证CUDA是否安装成功 测试TensorFlow-gpu是否安装成功测试代码1测…

中国大学MOOC 计算机组成原理第1章测试

1 单选(1分) 随着硬件技术的发展,计算机的电子器件推陈出新,各种类型和用途的计算机也是琳琅满目,但所有种类计算机依然具有“存储程序”的特点,最早提出这种概念的是___ 得分/总分 A. 图灵(Alan Mathison Turing&…