GlusterFS Deep Dive: A Complete Guide from Architecture Principles to Hands-On Cases (Part 2)

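Contents

7. Basic Gluster Commands
8. Client Mounting and Access
9. Routine Inspection
10. In-Depth Tuning
11. Common Faults and Troubleshooting
12. Classic GlusterFS Cases
13. Comparison Chart of Disaster-Recovery Capabilities of GlusterFS Volumes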

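7. Basic Gluster Commands

Managing the glusterd service
After installing GlusterFS, you must start the glusterd service. glusterd acts as the Gluster volume manager: it oversees the glusterfs processes and coordinates dynamic volume operations.

The following uses CentOS 7 as an example:
Start glusterd manually
# systemctl start glusterd
Stop glusterd manually
# systemctl stop glusterd
Enable glusterd at boot (and start it now)
# systemctl enable --now glusterd
Disable glusterd at boot (and stop it now)
# systemctl disable --now glusterd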

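Managing the TSP (Trusted Storage Pool)
A trusted storage pool (TSP) is a trusted network of storage servers. Before you can configure a GlusterFS volume, you must create a trusted storage pool of the storage servers that will provide bricks to the volume, by peer-probing those servers. The servers in a TSP are peers of each other.

Adding a server
To add a server to the TSP, probe it from a server that is already in the pool:
# gluster peer probe <server>
Listing servers
# gluster pool list
Viewing peer status
# gluster peer status
Removing a server
# gluster peer detach <server>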

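Managing volumes
Volume types and how to create them are covered in detail in their own chapter; see Chapter 4, "Volume Types". They are not repeated here.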

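Configuring transport types for a volume

A volume can support one or more transport types for communication between clients and brick processes. There are three supported settings: tcp, rdma, and tcp,rdma (both).

To change the transport types a volume supports, follow these steps:

1) Unmount the volume on all clients:
# umount mount-point
2) Stop the volume:
# gluster volume stop <VOLNAME>
3) Change the transport type. For example, to enable both tcp and rdma, run the following command (use tcp or rdma as the value to enable only one of them):
# gluster volume set test-volume config.transport tcp,rdma
4) Mount the volume on all clients. For example, to mount using the rdma transport:
# mount -t glusterfs -o transport=rdma server1:/test-volume /mnt/glusterfs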

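Expanding a volume
You can expand a volume as needed while the cluster is online and available. For example, you might add bricks to a distributed volume, which widens the distribution and increases the capacity of the GlusterFS volume. Similarly, you might add a group of bricks to a distributed replicated volume to increase its capacity.

Note: When expanding distributed replicated and distributed dispersed volumes, the number of bricks you add must be a multiple of the replica or disperse count. For example, to expand a distributed replicated volume with a replica count of 2, add bricks in multiples of 2 (such as 4, 6, 8, and so on).
1) If the server is not yet part of the TSP, probe the server that holds the brick to be added:
# gluster peer probe server4
2) Add the brick:
# gluster volume add-brick test-volume server4:/exp4
3) Check the volume information:
# gluster volume info test-volume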
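The multiple-of-replica-count rule in the note above can be checked before calling add-brick. The following is a minimal sketch under stated assumptions: brick_count_ok is a hypothetical helper, not part of the gluster CLI, and server5:/exp5 is an invented brick used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not a gluster command): succeeds only when the number
# of bricks given is a positive multiple of the replica/disperse count.
# Usage: brick_count_ok <group-size> <brick>...
brick_count_ok() {
    group_size=$1      # replica or disperse count of the volume
    shift
    brick_count=$#     # the remaining arguments are the bricks to add
    [ "$group_size" -gt 0 ] && [ "$brick_count" -gt 0 ] &&
        [ $((brick_count % group_size)) -eq 0 ]
}

# Two bricks for a replica-2 volume pass the check; a single brick does not.
if brick_count_ok 2 server4:/exp4 server5:/exp5; then
    echo "ok: brick count is a multiple of the replica count"
fi
if ! brick_count_ok 2 server4:/exp4; then
    echo "refused: 1 brick is not a multiple of 2"
fi
```

A wrapper script could run this check first and call gluster volume add-brick only when it succeeds.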

Shrinking a volume
You can shrink a volume as needed while the cluster is online and available. For example, you might need to remove a brick in a distributed volume that has become inaccessible because of a hardware or network failure.
Note: When shrinking distributed replicated and distributed dispersed volumes, the number of bricks you remove must be a multiple of the replica or disperse count. For example, to shrink a distributed replicated volume with a replica count of 2, remove bricks in multiples of 2 (such as 4, 6, 8, and so on).
Running remove-brick with the start option automatically triggers a rebalance that migrates data from the bricks being removed to the rest of the volume.

1) Remove the brick:
# gluster volume remove-brick test-volume server2:/exp2 start
volume remove-brick start: success
2) Check the status of the remove-brick operation:
# gluster volume remove-brick test-volume server2:/exp2 status
Node                                  Rebalanced-files  size  scanned  status
------------------------------------  ----------------  ----  -------  -----------
617c923e-6450-4065-8e33-865e28d9428f                34   340      162  in progress
3) Once the status shows "completed", commit the remove-brick operation:
# gluster volume remove-brick test-volume server2:/exp2 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
4) Check the volume information:
# gluster volume info
Volume Name: test-volume
Type: Distribute
Status: Started
Number of Bricks: 3
Bricks:
Brick1: server1:/exp1
Brick3: server3:/exp3
Brick4: server4:/exp4

Stopping a volume

Stop a volume with the following command:
# gluster volume stop <VOLNAME>
For example, to stop test-volume:
# gluster volume stop test-volume
Stopping volume will make its data inaccessible. Do you want to continue? (y/n)
Enter y to confirm the operation. The command then prints:
Stopping volume test-volume has been successful

Deleting a volume

Delete a volume with the following command:
# gluster volume delete <VOLNAME>
For example, to delete test-volume:
# gluster volume delete test-volume
Deleting volume will erase all information about the volume. Do you want to continue? (y/n)
Enter y to confirm the operation. The command then prints:
Deleting volume test-volume has been successful

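8. Client Mounting and Access

A gluster volume can be accessed in several ways. On GNU/Linux clients, the Gluster Native Client provides high concurrency, high performance, and transparent failover. You can also access gluster volumes over NFS v3.

Gluster Native Client
The Gluster Native Client is a FUSE-based client that runs in user space. It is the recommended way to access volumes when high concurrency and high write performance are required.

1) Installing the Gluster Native Client
Before installing the Gluster Native Client, verify that the FUSE module is loaded on the client and that the required modules are accessible, as follows:
Add the FUSE loadable kernel module (LKM) to the Linux kernel:
# modprobe fuse
Verify that the FUSE module is loaded:
# dmesg | grep -i fuse
fuse init (API version 7.13)
Install the required dependencies on the client:
# yum -y install openssh-server wget fuse fuse-libs libibverbs
Stop the firewall:
# systemctl stop firewalld
Install the client on CentOS 7:
# yum install centos-release-gluster && yum install glusterfs glusterfs-cli glusterfs-libs glusterfs-fuse -y
2) Mounting a volume manually
To mount a volume, run:
# mount -t glusterfs HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR
For example:
# mount -t glusterfs server1:/test-volume /mnt/glusterfs
Mount options
The following options can be given to mount -t glusterfs. Separate multiple options with commas.
backupvolfile-server=server-name
volfile-max-fetch-attempts=number of attempts
log-level=loglevel
log-file=logfile
transport=transport-type
direct-io-mode=[enable|disable]
use-readdirp=[yes|no]
For example:
# mount -t glusterfs -o backupvolfile-server=server2,use-readdirp=no,volfile-max-fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /mnt/glusterfs
Because commas separate options, backupvolfile-server takes a single server name. To name several backup servers, mount.glusterfs also accepts a colon-separated list, for example backup-volfile-servers=server2:server3:server4.
If the backupvolfile-server option is given when mounting the FUSE client, then when server1 fails, the server named in backupvolfile-server is used as the volfile server to mount the client.
The volfile-max-fetch-attempts=X option sets how many times to try fetching the volume file while mounting. It is useful when you mount from a server with multiple IP addresses or when round-robin DNS is configured for the server name.
If use-readdirp is set to yes, the FUSE kernel module is forced to use readdirp mode.
3) Mounting a volume automatically
You can configure the system to mount the gluster volume automatically at every boot.
To do so, edit /etc/fstab and add the following line:
HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR glusterfs defaults,_netdev 0 0
For example:
server1:/test-volume /mnt/glusterfs glusterfs defaults,_netdev 0 0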
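The fstab format and the backupvolfile-server mount option described above can be combined when generating entries for many clients. This is only a sketch: gluster_fstab_line is a hypothetical helper name, and it assumes the option is passed through the fstab options field in the same way as on the mount command line.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab entry for a native-client mount.
# Usage: gluster_fstab_line <server> <volume> <mountdir> [backup-server]
gluster_fstab_line() {
    opts="defaults,_netdev"
    # Append the backup volfile server only when a fourth argument is given.
    if [ -n "${4:-}" ]; then
        opts="$opts,backupvolfile-server=$4"
    fi
    printf '%s:/%s %s glusterfs %s 0 0\n' "$1" "$2" "$3" "$opts"
}

# Print one entry without and one with a backup volfile server.
gluster_fstab_line server1 test-volume /mnt/glusterfs
gluster_fstab_line server1 test-volume /mnt/glusterfs server2
```

The function only prints the line; appending it to /etc/fstab is left to the caller so that the result can be reviewed first.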

Mounting a volume with NFS
Prerequisite: install the NFS client package on both servers and clients. On CentOS/Red Hat based distributions the package is nfs-utils (nfs-common is its name on Debian-based distributions):
$ sudo yum install nfs-utils -y

1) Mounting a volume manually with NFS
To mount a volume, run:
# mount -t nfs -o vers=3 HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR
For example:
# mount -t nfs -o vers=3 server1:/test-volume /mnt/glusterfs
Note: The Gluster NFS server does not support UDP. If your NFS client defaults to connecting over UDP, the following message appears:
requested NFS version or transport protocol is not supported.
2) Connecting over TCP
Add the following option to the mount command:
-o mountproto=tcp
For example:
# mount -o mountproto=tcp -t nfs server1:/test-volume /mnt/glusterfs
3) Mounting a volume automatically with NFS
You can configure the system to mount the Gluster volume over NFS automatically at every boot.
To do so, edit /etc/fstab and add the following line:
HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR nfs defaults,_netdev,vers=3 0 0
For example:
server1:/test-volume /mnt/glusterfs nfs defaults,_netdev,vers=3 0 0
Note: The Gluster NFS server does not support UDP. If your NFS client defaults to connecting over UDP, the following message appears:
requested NFS version or transport protocol is not supported.
4) Connecting over TCP
Add the following entry to /etc/fstab:
HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR nfs defaults,_netdev,mountproto=tcp 0 0
For example:
server1:/test-volume /mnt/glusterfs nfs defaults,_netdev,mountproto=tcp 0 0

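9. Routine Inspection

Monitoring a Gluster cluster through its various metrics helps with capacity planning and performance tuning, and the information lets you identify and troubleshoot problems.

The volume top and volume profile commands show performance data and help identify bottlenecks on each brick of a volume, giving you the key performance information about the system.

Enabling profiling
You must first enable profiling before you can view file-operation information for each brick:

# gluster volume profile test-volume start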

Displaying I/O information

# gluster volume profile test-volume info
Brick: Test:/export/2
Cumulative Stats:

Block Size:       1b+      32b+      64b+
      Read:         0         0         0
     Write:       908        28         8

Block Size:     128b+     256b+     512b+
      Read:         0         6         4
     Write:         5        23        16

Block Size:    1024b+    2048b+    4096b+
      Read:         0        52        17
     Write:        15       120       846

Block Size:    8192b+   16384b+   32768b+
      Read:        52         8        34
     Write:       234       134       286

Block Size:   65536b+  131072b+
      Read:       118       622
     Write:      1341       594

%-latency   Avg-latency   Min-Latency   Max-Latency   calls   Fop
___________________________________________________________
     4.82       1132.28         21.00     800970.00    4575   WRITE
     5.70        156.47          9.00     665085.00   39163   READDIRP
    11.35        315.02          9.00    1433947.00   38698   LOOKUP
    11.88       1729.34         21.00    2569638.00    7382   FXATTROP
    47.35     104235.02       2485.00    7789367.00     488   FSYNC
------------------------------------
Duration     : 335
BytesRead    : 94505058
BytesWritten : 195571980

Stopping profiling
# gluster volume profile <VOLNAME> stop
For example, to stop profiling on test-volume:
# gluster volume profile test-volume stop
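When inspecting many bricks, it can help to pull the most expensive file operation out of the latency table automatically. The sketch below is illustrative only: top_latency_fop is a hypothetical helper, and it assumes rows shaped exactly like the sample above (six whitespace-separated fields ending in the fop name); other GlusterFS versions may print additional unit columns, in which case the field test would need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster volume profile <VOLNAME> info` output on
# stdin and print the fop with the highest %-latency, followed by that value.
# Assumed row layout: %-latency  avg  min  max  calls  FOP
top_latency_fop() {
    awk '
        BEGIN { max = -1 }
        # Keep only data rows: six fields, numeric %-latency, integer calls.
        NF == 6 && $1 ~ /^[0-9]+(\.[0-9]+)?$/ && $5 ~ /^[0-9]+$/ {
            if ($1 + 0 > max) { max = $1 + 0; fop = $6 }
        }
        END { if (fop != "") printf "%s %.2f\n", fop, max }
    '
}

# Demo on the latency rows from the sample output above.
top_latency_fop <<'EOF'
     4.82       1132.28         21.00     800970.00    4575   WRITE
     5.70        156.47          9.00     665085.00   39163   READDIRP
    11.35        315.02          9.00    1433947.00   38698   LOOKUP
    11.88       1729.34         21.00    2569638.00    7382   FXATTROP
    47.35     104235.02       2485.00    7789367.00     488   FSYNC
EOF
```

For the sample data this reports FSYNC, which accounts for 47.35% of the latency. On a live system the function would be fed from the profile command through a pipe.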

Viewing the open fd count and maximum fd count
For each brick you can view the current open fd count (the list of files currently opened most often, with their counts) and the maximum open fd count (the number of files currently open and the highest number of files open at any one time). If no brick is specified, the open fd metrics of all bricks of the volume are shown.
Use the following command to view the open fd count and maximum fd count:

# gluster volume top VOLUME-NAME open [brick BRICK] [list-cnt {0..100}]

For example, to view the open fd count and maximum fd count on brick server1:/export/dir1 of test-volume and list the top 10 open calls:

# gluster volume top test-volume open brick server1:/export/dir1 list-cnt 10
Brick: server:/export/dir1
Current open fd's: 34 Max open fd's: 209
==========Open file stats========
open            file name
call count
2               /clients/client0/~dmtmp/PARADOX/COURSES.DB
11              /clients/client0/~dmtmp/PARADOX/ENROLL.DB
11              /clients/client0/~dmtmp/PARADOX/STUDENTS.DB
10              /clients/client0/~dmtmp/PWRPNT/TIPS.PPT
10              /clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT
9               /clients/client7/~dmtmp/PARADOX/STUDENTS.DB
9               /clients/client1/~dmtmp/PARADOX/STUDENTS.DB
9               /clients/client2/~dmtmp/PARADOX/STUDENTS.DB
9               /clients/client0/~dmtmp/PARADOX/STUDENTS.DB
9               /clients/client8/~dmtmp/PARADOX/STUDENTS.DB

Viewing the read-performance list for each brick

You can view the read throughput of files on each brick. If no brick name is specified, the metrics of all bricks belonging to the volume are shown. The output is the read throughput.
Use the following command to view the read-performance list for each brick:

# gluster volume top <VOLNAME> {open|read|write|opendir|readdir|clear} [nfs|brick <brick>] [list-cnt <value>] | {read-perf|write-perf} [bs <size> count <count>] [brick <brick>] [list-cnt <value>]
For example, to view read performance on brick master:/bricks/brick1/gv0 of test-volume with a block size of 256, a count of 1, and a list count of 10:
# gluster volume top test-volume read-perf bs 256 count 1 brick master:/bricks/brick1/gv0 list-cnt 10
Brick: master:/bricks/brick1/gv0
Throughput 64.00 MBps time 0.0000 secs
MBps Filename       Time
==== ========       ====
   0 /hello.8669    2022-06-01 02:43:24 +0000.1593
   0 /hello.5004    2022-06-01 02:43:15 +0000.1479
   0 /hello.1326    2022-06-01 02:43:06 +0000.4425
   0 /hello.1325    2022-06-01 02:43:06 +0000.3147
   0 /hello.1324    2022-06-01 02:43:06 +0000.1737
   0 /hello.1323    2022-06-01 02:43:06 +0000.98
   0 /hello.1322    2022-06-01 02:43:05 +0000.998853
   0 /hello.1321    2022-06-01 02:43:05 +0000.997531
   0 /hello.1320    2022-06-01 02:43:05 +0000.996195
   0 /hello.132     2022-06-01 02:43:05 +0000.994810

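Viewing the write-performance list for each brick

You can view the write throughput of files on each brick. If no brick name is specified, the metrics of all bricks belonging to the volume are shown. The output is the write throughput.
This command starts a dd run with the specified count and block size and measures the resulting throughput.
Use the following command to view the write-performance list for each brick:

# gluster volume top <VOLNAME> {open|read|write|opendir|readdir|clear} [nfs|brick <brick>] [list-cnt <value>] | {read-perf|write-perf} [bs <size> count <count>] [brick <brick>] [list-cnt <value>]
For example, to view write performance on brick server:/export/ of test-volume with a block size of 256, a count of 1, and a list count of 10:
# gluster volume top test-volume write-perf bs 256 count 1 brick server:/export/ list-cnt 10
Brick: server:/export/dir1
256 bytes (256 B) copied, Throughput: 2.8 MB/s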

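Displaying volume information

You can display information about a specific volume or about all volumes as needed.
To display information about a specific volume, for example test-volume:

# gluster volume info test-volume

To display information about all volumes:

# gluster volume info all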

Displaying volume status

1) Display status information about a specific volume:

# gluster volume status [all|<VOLNAME> [<BRICKNAME>]] [detail|clients|mem|inode|fd|callpool]
For example, to display information about test-volume:
# gluster volume status test-volume
STATUS OF VOLUME: test-volume
BRICK                           PORT   ONLINE   PID
--------------------------------------------------------
arch:/export/1                  24009   Y       22445
--------------------------------------------------------
arch:/export/2                  24010   Y       22450

2) Display status information about all volumes:

# gluster volume status all
STATUS OF VOLUME: volume-test
BRICK                           PORT   ONLINE   PID
--------------------------------------------------------
arch:/export/4                  24010   Y       22455

STATUS OF VOLUME: test-volume
BRICK                           PORT   ONLINE   PID
--------------------------------------------------------
arch:/export/1                  24009   Y       22445
--------------------------------------------------------
arch:/export/2                  24010   Y       22450

3) Display additional information about a volume:

# gluster volume status test-volume detail
STATUS OF VOLUME: test-volume
-------------------------------------------
Brick                : arch:/export/1
Port                 : 24009
Online               : Y
Pid                  : 16977
File System          : rootfs
Device               : rootfs
Mount Options        : rw
Disk Space Free      : 13.8GB
Total Disk Space     : 46.5GB
Inode Size           : N/A
Inode Count          : N/A
Free Inodes          : N/A

Number of Bricks: 1
Bricks:
Brick: server:/brick6

4) Display the list of clients accessing a volume:

# gluster volume status <VOLNAME> clients
For example, to display the list of clients connected to test-volume:
# gluster volume status test-volume clients
Brick : arch:/export/1
Clients connected : 2
Hostname          Bytes Read   BytesWritten
--------          ---------    ------------
127.0.0.1:1013    776          676
127.0.0.1:1012    50440        51200

5) Display the memory usage and memory-pool details of a volume:

# gluster volume status <VOLNAME> mem
For example, to display the memory usage and memory-pool details of the bricks of test-volume:
# gluster volume status test-volume mem
Memory status for volume : test-volume
----------------------------------------------
Brick : arch:/export/1
Mallinfo
--------
Arena    : 434176
Ordblks  : 2
Smblks   : 0
Hblks    : 12
Hblkhd   : 40861696
Usmblks  : 0
Fsmblks  : 0
Uordblks : 332416
Fordblks : 101760
Keepcost : 100400

Mempool Stats
-------------
Name                                 HotCount ColdCount PaddedSizeof AllocCount MaxAlloc
----                                 -------- --------- ------------ ---------- --------
test-volume-server:fd_t                     0     16384           92         57        5
test-volume-server:dentry_t                59       965           84         59       59
test-volume-server:inode_t                 60       964          148         60       60
test-volume-server:rpcsvc_request_t         0       525         6372        351        2
glusterfs:struct saved_frame                0      4096          124          2        2
glusterfs:struct rpc_req                    0      4096         2236          2        2
glusterfs:rpcsvc_request_t                  1       524         6372          2        1
glusterfs:call_stub_t                       0      1024         1220        288        1
glusterfs:call_stack_t                      0      8192         2084        290        2
glusterfs:call_frame_t                      0     16384          172       1728

6) Display the inode tables of a volume:

# gluster volume status <VOLNAME> inode
For example, to display the inode tables of test-volume:
# gluster volume status test-volume inode
inode tables for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Active inodes:
GFID                                            Lookups            Ref   IA type
----                                            -------            ---   -------
6f3fe173-e07a-4209-abb6-484091d75499                  1              9         2
370d35d7-657e-44dc-bac4-d6dd800ec3d3                  1              1         2

LRU inodes:
GFID                                            Lookups            Ref   IA type
----                                            -------            ---   -------
80f98abe-cdcf-4c1d-b917-ae564cf55763                  1              0         1
3a58973d-d549-4ea6-9977-9aa218f233de                  1              0         1
2ce0197d-87a9-451b-9094-9baa38121155                  1              0         2

7) Display the open fd tables of a volume:

# gluster volume status <VOLNAME> fd
For example, to display the open fd tables of test-volume:
# gluster volume status test-volume fd
FD tables for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Connection 1:
RefCount = 0  MaxFDs = 128  FirstFree = 4
FD Entry            PID                 RefCount            Flags
--------            ---                 --------            -----
0                   26311               1                   2
1                   26310               3                   2
2                   26310               1                   2
3                   26311               3                   2

Connection 2:
RefCount = 0  MaxFDs = 128  FirstFree = 0
No open fds

Connection 3:
RefCount = 0  MaxFDs = 128  FirstFree = 0
No open fds

8) Display the pending calls of a volume:
# gluster volume status <VOLNAME> callpool
Each call has a call stack containing call frames. For example, to display the pending calls of test-volume:
# gluster volume status test-volume callpool
Pending calls for volume test-volume
----------------------------------------------
Brick : arch:/export/1
Pending calls: 2
Call Stack1
 UID    : 0
 GID    : 0
 PID    : 26338
 Unique : 192138
 Frames : 7
 Frame 1
  Ref Count   = 1
  Translator  = test-volume-server
  Completed   = No
 Frame 2
  Ref Count   = 0
  Translator  = test-volume-posix
  Completed   = No
  Parent      = test-volume-access-control
  Wind From   = default_fsync
  Wind To     = FIRST_CHILD(this)->fops->fsync
 Frame 3
  Ref Count   = 1
  Translator  = test-volume-access-control
  Completed   = No
  Parent      = repl-locks
  Wind From   = default_fsync
  Wind To     = FIRST_CHILD(this)->fops->fsync
 Frame 4
  Ref Count   = 1
  Translator  = test-volume-locks
  Completed   = No
  Parent      = test-volume-io-threads
  Wind From   = iot_fsync_wrapper
  Wind To     = FIRST_CHILD (this)->fops->fsync
 Frame 5
  Ref Count   = 1
  Translator  = test-volume-io-threads
  Completed   = No
  Parent      = test-volume-marker
  Wind From   = default_fsync
  Wind To     = FIRST_CHILD(this)->fops->fsync
 Frame 6
  Ref Count   = 1
  Translator  = test-volume-marker
  Completed   = No
  Parent      = /export/1
  Wind From   = io_stats_fsync
  Wind To     = FIRST_CHILD(this)->fops->fsync
 Frame 7
  Ref Count   = 1
  Translator  = /export/1
  Completed   = No
  Parent      = test-volume-server
  Wind From   = server_fsync_resume
  Wind To     = bound_xl->fops->fsync
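For a routine inspection, the four-column brick table printed by gluster volume status (items 1 and 2 above) can be scanned for bricks that are not online. The following sketch is hypothetical: offline_bricks is an invented helper, it assumes the BRICK / PORT / ONLINE / PID layout shown in this article (other GlusterFS versions print more columns), and the second brick row in the demo is made up to show an offline brick.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster volume status` output on stdin and print
# every brick whose ONLINE column is not "Y".
# Assumed row layout: BRICK  PORT  ONLINE  PID  (four fields per brick row).
offline_bricks() {
    # A brick row has four fields and a host:/path value in the first one.
    awk 'NF == 4 && $1 ~ /:\// && $3 != "Y" { print $1 }'
}

# Demo: the arch:/export/2 row is invented sample data for an offline brick.
offline_bricks <<'EOF'
STATUS OF VOLUME: test-volume
BRICK                           PORT   ONLINE   PID
--------------------------------------------------------
arch:/export/1                  24009   Y       22445
--------------------------------------------------------
arch:/export/2                  N/A     N       N/A
EOF
```

An empty result means every brick in the table reported Y in the ONLINE column.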

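10. In-Depth Tuning

Enabling metadata caching
Metadata caching improves performance for almost all workloads, except for use cases where most of the workload accesses the same files simultaneously from multiple clients. Run the following command to enable metadata caching and cache invalidation:
# gluster volume set <volname> group metadata-cache
This group command enables caching of stat and xattr information for files and directories. The cache is refreshed every 10 minutes, and cache invalidation is enabled to keep the cache consistent.
A. To increase the number of files that can be cached, run:
# gluster volume set <volname> network.inode-lru-limit <n>
n is set to 50000. It can be increased if the number of active files in the volume is very high. Increasing this number increases the memory footprint of the brick processes.
B. To enable Samba-specific metadata caching, run:
# gluster volume set <volname> cache-samba-metadata on
C. By default, gluster caches certain xattrs, for example capability xattrs, ima xattrs, and ACLs. If applications that use the Gluster storage rely on any other xattrs, run the following command to add them to the metadata cache list:

# gluster volume set <volname> xattr-cache-list "comma separated xattr list"
For example:
# gluster volume set <volname> xattr-cache-list "user.org.netatalk.*,user.swift.metadata"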

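Directory operations

In addition to enabling metadata caching, the following options can be set to improve the performance of directory operations:

Directory listing performance: enable parallel-readdir
# gluster volume set <VOLNAME> performance.readdir-ahead on
# gluster volume set <VOLNAME> performance.parallel-readdir on
File/directory creation performance: enable nl-cache
# gluster volume set <volname> group nl-cache
# gluster volume set <volname> nl-cache-positive-entry on
The commands above also enable cache invalidation and increase the timeout to 10 minutes.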

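Small-file read operations

For use cases that mainly read small files, enable the following options:
# gluster volume set <volname> performance.cache-invalidation on
# gluster volume set <volname> features.cache-invalidation on
# gluster volume set <volname> performance.qr-cache-timeout 600 --> 10 min recommended setting
# gluster volume set <volname> cache-invalidation-timeout 600 --> 10 min recommended setting
These commands enable caching of small-file contents in the client cache. Enabling cache invalidation keeps the cache consistent.
The total cache size can be set with:
# gluster volume set <volname> cache-size <size>
By default, files of size <= 64KB are cached. To change this value:
# gluster volume set <volname> performance.cache-max-file-size <size>
Note that the size arguments take SI unit suffixes, for example 64KB or 2MB.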

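write-behind
Write Behind Translator
gluster volume set tank write-behind on

Write operations are generally slower than reads. The write-behind translator improves write performance considerably by using "aggregated background write": many small writes are gathered into fewer, larger writes, which are then written in the background (non-blocking). On the client side, write-behind aggregates writes and so reduces the number of network packets that must be sent. On the server side, it helps the server optimize disk seek time for writes.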

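read-ahead

Read Ahead Translator

gluster volume set tank read-ahead on

Based on preset values, read-ahead prefetches a number of blocks sequentially. While your application is busy processing some data, GlusterFS can already read the next batch of data waiting to be processed, which makes reads smoother and faster. It also works as a read aggregator: many small, scattered reads are combined into fewer, larger reads, which reduces the load on the network and the disks. page-size sets the size of a block, and page-count sets the total number of blocks to read ahead.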

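io-cache
gluster volume set tank io-cache on
The IO-cache translator (performance/io-cache) is one of the performance translators. It caches data that has already been read in order to improve I/O performance.
This is useful when several applications access the same data repeatedly and reads far outnumber writes. For example, IO-cache suits an environment that serves web content, where a large number of clients only read files and very few ever write them.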

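quick-read
gluster volume set tank quick-read on
According to its description, this option is only useful with FUSE, and it has no effect on files larger than the default limit of 64 KB.

This translator improves read performance for small files.

Operating on a file system over the network is expensive, so quick-read uses GlusterFS's internal get interface to perform several POSIX system calls (open/read/close) in one go: a single get call covers one open call, multiple read calls, and one close call.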
　　
open-behind
gluster volume set tank open-behind on
Perform open in the backend only when a necessary FOP arrives (e.g writev on the FD, unlink of the file). When option is disabled, perform backend open right after unwinding open().

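io-threads
gluster volume set tank io-thread-count 16
The IO-threads translator (performance/io-threads) is one of the performance translators. It increases the number of concurrent I/O threads in order to improve I/O performance.
It tries to increase the capacity of the server's background processes to handle read and write I/O on file metadata. Because the GlusterFS server is single-threaded, using the IO-threads translator can improve performance considerably. The translator is best used on the server side, loaded after the server protocol translator.
IO-threads hands read and write operations to separate threads. The total number of threads that exist at any one time is constant and configurable.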

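11. Common Faults and Troubleshooting

1. Error: "Another transaction is in progress for volname" or "Locking failed on xxx.xxx.xxx.xxx"

Because Gluster is distributed by nature, glusterd takes locks when performing operations to ensure that configuration changes to a volume are atomic across the cluster. The error occurs in the following situations:
1) Multiple transactions are contending for the same lock
Solution: These are likely transient errors; the operation will succeed if it is retried after the other transaction has completed.
2) A stale lock exists on one of the nodes
Solution: Repeating the operation will not help until the stale lock is cleaned up. Restart the glusterd process that holds the lock:
a. Check the glusterd.log file to find out which node holds the stale lock. Look for the message: lock being held by
b. Run gluster peer status to identify the node with the UUID in the log message
c. Restart glusterd on that node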
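Step a above can be scripted when the log is large. This is a sketch only: stale_lock_holders is a hypothetical helper, and it assumes that the UUID of the lock holder follows the phrase "lock being held by" on the same log line. The log lines in the demo are invented and much shorter than real glusterd log entries.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct UUIDs that appear right after the
# phrase "lock being held by" in the given glusterd log file.
stale_lock_holders() {
    grep -oE 'lock being held by [0-9a-fA-F-]{36}' "$1" |
        awk '{ print $NF }' | sort -u
}

# Demo on a made-up log excerpt written to a temporary file.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
E [mgmt] Unable to get lock, lock being held by 617c923e-6450-4065-8e33-865e28d9428f
I [mgmt] some unrelated message
E [mgmt] Unable to get lock, lock being held by 617c923e-6450-4065-8e33-865e28d9428f
EOF
stale_lock_holders "$demo_log"
rm -f "$demo_log"
```

Each UUID printed can then be matched against the output of gluster peer status, as in step b.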

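2. Error: "Transport endpoint is not connected" although all bricks are up
This usually happens when a brick process did not shut down cleanly and left stale data behind in the glusterd process. Gluster client processes query glusterd for the port a brick is listening on and then try to connect to that port. If the port information in glusterd is incorrect, the client cannot connect to the brick even though the brick is up, which produces the error above.
Solution: Restart the glusterd service.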

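3. Error: "Peer Rejected"
The gluster peer status command returns "Peer Rejected".
This indicates that the volume configuration on the node is out of sync with the rest of the trusted storage pool. You should see the following message in the glusterd log on the node where the peer status command was run:
Version of Cksums differ. local cksum = xxxxxx, remote cksum = xxxxyx on peer

Solution: Update cluster.op-version.

Run gluster volume get all cluster.max-op-version to get the latest supported op-version, then update cluster.op-version to that value:
# gluster volume set all cluster.op-version <op-version>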

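The following are common errors when mounting with an NFS client.

4. Error: "RPC Error: Program not registered"
This error occurs when the portmap or rpcbind service has not started properly.
Solution:

# /etc/init.d/portmap start
or
# /etc/init.d/rpcbind start

After starting portmap or rpcbind, the gluster NFS server must be restarted.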

5. Error: the mount command reports an rpc.statd-related error
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
For an NFS client to mount from the NFS server, the rpc.statd service must be running on the client. Start it with the following command:
# rpc.statd

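12. Classic GlusterFS Cases

(1) Expansion and shrinking in practice

Use cases
Normally a GlusterFS volume tries to distribute data evenly across its DHT subvolumes. In some situations, however, such as a cluster expansion, the file distribution becomes unbalanced. Left unchecked, the old nodes may become overloaded and suffer performance problems, which ultimately affects data reliability and availability. In that case you need to intervene manually so that data is spread as evenly as possible across the nodes.
Data rebalancing has two main use cases:
A. Adding or removing nodes
B. Renaming files
Expansion
Expansion and shrinking are done in units of nodes or subvolumes, which changes the number of DHT subvolumes. The directory hash range of each subvolume therefore changes and is recalculated and reassigned. Some files' hash values now fall within another subvolume's range, and those files should be migrated to the correct subvolume. After an expansion you must run the gluster volume rebalance command manually to trigger rebalancing.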
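To see why adding a subvolume moves files, it helps to compute the hash ranges before and after. The sketch below is a simplification and not the DHT implementation: dht_ranges is a hypothetical helper that assumes a 32-bit hash space split into equal contiguous slices, one per subvolume, and ignores any weighting by brick size that a real layout may apply.

```shell
#!/bin/sh
# Hypothetical illustration: split an assumed 32-bit hash space evenly across
# N subvolumes and print each subvolume's range.
# Usage: dht_ranges <number-of-subvolumes>
dht_ranges() {
    n=$1
    total=4294967296               # 2^32 hash values
    chunk=$((total / n))           # slice size per subvolume
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$((i * chunk))
        if [ "$i" -eq $((n - 1)) ]; then
            end=4294967295         # the last slice absorbs the remainder
        else
            end=$((start + chunk - 1))
        fi
        printf 'subvol-%d: 0x%08x - 0x%08x\n' "$i" "$start" "$end"
        i=$((i + 1))
    done
}

echo "2 subvolumes:"
dht_ranges 2
echo "3 subvolumes:"
dht_ranges 3
```

With two subvolumes the boundary sits at 0x80000000; with three, the first slice ends at 0x55555554. A file whose hash is, say, 0x60000000 therefore belongs to subvol-0 before the expansion and to subvol-1 afterwards, which is why existing files need to be migrated.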

[Figure: Change in the directory hash-space distribution of the subvolumes after expansion]

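Shrinking
When shrinking GlusterFS subvolumes you do not need to run the command manually, because shrinking triggers rebalancing automatically. If it did not, the data on the removed nodes or subvolumes would no longer be available and would effectively be lost, which is unacceptable to users. Rebalancing is therefore indispensable during shrinking, and it is natural for the implementation to trigger it automatically.

[Figure: Change in the directory hash-space distribution of the subvolumes after shrinking]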

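Renaming
In GlusterFS, renaming a file changes its hash value, because the hash is computed from the file name. After a rename the system does not rebalance automatically. Instead it creates a link file on the target subvolume, and the extended attributes of the link file record the actual storage location of the file. When a client then accesses the renamed file, the request first goes to the subvolume given by the hash calculation, where the link file is found. The DHT module understands link files, derives the actual location of the file from the link file, and then fetches the file from the subvolume that really holds it.

Without rebalancing after a rename, client applications therefore take an extra step on every access, which adds latency. When the system holds a large number of link files, access performance drops significantly and applications are affected. Rebalancing migrates the files to their correct locations and removes the latency caused by link files, so rebalancing is also necessary after renaming files.

[Figure: Change in file location caused by renaming a file]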

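Rebalancing workflow
In the GlusterFS rebalancing implementation, each node runs a single process with multiple threads. The main thread traverses the directories of the GlusterFS volume on the local node, starting from the root directory, and fixes their hash layout. At the same time it crawls all files under each directory, places the files selected by the algorithm on the migration queue, and notifies the waiting worker threads to migrate them.

The worker threads check whether the migration queue holds files waiting to be migrated. If the queue is not empty, a worker migrates the files in it, one file at a time. If the queue is empty, the worker sleeps until the main thread wakes it up.

[Figure: Data migration workflow]
In the current mechanism, multiple nodes take part in rebalancing at the same time. Each node runs one multi-threaded process that scans and migrates files concurrently, with scanning and migration handled by different threads, and each node can migrate files in parallel. Compared with the earlier rebalancing workflow, this greatly increases parallelism and scales better, so the system load during rebalancing is spread more evenly across the cluster and efficiency is higher.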

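Rebalancing recommendations
When a cluster needs rebalancing, consider the following:
(1) Plan ahead as far as possible. For example, do not wait until the cluster is almost out of storage space before expanding it. Doing so leaves little time for deployment and preparation, which invites mistakes, and it also makes file migration between the old nodes more likely to fail. It is best to keep some free space in reserve.
(2) Make sure all nodes in the cluster are healthy, the volume is started, the glusterd service and the brick processes are running normally, and the nodes can communicate with each other.
(3) Check whether any files in the GlusterFS volume are damaged, and repair them first if so.
(4) Make sure no self-heal operation is running in the cluster while rebalancing, otherwise data correctness and migration efficiency are affected.
(5) If possible, stop the client applications before rebalancing to improve rebalancing efficiency.
(6) Run fix-layout first and then the data migration, which can improve migration efficiency to some extent.
(7) Choose the migration mode according to your needs. The default is normal mode; aggressive mode may use more system resources and therefore affect storage performance.
(8) During rebalancing, check the current status regularly from the command line so that problems are found early and adjustments can be made.
(9) In large clusters, rebalancing may occasionally fail on a node. Restarting the rebalance is usually enough.
(10) If data migration affects applications too much, you can run fix-layout only. This fixes the hash layout of the directories without migrating any files, so new files can be stored on the added nodes (or bricks). The data migration can then be run later at a suitable time, when the system is relatively idle.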
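Recommendations (6) and (8) can be combined into a small wrapper that fixes the layout first, waits, and then migrates data. The sketch below is hypothetical: staged_rebalance and wait_rebalance are invented names, the GLUSTER and POLL_SECONDS variables are conveniences of this example, and the loop assumes that the status output contains the text "in progress" while a rebalance is still running. Setting GLUSTER=echo gives a dry run that only prints the commands.

```shell
#!/bin/sh
# Command used to reach the cluster; set GLUSTER=echo for a dry run.
GLUSTER=${GLUSTER:-gluster}

# Hypothetical helper: block until `rebalance status` no longer reports
# "in progress" for the given volume (assumed status wording).
wait_rebalance() {
    while $GLUSTER volume rebalance "$1" status | grep -q 'in progress'; do
        sleep "${POLL_SECONDS:-10}"
    done
}

# Hypothetical wrapper: fix the directory layout first, then migrate data,
# and finally print the status table.
staged_rebalance() {
    $GLUSTER volume rebalance "$1" fix-layout start || return 1
    wait_rebalance "$1"
    $GLUSTER volume rebalance "$1" start || return 1
    wait_rebalance "$1"
    $GLUSTER volume rebalance "$1" status
}
```

Running GLUSTER=echo staged_rebalance test1 prints the three gluster subcommands in order without touching a cluster, which is a cheap way to review the sequence before running it for real.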
Expansion and shrinking walkthrough

(1). Expansion example

1. Create a distributed volume and start it

[root@master ~]# gluster volume create test1 master:/bricks/brick1/test1 node01:/bricks/brick1/test1
volume create: test1: success: please start the volume to access data
[root@master ~]# gluster volume start test1
volume start: test1: success

2. View the volume information

[root@master ~]# gluster volume info test1
Volume Name: test1
Type: Distribute
Volume ID: 498151f3-5b8c-4f51-bc9f-104ffa60a0ed
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: master:/bricks/brick1/test1
Brick2: node01:/bricks/brick1/test1
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

3. Mount the volume locally and write some files to it

[root@master ~]# mount -t glusterfs node01:/test1 /mnt/test
[root@master ~]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/test/copy-test-$i; done
master node
[root@master ~]# ls /bricks/brick1/test1/
copy-test-001  copy-test-019  copy-test-032  copy-test-052  copy-test-079  copy-test-094
copy-test-004  copy-test-021  copy-test-033  copy-test-054  copy-test-081  copy-test-095
copy-test-006  copy-test-022  copy-test-034  copy-test-057  copy-test-082  copy-test-098
copy-test-008  copy-test-023  copy-test-038  copy-test-060  copy-test-083  copy-test-099
copy-test-011  copy-test-024  copy-test-039  copy-test-063  copy-test-086  copy-test-100
copy-test-012  copy-test-028  copy-test-041  copy-test-065  copy-test-087
copy-test-015  copy-test-029  copy-test-046  copy-test-073  copy-test-088
copy-test-016  copy-test-030  copy-test-048  copy-test-077  copy-test-090
copy-test-017  copy-test-031  copy-test-051  copy-test-078  copy-test-093
node01 node
[root@node01 ~]# ls /bricks/brick1/test1/
copy-test-002  copy-test-020  copy-test-043  copy-test-058  copy-test-070  copy-test-089
copy-test-003  copy-test-025  copy-test-044  copy-test-059  copy-test-071  copy-test-091
copy-test-005  copy-test-026  copy-test-045  copy-test-061  copy-test-072  copy-test-092
copy-test-007  copy-test-027  copy-test-047  copy-test-062  copy-test-074  copy-test-096
copy-test-009  copy-test-035  copy-test-049  copy-test-064  copy-test-075  copy-test-097
copy-test-010  copy-test-036  copy-test-050  copy-test-066  copy-test-076
copy-test-013  copy-test-037  copy-test-053  copy-test-067  copy-test-080
copy-test-014  copy-test-040  copy-test-055  copy-test-068  copy-test-084
copy-test-018  copy-test-042  copy-test-056  copy-test-069  copy-test-085
As shown, the data is distributed across the master and node01 nodes.

4. Expand the volume with a new node

[root@master ~]# gluster volume add-brick test1 node02:/bricks/brick1/test1
volume add-brick: success
[root@master ~]# gluster volume info test1
Volume Name: test1
Type: Distribute
Volume ID: 498151f3-5b8c-4f51-bc9f-104ffa60a0ed
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: master:/bricks/brick1/test1
Brick2: node01:/bricks/brick1/test1
Brick3: node02:/bricks/brick1/test1
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

5. Write some more files to it

[root@master ~]# for i in `seq -w 101 150`; do cp -rp /var/log/messages /mnt/test/copy-test-$i; done
master node
[root@master ~]# ls /bricks/brick1/test1/
copy-test-001  copy-test-028  copy-test-054  copy-test-088  copy-test-108  copy-test-135
copy-test-004  copy-test-029  copy-test-057  copy-test-090  copy-test-110  copy-test-136
copy-test-006  copy-test-030  copy-test-060  copy-test-093  copy-test-111  copy-test-138
copy-test-008  copy-test-031  copy-test-063  copy-test-094  copy-test-115  copy-test-142
copy-test-011  copy-test-032  copy-test-065  copy-test-095  copy-test-121  copy-test-143
copy-test-012  copy-test-033  copy-test-073  copy-test-098  copy-test-123  copy-test-145
copy-test-015  copy-test-034  copy-test-077  copy-test-099  copy-test-124  copy-test-147
copy-test-016  copy-test-038  copy-test-078  copy-test-100  copy-test-125  copy-test-148
copy-test-017  copy-test-039  copy-test-079  copy-test-101  copy-test-128  copy-test-150
copy-test-019  copy-test-041  copy-test-081  copy-test-103  copy-test-129
copy-test-021  copy-test-046  copy-test-082  copy-test-104  copy-test-131
copy-test-022  copy-test-048  copy-test-083  copy-test-105  copy-test-132
copy-test-023  copy-test-051  copy-test-086  copy-test-106  copy-test-133
copy-test-024  copy-test-052  copy-test-087  copy-test-107  copy-test-134
node01 node
[root@node01 ~]# ls /bricks/brick1/test1/
copy-test-002  copy-test-027  copy-test-053  copy-test-070  copy-test-096  copy-test-122
copy-test-003  copy-test-035  copy-test-055  copy-test-071  copy-test-097  copy-test-126
copy-test-005  copy-test-036  copy-test-056  copy-test-072  copy-test-102  copy-test-127
copy-test-007  copy-test-037  copy-test-058  copy-test-074  copy-test-109  copy-test-130
copy-test-009  copy-test-040  copy-test-059  copy-test-075  copy-test-112  copy-test-137
copy-test-010  copy-test-042  copy-test-061  copy-test-076  copy-test-113  copy-test-139
copy-test-013  copy-test-043  copy-test-062  copy-test-080  copy-test-114  copy-test-140
copy-test-014  copy-test-044  copy-test-064  copy-test-084  copy-test-116  copy-test-141
copy-test-018  copy-test-045  copy-test-066  copy-test-085  copy-test-117  copy-test-144
copy-test-020  copy-test-047  copy-test-067  copy-test-089  copy-test-118  copy-test-146
copy-test-025  copy-test-049  copy-test-068  copy-test-091  copy-test-119  copy-test-149
copy-test-026  copy-test-050  copy-test-069  copy-test-092  copy-test-120
node02 node
[root@node02 ~]# ls /bricks/brick1/test1/
The listing on node02 is empty: the data still lands only on the master and node01 nodes.

6. Rebalance the hash layout

[root@node02 ~]# gluster volume rebalance test1 fix-layout start
volume rebalance: test1: success: Rebalance on test1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: c77d8f6c-d932-4da2-be38-441fdaeecb11
[root@node02 ~]# gluster volume rebalance test1 status
     Node                status  run time in h:m:s
---------           -----------       ------------
   master  fix-layout completed              0:0:0
   node01  fix-layout completed              0:0:0
localhost  fix-layout completed              0:0:0
volume rebalance: test1: success
Then write some more files (copy-test-151 through copy-test-200 in the listings below).
master node
[root@master ~]# ls /bricks/brick1/test1/
copy-test-001  copy-test-031  copy-test-077  copy-test-103  copy-test-133  copy-test-169
copy-test-004  copy-test-032  copy-test-078  copy-test-104  copy-test-134  copy-test-171
copy-test-006  copy-test-033  copy-test-079  copy-test-105  copy-test-135  copy-test-172
copy-test-008  copy-test-034  copy-test-081  copy-test-106  copy-test-136  copy-test-179
copy-test-011  copy-test-038  copy-test-082  copy-test-107  copy-test-138  copy-test-180
copy-test-012  copy-test-039  copy-test-083  copy-test-108  copy-test-142  copy-test-190
copy-test-015  copy-test-041  copy-test-086  copy-test-110  copy-test-143  copy-test-191
copy-test-016  copy-test-046  copy-test-087  copy-test-111  copy-test-145  copy-test-192
copy-test-017  copy-test-048  copy-test-088  copy-test-115  copy-test-147  copy-test-196
copy-test-019  copy-test-051  copy-test-090  copy-test-121  copy-test-148  copy-test-197
copy-test-021  copy-test-052  copy-test-093  copy-test-123  copy-test-150  copy-test-198
copy-test-022  copy-test-054  copy-test-094  copy-test-124  copy-test-151  copy-test-199
copy-test-023  copy-test-057  copy-test-095  copy-test-125  copy-test-155  copy-test-200
copy-test-024  copy-test-060  copy-test-098  copy-test-128  copy-test-160
copy-test-028  copy-test-063  copy-test-099  copy-test-129  copy-test-161
copy-test-029  copy-test-065  copy-test-100  copy-test-131  copy-test-166
copy-test-030  copy-test-073  copy-test-101  copy-test-132  copy-test-167
node01 node
[root@node01 ~]# ls /bricks/brick1/test1/
copy-test-002  copy-test-036  copy-test-059  copy-test-080  copy-test-117  copy-test-149
copy-test-003  copy-test-037  copy-test-061  copy-test-084  copy-test-118  copy-test-152
copy-test-005  copy-test-040  copy-test-062  copy-test-085  copy-test-119  copy-test-157
copy-test-007  copy-test-042  copy-test-064  copy-test-089  copy-test-120  copy-test-158
copy-test-009  copy-test-043  copy-test-066  copy-test-091  copy-test-122  copy-test-159
copy-test-010  copy-test-044  copy-test-067  copy-test-092  copy-test-126  copy-test-163
copy-test-013  copy-test-045  copy-test-068  copy-test-096  copy-test-127  copy-test-168
copy-test-014  copy-test-047  copy-test-069  copy-test-097  copy-test-130  copy-test-170
copy-test-018  copy-test-049  copy-test-070  copy-test-102  copy-test-137  copy-test-176
copy-test-020  copy-test-050  copy-test-071  copy-test-109  copy-test-139  copy-test-182
copy-test-025  copy-test-053  copy-test-072  copy-test-112  copy-test-140  copy-test-183
copy-test-026  copy-test-055  copy-test-074  copy-test-113  copy-test-141  copy-test-187
copy-test-027  copy-test-056  copy-test-075  copy-test-114  copy-test-144  copy-test-193
copy-test-035  copy-test-058  copy-test-076  copy-test-116  copy-test-146  copy-test-194

On node02:
[root@node02 ~]# ls /bricks/brick1/test1/
copy-test-153  copy-test-162  copy-test-173  copy-test-177  copy-test-184  copy-test-188
copy-test-154  copy-test-164  copy-test-174  copy-test-178  copy-test-185  copy-test-189
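copy-test-156  copy-test-165  copy-test-175  copy-test-181  copy-test-186  copy-test-195

After the hash layout is fixed, the new node (node02) accepts new files, but the existing files stay on the old nodes, which therefore still carry most of the load.

7. Rebalance the data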

[root@node02 ~]# gluster volume rebalance test1  start
volume rebalance: test1: success: Rebalance on test1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 4c712f20-3e1b-4d84-9912-a766a00a9bb0
[root@node02 ~]# gluster volume rebalance test1 status
Node        Rebalanced-files     size   scanned   failures   skipped      status   run time in h:m:s
---------   ----------------   ------   -------   --------   -------   ---------   -----------------
master                     0   0Bytes        98          0        21   completed             0:00:00
node01                    18    2.6MB        84          0         0   completed             0:00:00
localhost                  0   0Bytes        18          0         0   completed             0:00:00
volume rebalance: test1: success
[root@master ~]# ls /bricks/brick1/test1/|wc -l
98
[root@node01 ~]# ls /bricks/brick1/test1/|wc -l
66
[root@node02 ~]# ls /bricks/brick1/test1/|wc -l
57

The data has now been migrated and rebalanced across the three nodes: node01 dropped from 84 to 66 entries and node02 grew from 18 to 57, while master stayed at 98 because its 21 candidate files were skipped. The load on the old nodes is reduced accordingly.
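
When the rebalance is driven from a script, it can help to confirm that every node reports completed before inspecting the bricks. The following is a minimal sketch, assuming the status table keeps the layout shown in this article; the function name rebalance_done is hypothetical and is not part of the gluster CLI.

```shell
#!/usr/bin/env bash
# rebalance_done: hypothetical helper, not part of the gluster CLI.
# Reads the text of `gluster volume rebalance <VOLNAME> status` on stdin and
# succeeds only if every node row contains the word "completed".
rebalance_done() {
    awk '
        /^-+/               { in_table = 1; next }  # separator line: data rows follow
        /^volume rebalance/ { in_table = 0 }        # trailing summary line ends the table
        in_table && NF > 0  { rows++; if ($0 !~ /completed/) pending++ }
        END                 { if (rows > 0 && pending == 0) exit 0; exit 1 }
    '
}

# Usage (on a node of the trusted storage pool):
#   gluster volume rebalance test1 status | rebalance_done && echo "rebalance finished"
```

The check is deliberately loose: it matches both "completed" and "fix-layout completed" rows, and it treats an empty table as not finished.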

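(2) Shrink example

When shrinking, remove-brick start automatically migrates the data off the brick being removed, so you can run the commands directly.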

[root@master ~]# gluster volume remove-brick test1 node02:/bricks/brick1/test1 start
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: a2b3c2e8-1bd6-4c23-8c0b-701f05107624
[root@master ~]# gluster volume remove-brick test1 node02:/bricks/brick1/test1 status
Node        Rebalanced-files     size   scanned   failures   skipped      status   run time in h:m:s
---------   ----------------   ------   -------   --------   -------   ---------   -----------------
node02                    36    5.1MB        57          0         0   completed             0:00:01
[root@master ~]# ls /bricks/brick1/test1/|wc -l
126
[root@node01 ~]# ls /bricks/brick1/test1/|wc -l
74

The data has been migrated to master and node01. Now write some more data to the test1 volume:
[root@master ~]# for i in `seq -w 201 250`; do cp -rp /var/log/messages /mnt/test/copy-test-$i; done
[root@master ~]# ls /bricks/brick1/test1/|wc -l
147
[root@node01 ~]# ls /bricks/brick1/test1/|wc -l
103

The new data was written only to master and node01; nothing is written to node02 any more.
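
The shrink procedure earlier in this article commits the remove-brick operation only once the status shows completed. A minimal sketch of that gate follows, assuming the column order of the status table shown in this article (Node, Rebalanced-files, size, scanned, failures, skipped, status, run time). The function name remove_brick_safe_to_commit is hypothetical, and treating any failed or skipped file as a reason to stop is a conservative choice made for this example, not a rule taken from gluster.

```shell
#!/usr/bin/env bash
# remove_brick_safe_to_commit: hypothetical helper, not part of the gluster CLI.
# Reads `gluster volume remove-brick <VOLNAME> <BRICK> status` output on stdin.
# Assumed column order:
#   Node  Rebalanced-files  size  scanned  failures  skipped  status  run-time
# Succeeds only if every node row is "completed" with 0 failures and 0 skipped.
remove_brick_safe_to_commit() {
    awk '
        /^-+/ { in_table = 1; next }   # separator line: data rows follow
        in_table && NF >= 8 {
            rows++
            if ($5 != 0 || $6 != 0 || $7 != "completed") bad++
        }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Usage:
#   gluster volume remove-brick test1 node02:/bricks/brick1/test1 status \
#       | remove_brick_safe_to_commit && echo "ready to commit"
```

If the check fails, inspect the brick and copy any remaining files through a gluster mount point before committing, as the commit prompt itself advises.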

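(2) Resolving split-brain

About split-brain
GlusterFS considers a file to be in split-brain when the replicas of that file are inconsistent and it cannot decide which copy is the correct one.
There are three types of split-brain:
Data split-brain: the contents of the file differ across the replica set.
Metadata split-brain: the metadata of the file differs across nodes.
Entry split-brain: the replicas of the file have different GFIDs or different file types.
Causes of split-brain
a. Network partition. When the nodes of a cluster end up in different network partitions, the clients in each partition keep modifying the files available in their own partition. After the network recovers, GlusterFS finds that the replicas of a file differ, and the file is in split-brain.
b. Problems with the glusterfs processes. Take a two-node cluster with server1 and server2: (1) server1 goes down, so file operations happen on server2; (2) server1 comes back up and server2 goes down, so file operations happen on server1; (3) once both servers are up again, the replicas on the two nodes differ.
How to heal split-brain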
# gluster volume heal <VOLNAME> info
This command lists all files that need healing.

# gluster volume heal <VOLNAME> info split-brain
This command lists only the files that are in split-brain.
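
For routine checks it can be convenient to reduce the split-brain listing to one number per brick. The sketch below assumes the output format used in this article, where each brick section starts with a "Brick ..." line and ends with a "Number of entries in split-brain: N" line. The function name split_brain_counts is hypothetical.

```shell
#!/usr/bin/env bash
# split_brain_counts: hypothetical helper, not part of the gluster CLI.
# Reads `gluster volume heal <VOLNAME> info split-brain` output on stdin and
# prints one "<brick> <count>" line per brick section.
split_brain_counts() {
    awk '
        /^Brick /                            { brick = $2 }        # remember the current brick
        /^Number of entries in split-brain:/ { print brick, $NF }  # last field is the count
    '
}

# Usage:
#   gluster volume heal test info split-brain | split_brain_counts
```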

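Once the files in split-brain have been identified, they can be resolved from the gluster command line using one of several policies.
Resolving data/metadata split-brain with the gluster CLI

  a. Select the bigger file as the source
This command is useful for per-file healing when it is known or decided that the larger copy of the file should be treated as the source.
# gluster volume heal <VOLNAME> split-brain bigger-file <FILE>
Here, <FILE> can be either the full file name as seen from the root of the volume or the GFID-string representation of the file, which sometimes appears in the heal info output. When the command runs, the replica holding the larger copy of <FILE> is located and healing completes with that brick as the source.
  b. Select the file with the latest mtime as the source
# gluster volume heal <VOLNAME> split-brain latest-mtime <FILE>
  c. Select one brick of the replica as the source for a particular file
# gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME> <FILE>
  d. Select one brick of the replica as the source for all files
# gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME>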
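
Because these policies overwrite one replica with another, it can be safer to print the command for review before running it. The following sketch maps the four policies above to their command lines. The function name heal_cmd and its argument order are hypothetical; it only prints text and never calls gluster itself.

```shell
#!/usr/bin/env bash
# heal_cmd: hypothetical helper that only PRINTS the heal command for a policy,
# so the command can be reviewed before it is run.
# Usage: heal_cmd <policy> <VOLNAME> <FILE> [<HOSTNAME:BRICKNAME>]
heal_cmd() {
    local policy=${1:-} vol=${2:-} file=${3:-} brick=${4:-}
    case "$policy" in
        bigger-file|latest-mtime)
            echo "gluster volume heal $vol split-brain $policy $file" ;;
        source-brick)
            if [ -z "$brick" ]; then
                echo "source-brick needs <HOSTNAME:BRICKNAME>" >&2
                return 1
            fi
            # An empty <FILE> makes the brick the source for all files (policy d).
            echo "gluster volume heal $vol split-brain source-brick $brick${file:+ $file}" ;;
        *)
            echo "unknown policy: $policy" >&2
            return 1 ;;
    esac
}

# Usage:
#   heal_cmd bigger-file test /dir/file1
#   heal_cmd source-brick test "" server1:/bricks/b1
```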

A worked healing example

Consider a volume named test with two bricks, b1 and b2, and with the self-heal daemon turned off.

# gluster volume heal test info split-brain
Brick <hostname:brickpath-b1>
<gfid:aaca219f-0e25-4576-8689-3bfd93ca70c2>
<gfid:39f301ae-4038-48c2-a889-7dac143e82dd>
<gfid:c3c94de2-232d-4083-b534-5da17fc476ac>
Number of entries in split-brain: 3

Brick <hostname:brickpath-b2>
/dir/file1
/dir
/file4
Number of entries in split-brain: 3

b1 and b2 each report three entries in split-brain.

We resolve the split-brain by taking the bigger file as the source. Before healing a file, note its size and md5 checksum on each brick.

On b1:
[brick1]# stat b1/dir/file1
File: ‘b1/dir/file1’
Size: 17              Blocks: 16         IO Block: 4096   regular file
Device: fd03h/64771d    Inode: 919362      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-03-06 13:55:40.149897333 +0530
Modify: 2015-03-06 13:55:37.206880347 +0530
Change: 2015-03-06 13:55:37.206880347 +0530
Birth: -
[brick1]#
[brick1]# md5sum b1/dir/file1
040751929ceabf77c3c0b3b662f341a8  b1/dir/file1

On b2:
[brick2]# stat b2/dir/file1
File: ‘b2/dir/file1’
Size: 13              Blocks: 16         IO Block: 4096   regular file
Device: fd03h/64771d    Inode: 919365      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-03-06 13:54:22.974451898 +0530
Modify: 2015-03-06 13:52:22.910758923 +0530
Change: 2015-03-06 13:52:22.910758923 +0530
Birth: -
[brick2]#
[brick2]# md5sum b2/dir/file1
cb11635a45d45668a403145059c2a0d5  b2/dir/file1

Heal the file with the following command:
# gluster volume heal test split-brain bigger-file /dir/file1

After the heal completes, the md5sum and the file size should be the same on both bricks.

On b1:
[brick1]# stat b1/dir/file1
File: ‘b1/dir/file1’
Size: 17              Blocks: 16         IO Block: 4096   regular file
Device: fd03h/64771d    Inode: 919362      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-03-06 14:17:27.752429505 +0530
Modify: 2015-03-06 13:55:37.206880347 +0530
Change: 2015-03-06 14:17:12.880343950 +0530
Birth: -
[brick1]#
[brick1]# md5sum b1/dir/file1
040751929ceabf77c3c0b3b662f341a8  b1/dir/file1

On b2:
[brick2]# stat b2/dir/file1
File: ‘b2/dir/file1’
Size: 17              Blocks: 16         IO Block: 4096   regular file
Device: fd03h/64771d    Inode: 919365      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-03-06 14:17:23.249403600 +0530
Modify: 2015-03-06 13:55:37.206880000 +0530
Change: 2015-03-06 14:17:12.881343955 +0530
Birth: -
[brick2]#
[brick2]# md5sum b2/dir/file1
040751929ceabf77c3c0b3b662f341a8  b2/dir/file1
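
The before-and-after comparison above is done by eye. A small sketch of the same check as a script follows, assuming both brick paths are readable from the host where it runs (in a real cluster the bricks usually live on different servers, so each checksum would be collected there first). The function name same_md5 is hypothetical.

```shell
#!/usr/bin/env bash
# same_md5: hypothetical helper. Succeeds when both files have the same MD5
# checksum, fails with status 1 when they differ, and with status 2 when a
# file cannot be read.
same_md5() {
    local a b
    a=$(md5sum < "$1") || return 2   # reading from stdin keeps the file name out of the output
    b=$(md5sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage, with the paths from the example above:
#   same_md5 b1/dir/file1 b2/dir/file1 && echo "replicas match"
```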