GBase 8a (南大通用) cannot be scaled out while a node is OFFLINE

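When GBase 8a performs a scale-out, it checks the status of every node. If a node is OFFLINE, the installer cannot log in to it and reports a "Check login" type of error, so the scale-out cannot proceed.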

Environment

A 2-node cluster in which one node has suffered an unrecoverable failure and is offline. It no longer responds to ping either.

[gbase@localhost gcinstall]$ gcadmin
CLUSTER STATE:  ACTIVE
CLUSTER MODE:   NORMAL

=================================================================
|             GBASE COORDINATOR CLUSTER INFORMATION             |
=================================================================
|   NodeName   |     IpAddress     |gcware |gcluster |DataState |
-----------------------------------------------------------------
| coordinator1 |    10.0.2.201     | OPEN  |  OPEN   |    0     |
-----------------------------------------------------------------
====================================================================
|                  GBASE DATA CLUSTER INFORMATION                  |
====================================================================
|NodeName |     IpAddress     |    gnode    |syncserver |DataState |
--------------------------------------------------------------------
|  node1  |    10.0.2.201     |    OPEN     |   OPEN    |    0     |
--------------------------------------------------------------------
|  node2  |    10.0.2.202     | OFFLINE     |          |
--------------------------------------------------------------------
[gbase@localhost gcinstall]$

[gbase@localhost gcinstall_8624335.5p4]$ ping 10.0.2.202
PING 10.0.2.202 (10.0.2.202) 56(84) bytes of data.
From 10.0.2.201 icmp_seq=4 Destination Host Unreachable
From 10.0.2.201 icmp_seq=5 Destination Host Unreachable
From 10.0.2.201 icmp_seq=6 Destination Host Unreachable
From 10.0.2.201 icmp_seq=7 Destination Host Unreachable
^C
--- 10.0.2.202 ping statistics ---
8 packets transmitted, 0 received, +4 errors, 100% packet loss, time 7545ms
pipe 4
[gbase@localhost gcinstall_8624335.5p4]$
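
Before starting a scale-out, it may help to confirm that gcadmin reports no OFFLINE node. The following is a minimal shell sketch, assuming the gcadmin output keeps the pipe-delimited table layout shown above. The function name offline_nodes is hypothetical and not part of GBase 8a.

```shell
# Hypothetical helper: print the IP of every node whose state column reads OFFLINE.
# Assumes the pipe-delimited table layout of the gcadmin output shown above:
# field 3 holds the IP address, field 4 holds the gnode (or gcware) state.
offline_nodes() {
    awk -F'|' '$4 ~ /OFFLINE/ { gsub(/[ \t]/, "", $3); print $3 }'
}

# Usage: gcadmin | offline_nodes
```

If the function prints any address, the scale-out is likely to stop at the login check for that node.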

Sample errors

Check loginUser:root password failed,nodes are:10.0.2.202

and

Check root password failed,nodes are:10.0.2.202

The background log gcinstall.log shows the underlying cause:

fail to login root@10.0.2.202,err:ssh: connect to host 10.0.2.202 port 22: No route to host
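
When several nodes are involved, the failing hosts can be pulled out of the log in one pass. This is a minimal sketch, assuming every failure line has the same "fail to login <user>@<host>,err:..." shape as the single sample above. The function name failed_login_hosts is hypothetical.

```shell
# Hypothetical helper: list the unique hosts that the log reports as login failures.
# Assumes lines shaped like "fail to login <user>@<host>,err:..." (taken from the
# one sample line above; other log formats are not handled).
failed_login_hosts() {
    sed -n 's/.*fail to login [^@]*@\([^,]*\),err:.*/\1/p' | sort -u
}

# Usage: failed_login_hosts < gcinstall.log
```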

Solution

First restore the failed node's operating system and make sure its IP address is reachable.
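
Once the node is back, its SSH port can be probed before retrying the scale-out. This is a rough sketch that only tests whether TCP port 22 accepts a connection. It does not verify the root password that the installer checks. It assumes bash (for /dev/tcp) and the coreutils timeout command are available. Both function names are hypothetical.

```shell
# Hypothetical probe: succeed if a TCP connection to port 22 opens within 3 seconds.
# Relies on bash's /dev/tcp redirection and coreutils timeout (assumptions about the host).
probe_ssh() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/22" 2>/dev/null
}

# Print every host from the argument list that fails the probe.
unreachable_nodes() {
    for host in "$@"; do
        probe_ssh "$host" || echo "$host"
    done
}

# Usage: unreachable_nodes 10.0.2.201 10.0.2.202
```

An empty result means every listed node accepted a connection on port 22.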

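If the node truly cannot be repaired, contact GBase 8a support staff. They can temporarily adjust the code to remove the login check for the failed node and force the scale-out through.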