南大通用 GBase 8a: Using strace to Diagnose a gcadmin Error

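A GBase 8a database cluster relies on the gcware service to keep the cluster consistent. When gcadmin reports an error on a node, the gcluster service on that node cannot be used. This article runs gcadmin under strace to find the cause of the error.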

ECONNREFUSED (Connection refused)

If the corosync service has not been started at all, gcadmin fails with the following error:

[root@localhost ~]# gcadmin
[gcadmin] Could not initialize CRM instance error: [6]->[GC_AIS_ERR_TRY_AGAIN]

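Tracing gcadmin with strace produces the output below. The connect() call on the corosync.ipc socket is refused with ECONNREFUSED (Connection refused).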

connect(3, {sa_family=AF_LOCAL, sun_path=@"corosync.ipc"}, 110) = -1 ECONNREFUSED (Connection refused)
close(3)                                = 0
close(4)                                = 0
munmap(0, 0)                            = -1 EINVAL (Invalid argument)
munmap(0, 0)                            = -1 EINVAL (Invalid argument)
munmap(0, 0)                            = -1 EINVAL (Invalid argument)
munmap(0, 0)                            = -1 EINVAL (Invalid argument)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f070443b000
write(1, "[gcadmin] Could not initialize C"..., 79[gcadmin] Could not initialize CRM instance error: [6]->[GC_AIS_ERR_TRY_AGAIN]
) = 79
exit_group(-1)                          = ?
+++ exited with 255 +++
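
The article does not show the exact strace command that was used. A minimal sketch of one way to capture a trace and pull out only the failed connect() calls follows. The log path /tmp/gcadmin.strace and the helper name failed_connects are hypothetical, chosen for illustration; -f and -o are standard strace options.

```shell
# Hypothetical helper (not part of GBase 8a): print the connect() calls
# that returned -1 from a saved strace log. The optional leading number
# matches the PID prefix strace writes when -f and -o are combined.
failed_connects() {
    grep -E '^([0-9]+ +)?connect\(.*\) = -1 ' "$1"
}

# Usage, assuming strace is installed and gcadmin is on PATH:
#   strace -f -o /tmp/gcadmin.strace gcadmin
#   failed_connects /tmp/gcadmin.strace
```

Filtering this way skips the close(), munmap() and write() noise and leaves the line carrying the errno, which is usually the quickest pointer to the cause.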

Checking the process list confirms that corosync is not running. Next, use the corosync.log log to find out why the service did not start.

[root@localhost ~]# ps -ef|grep corosync
root     24122 23257  0 06:02 pts/3    00:00:00 grep --color=auto corosync
[root@localhost ~]#
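
In the ps output above, the only match is the grep command itself. As a sketch, the same check can be done with pgrep, which never matches itself. This assumes pgrep (procps) is available and that the process name is exactly corosync, which may differ between GBase 8a versions. The function name is_running is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) only if a process whose
# name is exactly "$1" exists. pgrep -x requires an exact name match.
is_running() {
    pgrep -x "$1" > /dev/null
}

# Usage, assuming the process is named "corosync":
#   if is_running corosync; then
#       echo "corosync is running"
#   else
#       echo "corosync is NOT running"
#   fi
```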