Could you provide your OS version, HA version, and HMC version? Detailed logs would help most: the output of errpt -a, and cluster.log.
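A minimal sketch of gathering that information on each AIX node. The `halevel` location and the `cluster.log` path are typical defaults and are assumptions here; they vary between PowerHA releases. The function names `run_section` and `collect_info` are made up for illustration. The HMC version has to be read on the HMC itself (for example with `lshmc -V`), so it is not covered.

```shell
#!/bin/sh
# Sketch: gather version info and logs for a PowerHA problem report.
# Both paths are assumed defaults; override them via the environment if needed.
HALEVEL=${HALEVEL:-/usr/es/sbin/cluster/utilities/halevel}
CLUSTER_LOG=${CLUSTER_LOG:-/var/hacmp/adm/cluster.log}

# Print a labelled header, then run the command if it exists on this system.
run_section() {
    title=$1; shift
    echo "== $title =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "(not available: $1)"
    fi
}

collect_info() {
    run_section "AIX level"      oslevel -s
    run_section "PowerHA level"  "$HALEVEL" -s
    run_section "RSCT filesets"  lslpp -L "rsct.*"
    run_section "Error report"   errpt -a
    run_section "cluster.log"    tail -n 200 "$CLUSTER_LOG"
}
```

Running `collect_info > /tmp/ha_report.txt` on both nodes gives two files that can be attached to the thread.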
# lsvg -l sppcivg
sppcivg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
SPPextrans jfs2 30 30 1 open/syncd /export/usr/sap/trans
SPPexsapmnt jfs2 20 20 1 open/syncd /export/sapmnt/SPP
SPPSCS jfs2 10 10 1 open/syncd /usr/sap/SPP/SCS00
loglv01 jfs2log 1 1 1 open/syncd N/A
# lsvg -l caavg_private;
caavg_private:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
caalv_private1 boot 1 1 1 closed/syncd N/A
caalv_private2 boot 1 1 1 closed/syncd N/A
caalv_private3 4 4 1 open/syncd N/A
powerha_crlv boot 1 1 1 closed/syncd N/A
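My usual approach to this kind of problem: if the logs really don't reveal the cause, apply patches first. There are two:
1. RSCT: this is upgraded through an operating system TL or SP
2. PowerHA: the patch for the HA software itself
In most cases things are fine once the patches are applied; if not, you have at least ruled out a bug.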
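Also, since they were deployed by the same person, are the operating system version and the HA version identical? If they differ, that is all the more reason to look at the configuration.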
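One way to check this is to save the fileset levels on each node and diff them. A rough sketch, assuming each node has written `lslpp -Lc 'rsct.*' 'cluster.es.*'` to a file (colon-separated output, with the fileset name in field 2 and the level in field 3); the function name `compare_levels` and the file names are hypothetical.

```shell
# On each node (AIX):  lslpp -Lc 'rsct.*' 'cluster.es.*' > /tmp/levels.$(hostname)
# Copy both files to one machine, then: compare_levels levels.nodeA levels.nodeB

# Sketch: compare "fileset:level" pairs from two saved lslpp -Lc listings.
# Returns 0 if the levels match, non-zero (and prints a diff) if they differ.
compare_levels() {
    a=/tmp/levels.$$.a
    b=/tmp/levels.$$.b
    # Drop the '#' header line, keep fields 2 (fileset) and 3 (level), sort.
    grep -v '^#' "$1" | cut -d: -f2,3 | sort > "$a"
    grep -v '^#' "$2" | cut -d: -f2,3 | sort > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Any line the diff prints is a fileset whose level differs between the two nodes, or that is installed on only one of them.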
You have probably hit an RSCT bug and will need to apply a patch.
Jan 17 12:26:12 ykportal02 daemon:err|error last message repeated 11 times
Jan 17 12:26:18 ykportal02 daemon:err|error snmpd[3145842]: EXCEPTIONS: authentication error: invalid community name: public
Jan 17 12:26:19 ykportal02 daemon:notice StorageRM[2818162]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: a8576c0d:::Details File: :::Location: RSCT,StorageRMDaemon.C,1.63,361 :::STORAGERM_STOPPED_ST IBM.StorageRM daemon has been stopped.
Jan 17 12:26:23 ykportal02 daemon:err|error snmpd[3145842]: EXCEPTIONS: authentication error: invalid community name: public
Jan 17 12:26:24 ykportal02 daemon:notice cthags[2229060]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6/uIVc.knNTM/Ara0bgJD8....................:::Reference ID: :::Template ID: 28854e81:::Details File: :::Location: RSCT,SRCSocket.C,1.91,424 :::GS_STOP_ST Group Services daemon stopped DIAGNOSTIC EXPLANATION Exiting for STOP NORMAL request from SRC.
Jan 17 12:26:24 ykportal02 local0:crit clstrmgrES[2949508]: Tue Jan 17 12:26:24 announcementCb: Called, state=ST_STABLE, provider token 1
Jan 17 12:26:24 ykportal02 local0:crit clstrmgrES[2949508]: Tue Jan 17 12:26:24 announcementCb: GsToken 3, AdapterToken -1, rm_GsToken 1
Jan 17 12:26:24 ykportal02 local0:crit clstrmgrES[2949508]: Tue Jan 17 12:26:24 announcementCb: GRPSVCS announcment code=512; exiting
Jan 17 12:26:24 ykportal02 local0:crit clstrmgrES[2949508]: Tue Jan 17 12:26:24 CHECK FOR FAILURE OF RSCT SUBSYSTEMS (cthags)
Jan 17 12:26:24 ykportal02 daemon:notice snmpd[3145842]: NOTICE: lost peer (SMUX ::1+32807+4)
Jan 17 12:26:25 ykportal02 daemon:notice ConfigRM[2622318]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: 2625c573:::Details File: :::Location: RSCT,PeerDomain.C,1.99.22.155,24966 :::CONFIGRM_OFFLINE_ST The node is offline.
Jan 17 12:26:25 ykportal02 user:notice PowerHA SystemMirror for AIX: clexit.rc : Unexpected termination of clstrmgrES.
Jan 17 12:26:25 ykportal02 user:notice PowerHA SystemMirror for AIX: clexit.rc : Halting system immediately!!!
Jan 17 15:47:25 ykportal02 daemon:notice snmpd[1507824]: NOTICE: logging started at level 0
Jan 17 15:47:26 ykportal02 daemon:notice snmpd[1507824]: NOTICE: snmpd (1507824) is starting
Jan 17 15:47:27 ykportal02 daemon:notice snmpd[1507824]: NOTICE: stopsrc issued
Jan 17 15:47:27 ykportal02 daemon:notice snmpd[1507824]: NOTICE: snmpd (1507824) is terminating
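To pull the decisive lines out of a longer syslog, a filter along these lines may help. The patterns are taken from the messages in the excerpt above (StorageRM stopping, cthags exiting on a STOP NORMAL request from SRC, ConfigRM going offline, then clexit.rc halting the node); the function name `ha_halt_events` and the log file argument are illustrative assumptions.

```shell
# Sketch: print only the RSCT/PowerHA lines that mark the halt sequence.
# Usage: ha_halt_events /path/to/syslog_file
ha_halt_events() {
    grep -E 'GS_STOP_ST|STORAGERM_STOPPED_ST|CONFIGRM_OFFLINE_ST|CHECK FOR FAILURE OF RSCT SUBSYSTEMS|clexit\.rc' "$1"
}
```

This drops the snmpd noise and keeps the timestamps, which makes it easier to line the events up against errpt entries from the same minute.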