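What is host power management (fencing)?
Once power management is configured, RHV can restart hosts that are in the NonOperational or NonResponsive state.
RHV supports the following power management devices: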
- American Power Conversion (apc)
- IBM Bladecenter (Bladecenter)
- Cisco Unified Computing System (cisco_ucs)
- Dell Remote Access Card 5 (drac5)
- Dell Remote Access Card 7 (drac7)
- Electronic Power Switch (eps)
- HP BladeSystem (hpblade)
- Integrated Lights Out (ilo, ilo2, ilo3, ilo4, ilo_ssh)
- Intelligent Platform Management Interface (ipmilan)
- Remote Supervisor Adapter (rsa)
- Fujitsu-Siemens RSB (rsb)
- Western Telematic, Inc (wti)
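RHV uses fence agents to communicate with the power management devices.
What is auto fencing?
When a host fails unexpectedly, its status changes to Connecting, and it stays in that state for the duration of a grace period.
If the grace period expires, the host moves to the NonResponsive or NonOperational state.
In response, the engine fences the problematic host by restarting it.
The engine uses the fence agent for the host's power management card to stop the host, confirm that it has stopped, start it, and confirm that it has started.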
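The stop, confirm, start, confirm sequence described above can be sketched as follows. This is a minimal illustration, not the engine's implementation: the `FenceAgent` interface, the `"on"`/`"off"` status strings, and the `FakeAgent` test double are all hypothetical names introduced here. The default retry count and delay mirror the FenceStopStatusRetries and FenceStopStatusDelayBetweenRetriesInSec values listed later in this document.

```python
import time
from typing import Callable, Protocol


class FenceAgent(Protocol):
    """Hypothetical minimal interface to a power management device."""

    def power(self, action: str) -> None: ...
    def status(self) -> str: ...


def wait_for_status(agent: FenceAgent, expected: str, retries: int,
                    delay: float, sleep: Callable[[float], None]) -> bool:
    """Poll the device until it reports `expected` or the retries run out."""
    for _ in range(retries):
        if agent.status() == expected:
            return True
        sleep(delay)
    return False


def restart_host(agent: FenceAgent, retries: int = 18, delay: float = 10.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Stop the host, confirm it is off, start it, confirm it is on."""
    agent.power("off")
    if not wait_for_status(agent, "off", retries, delay, sleep):
        return False  # do not start a host that was never confirmed stopped
    agent.power("on")
    return wait_for_status(agent, "on", retries, delay, sleep)


class FakeAgent:
    """In-memory stand-in whose state changes only after a few status polls."""

    def __init__(self, polls_until_change: int = 2) -> None:
        self.state = "on"
        self._target = "on"
        self._polls_left = 0
        self._polls_until_change = polls_until_change

    def power(self, action: str) -> None:
        self._target = action
        self._polls_left = self._polls_until_change

    def status(self) -> str:
        if self.state != self._target:
            if self._polls_left == 0:
                self.state = self._target
            else:
                self._polls_left -= 1
        return self.state


if __name__ == "__main__":
    ok = restart_host(FakeAgent(), retries=5, delay=0, sleep=lambda _s: None)
    print("restart confirmed:", ok)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a real agent would wrap the device's own fence command.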
Auto fencing grace period:
By default, the engine makes two attempts to query VDSM for its status:
       option_name       | option_value | default_value
-------------------------+--------------+---------------
 VDSAttemptsToResetCount | 2            | 2
(1 row)
Grace period = TimeoutToResetVdsInSeconds + DelayResetPerVmInSeconds * (number of VMs on the host) + DelayResetForSpmInSeconds (added only if the host is the SPM)
For example, if the host is the SPM, runs two VMs, and uses the default values, the grace period is 60 + 0.5 * 2 + 20 = 81 s.
        option_name         | option_value | default_value
----------------------------+--------------+---------------
 TimeoutToResetVdsInSeconds | 60           | 60
 DelayResetForSpmInSeconds  | 20           | 20
 DelayResetPerVmInSeconds   | 0.5          | 0.5
 VDSAttemptsToResetCount    | 2            | 2
(4 rows)
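The grace period formula above is easy to check in code. The sketch below assumes the default values listed above; the `GraceConfig` class and the `grace_period` function are names made up for this illustration and are not part of RHV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraceConfig:
    # Defaults copied from the engine-config values listed above.
    timeout_to_reset_vds: float = 60.0   # TimeoutToResetVdsInSeconds
    delay_reset_per_vm: float = 0.5      # DelayResetPerVmInSeconds
    delay_reset_for_spm: float = 20.0    # DelayResetForSpmInSeconds


def grace_period(vm_count: int, is_spm: bool,
                 cfg: GraceConfig = GraceConfig()) -> float:
    """Return the auto-fencing grace period in seconds."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    # The SPM delay is added only when the host holds the SPM role.
    spm_delay = cfg.delay_reset_for_spm if is_spm else 0.0
    return cfg.timeout_to_reset_vds + cfg.delay_reset_per_vm * vm_count + spm_delay


if __name__ == "__main__":
    # SPM host with two VMs and default settings: 60 + 0.5*2 + 20
    print(grace_period(vm_count=2, is_spm=True))  # 81.0
```

A non-SPM host with no running VMs therefore gets the bare 60-second timeout, and each additional VM adds half a second.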
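Kdump fencing:
When "Kdump integration" is enabled, it only delays hard fencing until a crashed host has finished writing its memory dump.
Soft fencing:
This can be configured at the cluster level:
Admin Portal --> Compute --> Cluster --> Edit Cluster --> Fencing Policy --> Enable Fencing
Before restarting a host, the engine first attempts SSH soft fencing: it restarts VDSM over SSH on the NonResponsive host.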
      option_name      |               option_value               |              default_value               | version
-----------------------+------------------------------------------+------------------------------------------+---------
 SshSoftFencingCommand | /usr/bin/vdsm-tool service-restart vdsmd | /usr/bin/vdsm-tool service-restart vdsmd | 4.3
(1 row)
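SSH-based soft fencing can be performed on hosts that do not have power management configured.
This is different from fencing.
Fencing can only be performed on hosts that have power management configured.
Proxy selection
The default power management proxy preference is cluster,dc.
An additional option, other_dc, can be added.
The engine looks for a proxy host that is in the UP state.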
# engine-config -g FenceProxyDefaultPreferences
FenceProxyDefaultPreferences: cluster,dc version: general
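To make the preference order concrete, here is a rough sketch of how a proxy host could be picked from a preference string such as cluster,dc. The `Host` record and the `select_proxy` function are hypothetical, and the matching rules (same cluster, same data center, any other data center) are an assumption about what the three keywords mean; the engine's actual selection logic is not shown in this document.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Host:
    name: str
    cluster: str
    data_center: str
    status: str


def select_proxy(target: Host, hosts: Sequence[Host],
                 preferences: str = "cluster,dc") -> Optional[Host]:
    """Return the first UP host matching the preference order, or None."""
    # Only hosts in the UP state may act as a proxy; the target cannot fence itself.
    candidates = [h for h in hosts if h.status == "UP" and h.name != target.name]
    scopes = {
        "cluster": lambda h: h.cluster == target.cluster,
        "dc": lambda h: h.data_center == target.data_center,
        "other_dc": lambda h: h.data_center != target.data_center,
    }
    for pref in (p.strip() for p in preferences.split(",")):
        if pref not in scopes:
            raise ValueError(f"unknown proxy preference: {pref!r}")
        for host in candidates:
            if scopes[pref](host):
                return host
    return None


if __name__ == "__main__":
    failed = Host("host1", "cluster1", "dc1", "NonResponsive")
    pool = [
        failed,
        Host("host2", "cluster2", "dc1", "UP"),
        Host("host3", "cluster1", "dc1", "UP"),
    ]
    print(select_proxy(failed, pool).name)  # host3 (same cluster wins over same dc)
```

With the default preference string, a host in another data center is never chosen; adding other_dc to the string widens the search as a last resort.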
Flow:
Engine flow:
Configuration metadata
The following is the metadata for VdsFenceType, VdsFenceOptionTypes, VdsFenceOptionMapping, FenceAgentMapping, and FenceAgentDefaultParams:
-[ RECORD 1 ]-+----------------------------------------------------------------
option_name   | VdsFenceType
option_value  | apc,apc_snmp,bladecenter,cisco_ucs,drac5,drac7,eps,hpblade,ilo,ilo2,ilo3,ilo4,ilo_ssh,ipmilan,rsa,rsb,wti
version       | 4.3
-[ RECORD 2 ]-+----------------------------------------------------------------
option_name   | VdsFenceOptionTypes
option_value  | encrypt_options=bool,secure=bool,port=int,slot=int
default_value | encrypt_options=bool,secure=bool,port=int,slot=int
-[ RECORD 3 ]-+----------------------------------------------------------------
option_name   | VdsFenceOptionMapping
option_value  | apc:secure=secure,port=ipport,slot=port; apc_snmp:port=port,encrypt_options=encrypt_options; bladecenter:secure=secure,port=ipport,slot=port; cisco_ucs:secure=ssl,slot=port; drac5:secure=secure,slot=port; drac7:;eps:slot=port; hpblade:port=port; ilo:secure=ssl,port=ipport; ipmilan:; ilo2:secure=ssl,port=ipport; ilo3:; ilo4:; ilo_ssh:port=port; rsa:secure=secure,port=ipport; rsb:;wti:secure=secure,port=ipport,slot=port
default_value | apc:secure=secure,port=ipport,slot=port; apc_snmp:port=port,encrypt_options=encrypt_options; bladecenter:secure=secure,port=ipport,slot=port; cisco_ucs:secure=ssl,slot=port; drac5:secure=secure,slot=port; drac7:; eps:slot=port; hpblade:port=port; ilo:secure=ssl,port=ipport; ipmilan:; ilo2:secure=ssl,port=ipport; ilo3:; ilo4:; ilo_ssh:port=port; rsa:secure=secure,port=ipport; rsb:; wti:secure=secure,port=ipport,slot=port
-[ RECORD 4 ]-+----------------------------------------------------------------
option_name   | FenceAgentMapping
option_value  | drac7=ipmilan,ilo2=ilo
default_value | drac7=ipmilan,ilo2=ilo
-[ RECORD 5 ]-+----------------------------------------------------------------
option_name   | FenceAgentDefaultParams
option_value  | drac7:privlvl=OPERATOR,lanplus=1,delay=10;ilo3:power_wait=4;ilo4:power_wait=4;ilo_ssh:secure=1
default_value | drac7:privlvl=OPERATOR,lanplus=1,delay=10;ilo3:power_wait=4;ilo4:power_wait=4;ilo_ssh:secure=1
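The mapping strings above are compact and hard to read by eye, so a small parser can help when inspecting them. The sketch below splits a VdsFenceOptionMapping-style value into a per-agent dictionary and a FenceAgentMapping-style value into an alias table. The function names are invented for this example, and reading each pair as "engine-side option name = agent-side option name" is an interpretation of the format rather than something this document states.

```python
def parse_option_mapping(raw: str) -> dict[str, dict[str, str]]:
    """Parse 'agent:opt=name,opt=name; agent:...' into nested dictionaries."""
    mapping: dict[str, dict[str, str]] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        agent, _, options = entry.partition(":")
        pairs: dict[str, str] = {}
        for pair in options.split(","):
            if pair.strip():
                key, _, value = pair.partition("=")
                pairs[key.strip()] = value.strip()
        # Agents such as 'drac7:' legitimately have no mapped options.
        mapping[agent.strip()] = pairs
    return mapping


def parse_agent_aliases(raw: str) -> dict[str, str]:
    """Parse 'drac7=ipmilan,ilo2=ilo' into {'drac7': 'ipmilan', 'ilo2': 'ilo'}."""
    aliases: dict[str, str] = {}
    for item in raw.split(","):
        if item.strip():
            name, _, target = item.partition("=")
            aliases[name.strip()] = target.strip()
    return aliases


if __name__ == "__main__":
    sample = "apc:secure=secure,port=ipport,slot=port; drac7:;eps:slot=port"
    for agent, options in parse_option_mapping(sample).items():
        print(agent, options)
    print(parse_agent_aliases("drac7=ipmilan,ilo2=ilo"))
```

The parser tolerates both `drac7:;eps:` and `drac7:; eps:` spellings, which appear side by side in the option_value and default_value columns above.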
The metadata above can be customized in engine-config through the following keys:
# engine-config -a |grep 'CustomFence\|CustomVdsFence'
CustomFenceAgentMapping: version: general
CustomFenceAgentDefaultParams: version: general
CustomFenceAgentDefaultParamsForPPC: version: general
CustomVdsFenceOptionMapping: version: general
CustomVdsFenceType: version: general
CustomFencePowerWaitParam: version: general
Other configuration (timeouts and retries):
# engine-config -a |grep 'FenceStart\|FenceStop'
FenceStartStatusRetries: 18 version: general
FenceStartStatusDelayBetweenRetriesInSec: 10 version: general
FenceStopStatusRetries: 18 version: general
FenceStopStatusDelayBetweenRetriesInSec: 10 version: general
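As a rough aid for reading these values, the sketch below parses `engine-config`-style output lines and estimates how long the engine may wait for a stop or start to be confirmed. The helper names are made up for this example, and the estimate assumes one full delay per retry, which is a simplification rather than a documented guarantee.

```python
import re

# Matches lines such as "FenceStopStatusRetries: 18 version: general";
# the value part may be empty, as with the Custom* keys.
LINE_RE = re.compile(r"^(\w+):\s*(.*?)\s*version:\s*(\S+)$")

SAMPLE = """\
FenceStartStatusRetries: 18 version: general
FenceStartStatusDelayBetweenRetriesInSec: 10 version: general
FenceStopStatusRetries: 18 version: general
FenceStopStatusDelayBetweenRetriesInSec: 10 version: general
"""


def parse_engine_config(text: str) -> dict[str, str]:
    """Turn 'Key: value version: x' lines into a {key: value} dictionary."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            key, value, _version = match.groups()
            values[key] = value
    return values


def max_status_wait(values: dict[str, str], phase: str) -> int:
    """Upper bound in seconds for one phase ('Start' or 'Stop')."""
    retries = int(values[f"Fence{phase}StatusRetries"])
    delay = int(values[f"Fence{phase}StatusDelayBetweenRetriesInSec"])
    return retries * delay


if __name__ == "__main__":
    config = parse_engine_config(SAMPLE)
    for phase in ("Stop", "Start"):
        print(phase, max_status_wait(config, phase), "seconds")
```

With the values shown above, each phase allows up to 18 * 10 = 180 seconds under this assumption.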