smartd服务说明

smartd 是一个守护进程,用于监视内置于许多 ATA-3 和更高版本的 ATA、IDE 和 SCSI-3 硬盘驱动器中的自我监控、分析和报告技术 (SMART) 系统。
SMART 的目的是监控硬盘的可靠性和预测硬盘故障,并进行不同类型的硬盘自检。
此版本的 smartd 与 ATA/ATAPI-7 和更早的标准兼容。

smartd 将尝试在 ATA 设备上启用 SMART 监控(相当于 smartctl -s on)并每 30 分钟轮询一次这些设备和 SCSI 设备(可配置),通过 SYSLOG 界面记录 SMART 错误和 SMART 属性的更改。
这些 SYSLOG 通知和警告的默认位置是 /var/log/messages。
要更改默认位置,请参阅下面描述的“-l”命令行选项。

除了记录到文件之外,还可以将 smartd 配置为在检测到问题时发送电子邮件警告。
根据问题的类型,我们可能希望在磁盘上运行自检、备份磁盘、更换磁盘或者使用制造商的实用程序来强制重新分配坏的或者不可读的磁盘扇区。
如果检测到磁盘问题,请参阅 smartctl 手册页和 smartmontools 网页/FAQ 以获取进一步指导。

配置smartd

RPM 包:

smartmontools-[version]-[release]

配置文件

/etc/smartd.conf     ### For CentOS/RHEL 5,6
/etc/smartmontools/smartd.conf .   ### For CentOS/RHEL 7

配置文件 /etc/smartmontools/smartd.conf 示例:

# cat /etc/smartmontools/smartd.conf 
# Sample configuration file for smartd.  See man smartd.conf.
# Home page is: http://smartmontools.sourceforge.net
# $Id: smartd.conf 3651 2012-10-18 15:11:36Z samm2 $
# smartd will re-read the configuration file if it receives a HUP
# signal
# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.
# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.
# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices.  DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found.  Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
# Alternative setting to ignore temperature and power-on hours reports
# in syslog.
#DEVICESCAN -I 194 -I 231 -I 9
# Alternative setting to report more useful raw temperature in syslog.
#DEVICESCAN -R 194 -R 231 -I 9
# Alternative setting to report raw temperature changes >= 5 Celsius
# and min/max temperatures.
#DEVICESCAN -I 194 -I 231 -I 9 -W 5
# First (primary) ATA/IDE hard disk.  Monitor all attributes, enable
# automatic online data collection, automatic Attribute autosave, and
# start a short self-test every day between 2-3am, and a long self test
# Saturdays between 3-4am.
#/dev/hda -a -o on -S on -s (S/../.././02|L/../../6/03)
# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
#/dev/hdb -H -l error -l selftest -t -I 194 
# Monitor all attributes except normalized Temperature (usually 194),
# but track Temperature changes >= 4 Celsius, report Temperatures
# >= 45 Celsius and changes in Raw value of Reallocated_Sector_Ct (5).
# Send mail on SMART failures or when Temperature is >= 55 Celsius.
#/dev/hdc -a -I 194 -W 4,45,55 -R 5 -m admin@example.com
# An ATA disk may appear as a SCSI device to the OS. If a SCSI to
# ATA Translation (SAT) layer is between the OS and the device then
# this can be flagged with the '-d sat' option. This situation may
# become common with SATA disks in SAS and FC environments.
# /dev/sda -a -d sat
# A very silent check.  Only report SMART health status if it fails
# But send an email in this case
#/dev/hdc -H -C 0 -U 0 -m admin@example.com
# First two SCSI disks.  This will monitor everything that smartd can
# monitor.  Start extended self-tests Wednesdays between 6-7pm and
# Sundays between 1-2 am
#/dev/sda -d scsi -s L/../../3/18
#/dev/sdb -d scsi -s L/../../7/01
# Monitor 4 ATA disks connected to a 3ware 6/7/8000 controller which uses
# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4, 
# and 4-5 am.
# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
# is DEPRECATED.  Use the /dev/tweN character device interface instead.
# For example /dev/twe0, /dev/twe1, and so on.
#/dev/sdc -d 3ware,0 -a -s L/../../7/01
#/dev/sdc -d 3ware,1 -a -s L/../../7/02
#/dev/sdc -d 3ware,2 -a -s L/../../7/03
#/dev/sdc -d 3ware,3 -a -s L/../../7/04
# Monitor 2 ATA disks connected to a 3ware 9000 controller which
# uses the 3w-9xxx driver (Linux, FreeBSD). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
#/dev/twa0 -d 3ware,0 -a -s L/../../2/01
#/dev/twa0 -d 3ware,1 -a -s L/../../2/03
# Monitor 2 SATA (not SAS) disks connected to a 3ware 9000 controller which
# uses the 3w-sas driver (Linux). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
# On FreeBSD /dev/tws0 should be used instead
#/dev/twl0 -d 3ware,0 -a -s L/../../2/01
#/dev/twl0 -d 3ware,1 -a -s L/../../2/03
# Same as above for Windows. Option '-d 3ware,N' is not necessary,
# disk (port) number is specified in device name.
# NOTE: On Windows, DEVICESCAN works also for 3ware controllers.
#/dev/hdc,0 -a -s L/../../2/01
#/dev/hdc,1 -a -s L/../../2/03
# Monitor 3 ATA disks directly connected to a HighPoint RocketRAID. Start long
# self-tests Sundays between 1-2, 2-3, and 3-4 am. 
#/dev/sdd -d hpt,1/1 -a -s L/../../7/01
#/dev/sdd -d hpt,1/2 -a -s L/../../7/02
#/dev/sdd -d hpt,1/3 -a -s L/../../7/03
# Monitor 2 ATA disks connected to the same PMPort which connected to the
# HighPoint RocketRAID. Start long self-tests Tuesdays between 1-2 and 3-4 am
#/dev/sdd -d hpt,1/4/1 -a -s L/../../2/01
#/dev/sdd -d hpt,1/4/2 -a -s L/../../2/03
# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
#   -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
#   -T TYPE set the tolerance to one of: normal, permissive
#   -o VAL  Enable/disable automatic offline tests (on/off)
#   -S VAL  Enable/disable attribute autosave (on/off)
#   -n MODE No check. MODE is one of: never, sleep, standby, idle
#   -H      Monitor SMART Health Status, report if failed
#   -l TYPE Monitor SMART log.  Type is one of: error, selftest
#   -f      Monitor for failure of any 'Usage' Attributes
#   -m ADD  Send warning email to ADD for -H, -l error, -l selftest, and -f
#   -M TYPE Modify email warning behavior (see man page)
#   -s REGE Start self-test when type/date matches regular expression (see man page)
#   -p      Report changes in 'Prefailure' Normalized Attributes
#   -u      Report changes in 'Usage' Normalized Attributes
#   -t      Equivalent to -p and -u Directives
#   -r ID   Also report Raw values of Attribute ID with -p, -u or -t
#   -R ID   Track changes in Attribute ID Raw value with -p, -u or -t
#   -i ID   Ignore Attribute ID for -f Directive
#   -I ID   Ignore Attribute ID for -p, -u or -t Directive
#   -C ID   Report if Current Pending Sector count non-zero
#   -U ID   Report if Offline Uncorrectable count non-zero
#   -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
#   -v N,ST Modifies labeling of Attribute N (see man page)
#   -a      Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
#   -F TYPE Use firmware bug workaround. Type is one of: none, samsung
#   -P TYPE Drive-specific presets: use, ignore, show, showall
#    #      Comment: text after a hash sign is ignored
#    \      Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
# DEVICESCAN may be followed by any desired Directives.
on  it road.com

服务管理

Init.d 脚本位置:

/etc/init.d/smartd

“chkconfig -list smartd”示例

# chkconfig --list smartd
smartd          0:off   1:off   2:on    3:on    4:on    5:on    6:off

可用的服务使用选项

# service smartd
Usage: /etc/init.d/smartd {start|stop|reload|report|restart|status}
# service smartd start
Starting smartd:                                           [  OK  ]
# service smartd stop
Shutting down smartd:                                      [  OK  ]
# service smartd status
smartd (pid 4061 2857) is running...
# service smartd restart
Shutting down smartd:                                      [  OK  ]
Starting smartd:                                           [  OK  ]
# service smartd reload
Reloading smartd daemon configuration:                     [  OK  ]
# service smartd report
Checking SMART devices now:                                [  OK  ]

它运行哪些守护进程:

/usr/sbin/smartd
Linux 操作系统“smartd”服务
日期:2020-09-17 00:14:38 来源:oir作者:oir