6. 重置从机
还要重置中继日志信息,因为恢复的备份将具有与原始 one229 节点关联的中继日志文件。
然后可以将节点重新加入组;
mysql> reset slave; Query OK, 0 rows affected (0.00 sec)
4.检查文件系统权限
验证刚刚恢复的文件/文件夹是否与运行mysql的用户相同,例如mysql
chown -R mysql:mysql /var/lib/mysql
示例 - 失败的节点
考虑现有的复制组;
root@one238 [(none)]> select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+ | group_replication_applier | 0f77e8be-dfc6-11e8-b5c0-0800278e6619 | one236.fitz.lan | 3306 | ONLINE | SECONDARY | 8.0.13 | | group_replication_applier | 4ebecd8b-dfd3-11e8-bed5-0800278e6619 | one223.fitz.lan | 3306 | ONLINE | SECONDARY | 8.0.13 | | group_replication_applier | 9f742055-dfd2-11e8-b021-0800271600ae | one238.fitz.lan | 3306 | ONLINE | PRIMARY | 8.0.13 | +---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+ 3 rows in set (0.00 sec)
主节点当前是:one238.
它将被停止并删除完整的数据目录,以模拟不可恢复的节点故障,其中需要更换组件并重新安装操作系统和所有软件。
模拟如下
one236:# ~ # systemctl stop mysqld one236:# ~ # rm -rf /var/lib/mysql/* one236:# ~ #
3. 恢复
然后应根据文档恢复备份;
one236:# ~ # mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/root/my.mbi_04Nov_1521 --backup-dir=/tmp/restore_$(date '+%d%b_%H%M') copy-back-and-apply-log MySQL Enterprise Backup version 8.0.13 Linux-4.1.12-112.14.13.el7uek.x86_64-x86_64 [2018-10-07 09:31:51] Copyright (c) 2003, 2018, Oracle and/or its affiliates. All Rights Reserved. [Build ID : 13776648.3672fb2ad9f6c91cd2527866d46fa583b70bfefd] 181104 15:27:24 MAIN INFO: A thread created with Id '139810938710144' 181104 15:27:24 MAIN INFO: Starting with following command line ... mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/root/my.mbi_04Nov_1521 --backup-dir=/tmp/restore_04Nov_1527 copy-back-and-apply-log ... 181104 15:27:29 PCR1 INFO: We were able to parse ibbackup_logfile up to lsn 19410602. 181104 15:27:29 PCR1 INFO: Last MySQL binlog file position 0 19183, file name one223_3306.000002 181104 15:27:29 PCR1 INFO: The first data file is '/var/lib/mysql/ibdata1' and the new created log files are at '/var/lib/mysql/' 181104 15:27:29 MAIN INFO: No Keyring file to process. 181104 15:27:29 MAIN INFO: Apply-log operation completed successfully. 181104 15:27:29 MAIN INFO: Full Backup has been restored successfully. mysqlbackup completed OK! with 3 warnings one236:# ~ #
MySQL Shell/InnoDB 集群
如果通过 MySQL SHELL 运行 GROUP 复制,建议使用 MySQL Shell 进行节点恢复。
这部分是由于 MySQL Shell 创建了自己的用户来控制节点组内的复制并避免任何可能的手动交互/权限问题。
MySQL one236:33060+ ssl JS > cluster.status(); { "clusterName": "devCluster", "defaultReplicaSet": { "name": "default", "primary": "one236:3306", "ssl": "REQUIRED", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "one223:3306": { "address": "one223:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one236:3306": { "address": "one236:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one238:3306": { "address": "one238:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://root@one236:3306" } MySQL one236:33060+ ssl JS >
恢复过程仍然需要手动将备份恢复到节点,通过mysqlbackup copy-back-and-apply-logs和gtid purged命令执行。
https://dev.mysql.com/doc/dev/mysqlsh-api-javascript/8.0/classmysqlsh_1_1dba_1_1_cluster.html
集群状态()
cluster.status() 输出显示失败的节点。
var i1='root:D0lph1n5@one223'; var i2='root:D0lph1n5@one236'; var i3='root:D0lph1n5@one238'; shell.connect(i1); var c=dba.getCluster();
MySQL one223:33060+ ssl JS > var c=dba.getCluster(); MySQL one223:33060+ ssl JS > c.status(); { "clusterName": "devCluster", "defaultReplicaSet": { "name": "default", "primary": "one223:3306", "ssl": "REQUIRED", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", "topology": { "one223:3306": { "address": "one223:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one236:3306": { "address": "one236:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)" }, "one238:3306": { "address": "one238:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://root@one223:3306" }
cluster.rescan()
发出重新扫描可识别发生故障的节点。
MySQL one223:33060+ ssl JS > var c=dba.getCluster(); MySQL one223:33060+ ssl JS > c.rescan() Rescanning the cluster... Result of the rescanning operation: { "defaultReplicaSet": { "name": "default", "newlyDiscoveredInstances": [], "unavailableInstances": [ { "host": "one236:3306", "label": "one236:3306", "member_id": "46566e90-dfe4-11e8-8662-08002708448b" } ] } } The instance 'one236:3306' is no longer part of the HA setup. It is either offline or left the HA group. You can try to add it to the cluster again with the cluster.rejoinInstance('one236:3306') command or you can remove it from the cluster configuration. Would you like to remove it from the cluster metadata? [Y/n]: Removing instance from the cluster metadata... The instance 'one236:3306' was successfully removed from the cluster metadata. MySQL one223:33060+ ssl JS > c.status(); { "clusterName": "devCluster", "defaultReplicaSet": { "name": "default", "primary": "one223:3306", "ssl": "REQUIRED", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures.", "topology": { "one223:3306": { "address": "one223:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one238:3306": { "address": "one238:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://root@one223:3306" }
请务必注意,此处按照手动 GR 配置执行恢复备份的相同步骤。
- 确认mysql没有运行
- 如果不为空,则删除现有的数据目录内容
- 执行还原操作
- 通过配置文件修改节点以跳过启动时的复制
- 清除 gtid
- 停止节点并删除最近的配置条目
- 起始节点
cluster.addInstance()
现在可以重新添加实例(一旦实际的 mysqld 实例重新启动)
MySQL one223:33060+ ssl JS > c.addInstance(i2); A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours. Adding instance to the cluster ... Validating instance at one236:3306... This instance reports its own address as one236.fitz.lan Instance configuration is suitable. The instance 'root@one236' was successfully added to the cluster. MySQL one223:33060+ ssl JS > c.status(); { "clusterName": "devCluster", "defaultReplicaSet": { "name": "default", "primary": "one223:3306", "ssl": "REQUIRED", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "one223:3306": { "address": "one223:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one236:3306": { "address": "one236:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "one238:3306": { "address": "one238:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } }, "groupInformationSourceMember": "mysql://root@one223:3306" } MySQL one223:33060+ ssl JS >
独立组复制
多主:设置只读
在多主模式下,需要采取另外的预防措施来防止在 RESTORE ( meb ) 和 RECOVERY ( GR 阶段) 期间写入数据库。
通常(取决于从客户端访问组的方式)客户端可以在实例可在网络上访问时在实例上执行 DML,甚至在应用 BINLOGS 重新加入 GROUP 之前。
超级只读
当前的建议是在实例的全局配置文件中将节点配置为 super_read_only,然后再从恢复的 datadir 开始。
作为另外的检查,event_scheduler 也可以在复制被赶上时关闭。
[mysqld] super_read_only=1 event_scheduler=off
客户端中应该存在足够的错误处理,以识别在此期间它们(尽管是暂时的)被阻止执行 DML。
2. 将备份镜像复制到远程主机
one223:# /backups # scp my.mbi_04Nov_1521 one236:~/ root@one236's password: my.mbi_04Nov_1521 100% 58MB 37.2MB/s 00:01 one223:# /backups #
5. 重置 Master
启动新恢复的节点并发出重置主节点。
这是预防性的,并确保没有与本地节点在组前面相关的错误(这将阻止节点加入)
https://dev.mysql.com/doc/refman/5.7/en/group-replication-options.html#sysvar_group_replication_allow_local_disjoint_gtids_join
one231/var/lib/mysql # mysql mysql> reset master; Query OK, 0 rows affected (0.01 sec)
基础知识
对于Unix/Linux 操作系统,该文档使用 shell 扩展来简单命名带有“今天”日期+时间的文件夹,例如
shell:~ # d="$(date '+%d%b_%H%M')" shell:~ # echo $d 13Nov_1338
当然,可以根据任何现有的命名约定进行调整。
简单的高级备份/恢复命令:
创建镜像备份(建议可选地使用 -skip-binlog 以获得更有效的备份大小)
mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_$(date '+%d%b_%H%M') --backup-dir=/backups/backup_$(date '+%d%b_%H%M') --user=root -pD0lph1n5! --host=127.0.0.1 backup-to-image
上面的命令行将被外推到(假设变量 $d 如上所述):
mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_13Nov_1338 --backup-dir=/backups/backup_13Nov_1338 --user=root -pD0lph1n5! --host=127.0.0.1 backup-to-image
恢复备份
mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_13Nov_1338 --backup-dir=/tmp/restore_$(date +%d%m_%H%M) copy-back-and-apply-log
7. 执行 GTID 清除
典型的 MEB 还原目录如下所示;
one236:# /tmp/restore_04Nov_1527/meta # ll total 148 -rw-r--r-- 1 root root 96644 Nov 4 15:27 backup_content.xml -rw-r--r-- 1 root root 6398 Nov 4 15:27 backup_create.xml -rw-r--r-- 1 root root 303 Nov 4 15:27 backup_gtid_executed.sql -rw-r--r-- 1 root root 932 Nov 4 15:27 backup_variables.txt -rw-r--r-- 1 root root 19217 Nov 4 15:27 image_files.xml -rw-r--r-- 1 root root 12912 Nov 4 15:27 MEB_2018-11-04.15-27-24_copy_back_img_to_datadir.log one236:# /tmp/restore_04Nov_1527/meta #
恢复的数据库需要配置其 GTID 值,以便它可以按照手册从正确的位置重放二进制日志;
https://dev.mysql.com/doc/mysql-enterprise-backup/4.1/en/advanced.slave.html
mysql>set SQL_LOG_BIN=0; mysql>source /tmp/restore_04Nov_1527/meta/backup_gtid_executed.sql; mysql>set SQL_LOG_BIN=1;
9.多主:启用读/写
一旦节点在组内重新上线,它可以重新启用为活动的 DML 节点;
mysql>set SQL_LOG_BIN=0; mysql>set global super_read_only=0; mysql>set global event_scheduler=1 mysql>set SQL_LOG_BIN=1;
8. 启动组复制(或者通过 MySQLShell addInstance() )
mysql> CHANGE MASTER TO MASTER_USER='root', MASTER_PASSWORD='D0lph1n5' FOR CHANNEL 'group_replication_recovery'; Query OK, 0 rows affected, 2 warnings (0.02 sec) mysql> START GROUP_REPLICATION; Query OK, 0 rows affected (2.11 sec)
1.备份
最近备份了一个辅助节点:
one223:# ~ # mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_$(date '+%d%b_%H%M') --backup-dir=/backups/backup_$(date '+%d%b_%H%M') --user=root -p --host=127.0.0.1 backup-to-image MySQL Enterprise Backup version 8.0.13 Linux-4.1.12-112.14.13.el7uek.x86_64-x86_64 [2018-10-07 09:31:51] Copyright (c) 2003, 2018, Oracle and/or its affiliates. All Rights Reserved. [Build ID : 13776648.3672fb2ad9f6c91cd2527866d46fa583b70bfefd] 181104 15:21:23 MAIN INFO: A thread created with Id '140714919446656' 181104 15:21:23 MAIN INFO: Starting with following command line ... mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_04Nov_1521 --backup-dir=/backups/backup_04Nov_1521 --user=root -pxxxxxxxxx --host=127.0.0.1 backup-to-image .... 181104 15:21:35 MAIN INFO: Group replication setup detected. 181104 15:21:35 MAIN INFO: Backup operation is running on a secondary node member. 181104 15:21:35 MAIN INFO: Trying to connect to one236.fitz.lan:3306. Backup will update the History table in this host. This entry will be replicated. 181104 15:21:35 MAIN INFO: Full Image Backup operation completed successfully. 181104 15:21:35 MAIN INFO: Backup image created successfully. 181104 15:21:35 MAIN INFO: Image Path = /backups/my.mbi_04Nov_1521 181104 15:21:35 MAIN INFO: MySQL binlog position: filename one223_3306.000002, position 19183 181104 15:21:35 MAIN INFO: GTID_EXECUTED is 46566e90-dfe4-11e8-8662-08002708448b:1-12,6031ccad-dfe4-11e8-a7ae-08002708448b:1-14 ------------------------------------------------------------ Parameters Summary ------------------------------------------------------------ Start LSN : 19410432 End LSN : 19410602 ------------------------------------------------------------ mysqlbackup completed OK! with 1 warnings one223:# ~ #
one223:# /backups # ls -lrt total 59588 drwxr-x--- 4 root root 97 Nov 4 15:21 backup_04Nov_1521 -rw-r--r-- 1 root root 61016659 Nov 4 15:21 my.mbi_04Nov_1521 one223:# /backups #
8.0 中的一个新功能是 MEB 会将备份的历史记录写入 PRIMARY 节点,即使是从 SECONDARY 获取的。
查询 mysql.backup_history 表(来自 SECONDARY 节点)以获得最近的备份,显示;
mysql> select * from mysql.backup_history order by backup_id desc limit 1\G *** 1. row *** backup_id: 15413052924422160 tool_name: mysqlbackup --defaults-file=/etc/my.cnf --backup-image=/backups/my.mbi_04Nov_1521 --backup-dir=/backups/backup_04Nov_1521 --user=root -pxxxxxxxx --host=127.0.0.1 backup-to-image start_time: 2018-11-04 15:21:32 end_time: 2018-11-04 15:21:34 binlog_pos: 19183 binlog_file: one223_3306.000002 compression_level: 0 engines: CSV:InnoDB:PERFORMANCE_SCHEMA innodb_data_file_path: ibdata1:12M:autoextend start_lsn: 19410432 end_lsn: 19410602 backup_type: FULL backup_format: IMAGE mysql_data_dir: /var/lib/mysql/ innodb_data_home_dir: innodb_log_group_home_dir: /var/lib/mysql/ innodb_log_files_in_group: 2 innodb_log_file_size: 268435456 backup_destination: /backups/backup_04Nov_1521 lock_time: 1.139 exit_state: SUCCESS last_error: NO_ERROR last_error_code: 0 start_time_utc: 1541305292197515 end_time_utc: 1541305294658281 consistency_time_utc: 1541305294650919 meb_version: 8.0.13 server_uuid: 2e9b514c-dfe4-11e8-8bb0-0800278e6619 1 row in set (0.00 sec)
如前所述,主节点崩溃并遇到不可调和的损坏。
一段时间后,节点重建,但服务器上的所有数据都丢失了。