MySQL 5.7: An Attempt to Speed Up Recovery with the Replication SQL_Thread

By 阿新 • Published: 2017-10-24


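1. Common MySQL data recovery methods

There are generally three ways to recover MySQL data:

1. The officially recommended full backup + binlog approach. Restore the most recent full backup, then replay the binlogs into the target database with mysqlbinlog --start-position --stop-position binlog.000xxx | mysql -uroot -p xxx -S database.

2. Recovery through master-slave replication. Restore the most recent full backup, attach the restored instance to the existing master as a slave, and run start slave sql_thread until master_log_pos to replay up to a position just before the failure.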
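As a rough sketch of the two conventional methods, the Python snippet below only builds the mysqlbinlog command line and the replication statement; it does not run either. The binlog name and the positions are placeholders chosen for illustration, not values from a real incident.

```python
import shlex


def mysqlbinlog_argv(binlog, start_pos, stop_pos):
    """Method 1: argv for mysqlbinlog limited to a position range."""
    return [
        "mysqlbinlog",
        f"--start-position={start_pos}",
        f"--stop-position={stop_pos}",
        binlog,
    ]


def start_until_sql(master_log_file, master_log_pos):
    """Method 2: statement that stops the SQL thread at a master position."""
    return (
        "START SLAVE SQL_THREAD UNTIL "
        f"MASTER_LOG_FILE='{master_log_file}', "
        f"MASTER_LOG_POS={int(master_log_pos)};"
    )


# Placeholder file name and positions, for illustration only.
print(shlex.join(mysqlbinlog_argv("binlog.000002", 154, 4096)) + " | mysql -uroot -p")
print(start_until_sql("binlog.000002", 4096))
```

Building the argument list separately from the shell pipeline keeps quoting explicit, which matters once file names come from user input.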

This post tries a third approach: applying the binlogs from the original master directly on the slave.

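Approach:

Relay logs and binlogs are essentially the same format, so the idea is to let MySQL's own sql_thread apply the binlogs incrementally:

1) Initialize a fresh instance and restore the full backup into it.
2) Find the starting position in the first binlog file, and collect all the binlogs after it.
3) Disguise the binlogs as relay logs and let the SQL thread apply them.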
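Step 3 boils down to giving each binlog a relay-log file name. The sketch below only plans the copies and prints the cp commands; the helper name plan_relay_copies and the relay prefix are assumptions made for this example.

```python
def plan_relay_copies(binlogs, relay_prefix="relay"):
    """Pair each binlog with a relay-log name that keeps its numeric suffix."""
    plan = []
    for name in sorted(binlogs):
        suffix = name.rsplit(".", 1)[1]  # e.g. "000002"
        plan.append((name, f"{relay_prefix}.{suffix}"))
    return plan


# File names taken from the experiment; the order of the input does not matter.
binlogs = ["3307-binlog.000002", "3307-binlog.000001"]
for src, dst in plan_relay_copies(binlogs):
    print(f"cp {src} {dst}")
```

Keeping the numeric suffix unchanged makes it easy to see which relay file corresponds to which master binlog when positions are looked up later.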

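Use cases:

1. The most recent full backup is far from the failure point, so recovery with the two methods above takes too long.

2. A dual-master keepalived cluster. Unlike MHA, keepalived has no log-completion mechanism, so a failure can lose data. If a failover to the slave happens while replication is badly delayed, the data becomes inconsistent and the missing log events have to be applied afterwards.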

2. Experiment steps

1. Set up master-slave replication (this experiment uses traditional position-based replication; GTID would work as well)

M1:

root@localhost:mysql3307.sock [(none)]>select * from restore.t1;
+----+------+
| id | c1   |
+----+------+
|  1 | 1    |
|  2 | 3    |
|  3 | 2    |
|  4 | 3    |
|  5 | 6    |
|  6 | 7    |
|  7 | 9    |
| 10 | NULL |
| 11 | 10   |
+----+------+
9 rows in set (0.00 sec)

M2 (slave):

root@localhost:mysql3307.sock [(none)]>select * from restore.t1;
+----+------+
| id | c1   |
+----+------+
|  1 | 1    |
|  2 | 3    |
|  3 | 2    |
|  4 | 3    |
|  5 | 6    |
|  6 | 7    |
|  7 | 9    |
| 10 | NULL |
| 11 | 10   |
+----+------+
9 rows in set (0.00 sec)

root@localhost:mysql3307.sock [restore]>show slave status\G	
*************************** 1. row ***************************	
               Slave_IO_State: Waiting for master to send event	
                  Master_Host: m1	
                  Master_User: repl	
                  Master_Port: 3307	
                Connect_Retry: 60	
              Master_Log_File: 3307-binlog.000002	
          Read_Master_Log_Pos: 154	
               Relay_Log_File: M2-relay-bin.000004	
                Relay_Log_Pos: 371	
        Relay_Master_Log_File: 3307-binlog.000002	
             Slave_IO_Running: Yes	
            Slave_SQL_Running: Yes	
              Replicate_Do_DB: 	
          Replicate_Ignore_DB: 	
           Replicate_Do_Table: 	
       Replicate_Ignore_Table: 	
      Replicate_Wild_Do_Table: 	
  Replicate_Wild_Ignore_Table: 	
                   Last_Errno: 0	
                   Last_Error: 	
                 Skip_Counter: 0	
          Exec_Master_Log_Pos: 154	
              Relay_Log_Space: 624	
              Until_Condition: None	
               Until_Log_File: 	
                Until_Log_Pos: 0	
           Master_SSL_Allowed: No	
           Master_SSL_CA_File: 	
           Master_SSL_CA_Path: 	
              Master_SSL_Cert: 	
            Master_SSL_Cipher: 	
               Master_SSL_Key: 	
        Seconds_Behind_Master: 0	
Master_SSL_Verify_Server_Cert: No	
                Last_IO_Errno: 0	
                Last_IO_Error: 	
               Last_SQL_Errno: 0	
               Last_SQL_Error: 	
  Replicate_Ignore_Server_Ids: 	
             Master_Server_Id: 13307	
                  Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3	
             Master_Info_File: /data/mysql/3307/data/master.info	
                    SQL_Delay: 0	
          SQL_Remaining_Delay: NULL	
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates	
           Master_Retry_Count: 86400	
                  Master_Bind: 	
      Last_IO_Error_Timestamp: 	
     Last_SQL_Error_Timestamp: 	
               Master_SSL_Crl: 	
           Master_SSL_Crlpath: 	
           Retrieved_Gtid_Set: 	
            Executed_Gtid_Set: 	
                Auto_Position: 0	
         Replicate_Rewrite_DB: 	
                 Channel_Name: 	
           Master_TLS_Version: 	
1 row in set (0.00 sec)

Record the slave's relay log information at this point:

[root@M2 data]# more M2-relay-bin.index 
./M2-relay-bin.000003
./M2-relay-bin.000004

[root@M2 data]# more relay-log.info 
7
./M2-relay-bin.000004
371
3307-binlog.000002
154
0
0
1
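The relay-log.info file is positional: after a leading line count come the relay log name, the relay log position, the master binlog name, and the master binlog position. The sketch below reads just those four fields; the field names are my own labels based on the MySQL 5.7 documentation, and the parser is an illustration rather than a complete reader of the file.

```python
from typing import NamedTuple


class RelayLogInfo(NamedTuple):
    relay_log_name: str
    relay_log_pos: int
    master_log_name: str
    master_log_pos: int


def parse_relay_log_info(text):
    """Read the four position fields that follow the leading line count."""
    lines = [line.strip() for line in text.splitlines()]
    return RelayLogInfo(lines[1], int(lines[2]), lines[3], int(lines[4]))


# Contents of relay-log.info as recorded above.
sample = "7\n./M2-relay-bin.000004\n371\n3307-binlog.000002\n154\n0\n0\n1\n"
info = parse_relay_log_info(sample)
print(info.master_log_name, info.master_log_pos)
```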

2. Use sysbench to make the slave fall out of sync

[root@M1 logs]# mysqladmin create sbtest

[root@M1 sysbench]# sysbench --db-driver=mysql --mysql-host=m1 --mysql-port=3307 --mysql-user=sbtest --mysql-password='sbtest' /usr/share/sysbench/oltp_common.lua --tables=4 --table-size=100000 --threads=2 --time=60 --report-interval=10 prepare

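While the data is loading on the master, stop replication on the slave to create an inconsistency:

root@localhost:mysql3307.sock [mysql]>stop slave;

3. After sysbench finishes, compare the data on the master and the slave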

Master:

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest1;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest2;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

Slave:

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|    67550 |
+----------+
1 row in set (0.06 sec)

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|    70252 |
+----------+
1 row in set (0.04 sec)

The master and the slave are clearly out of sync.

4. Check the slave status at this point:

root@localhost:mysql3307.sock [(none)]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: m1
                  Master_User: repl
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: 3307-binlog.000002
          Read_Master_Log_Pos: 76364214
               Relay_Log_File: M2-relay-bin.000004
                Relay_Log_Pos: 64490301
        Relay_Master_Log_File: 3307-binlog.000002
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 64490084
              Relay_Log_Space: 76364861
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 0
                  Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3
             Master_Info_File: /data/mysql/3307/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

The local relay log has not been fully applied yet. To keep the experiment accurate, first let the slave finish applying it by running start slave sql_thread.

Check again:

*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: m1
                  Master_User: repl
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: 3307-binlog.000002
          Read_Master_Log_Pos: 76364214
               Relay_Log_File: M2-relay-bin.000005
                Relay_Log_Pos: 4
        Relay_Master_Log_File: 3307-binlog.000002
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 76364214
              Relay_Log_Space: 154
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 0
                  Master_UUID: afeab8d6-b871-11e7-9b2a-005056b643b3
             Master_Info_File: /data/mysql/3307/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

The local relay log has now been fully applied. Record the latest relay log information:

[root@M2 data]# more relay-log.info 
7
./M2-relay-bin.000005
4
3307-binlog.000002
76364214
0
0
1

0
0
1

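This information is important: it shows that the slave has executed up to position 76364214 of the master's binlog 3307-binlog.000002. Next, we copy the master's binlogs over, pass them off as relay logs, and resume recovery from this position.

5. Copy the binlogs to the target and disguise them as relay logs

Before copying, shut down the slave and add skip-slave-start to the cnf file so that replication does not start automatically after the restart.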

[root@M2 data]# ll
total 185248
-rw-r----- 1 root  root       461 Oct 24 17:14 3307-binlog.000001
-rw-r----- 1 root  root  76364609 Oct 24 17:14 3307-binlog.000002
-rw-r----- 1 root  root       203 Oct 24 17:14 3307-binlog.000003
-rw-r----- 1 root  root       419 Oct 24 17:14 3307-binlog.000004
-rw-r----- 1 root  root       164 Oct 24 17:14 3307-binlog.index
-rw-r----- 1 mysql mysql       56 Oct 24 15:08 auto.cnf
-rw-r----- 1 mysql mysql     4720 Oct 24 17:14 ib_buffer_pool
-rw-r----- 1 mysql mysql 12582912 Oct 24 17:14 ibdata1
-rw-r----- 1 mysql mysql 50331648 Oct 24 17:14 ib_logfile0
-rw-r----- 1 mysql mysql 50331648 Oct 24 17:11 ib_logfile1
-rw-r----- 1 mysql mysql      177 Oct 24 17:14 M2-relay-bin.000005
-rw-r----- 1 mysql mysql       22 Oct 24 17:11 M2-relay-bin.index
-rw-r----- 1 mysql mysql      122 Oct 24 17:14 master.info
drwxr-x--- 2 mysql mysql     4096 Oct 24 15:07 mysql
-rw------- 1 root  root         0 Oct 24 15:08 nohup.out
drwxr-x--- 2 mysql mysql     4096 Oct 24 15:07 performance_schema
-rw-r----- 1 mysql mysql       68 Oct 24 17:14 relay-log.info
drwxr-x--- 2 mysql mysql     4096 Oct 24 15:07 restore
drwxr-x--- 2 mysql mysql     4096 Oct 24 16:47 sbtest
drwxr-x--- 2 mysql mysql    12288 Oct 24 15:07 sys
-rw-r----- 1 mysql mysql       24 Oct 24 15:07 xtrabackup_binlog_pos_innodb
-rw-r----- 1 mysql mysql      577 Oct 24 15:07 xtrabackup_info

Copy the binlogs under relay log names:

[root@M2 data]# cp 3307-binlog.000001 relay.000001
[root@M2 data]# cp 3307-binlog.000002 relay.000002
[root@M2 data]# cp 3307-binlog.000003 relay.000003
[root@M2 data]# cp 3307-binlog.000004 relay.000004

Fix the file ownership:

[root@M2 data]# chown -R mysql:mysql *

Edit the relay log index file so that the server recognizes the new files:

[root@M2 data]# cat M2-relay-bin.index	
	./relay.000001
	./relay.000002
	./relay.000003
	./relay.000004
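The index file is simply one relay log path per line. A minimal sketch that renders that content from a list of file names follows; the function name render_relay_index is made up for this example, and writing the result to M2-relay-bin.index is left to the reader.

```python
def render_relay_index(relay_files):
    """Return index-file content: one './name' entry per line, in order."""
    return "".join(f"./{name}\n" for name in relay_files)


print(render_relay_index(["relay.000001", "relay.000002"]), end="")
```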

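Edit the relay-log.info file to tell the server where to start applying. Because relay.000002 is a plain copy of 3307-binlog.000002, the relay log position is the same as the master log position, 76364214: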

[root@M2 data]# cat relay-log.info	
	7
	./relay.000002
	76364214
	3307-binlog.000002
	76364214
	0
	0
	1
	
	0
	0
	1
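The edit above changes only two lines: the relay log name and the relay log position. A small sketch of that rewrite, assuming the positional layout shown in the file (line count first, then relay name and relay position), could look like this; retarget_relay_log_info is a hypothetical helper, not part of MySQL.

```python
def retarget_relay_log_info(text, relay_name, relay_pos):
    """Replace the relay log name and position, leaving other lines untouched."""
    lines = text.splitlines()
    lines[1] = relay_name       # second line: relay log file
    lines[2] = str(relay_pos)   # third line: position inside that file
    return "\n".join(lines) + "\n"


old = "7\n./M2-relay-bin.000005\n4\n3307-binlog.000002\n76364214\n0\n0\n1\n"
# The relay file is a copy of the binlog, so reuse the master position.
new = retarget_relay_log_info(old, "./relay.000002", 76364214)
print(new, end="")
```

Taking the new relay position from the master position already stored in the file avoids typing the number by hand, which is the easiest place to make a mistake in this procedure.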

Finally, with the instance started again, start the SQL thread to begin the fast recovery:

start slave sql_thread;

6. Check that the data is consistent

slave:

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest4;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

root@localhost:mysql3307.sock [sbtest]>select count(1) from sbtest3;
+----------+
| count(1) |
+----------+
|   100000 |
+----------+
1 row in set (0.05 sec)

The slave has now recovered all of the missing data.
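The check above compares row counts by eye. As a small illustration, the comparison can be expressed as a function over two count dictionaries; the counts below are the ones shown in this post, and the function name is made up for the example. Matching row counts are only a coarse signal and do not prove that the rows themselves are identical.

```python
def mismatched_tables(master_counts, slave_counts):
    """Return tables whose row count differs or that the slave lacks, sorted."""
    return sorted(
        table
        for table, rows in master_counts.items()
        if slave_counts.get(table) != rows
    )


master = {"sbtest3": 100000, "sbtest4": 100000}
before = {"sbtest3": 70252, "sbtest4": 67550}    # slave before recovery
after = {"sbtest3": 100000, "sbtest4": 100000}   # slave after recovery
print(mismatched_tables(master, before))
print(mismatched_tables(master, after))
```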
