MySQL MHA master-slave switchover: problems found during experiments

阿新 • Published: 2019-02-04

Problem 1:
Fri May 27 10:01:05 2016 - [error][/apps/lib/mha/mha_manager/MHA/MasterRotate.pm, ln161] We should not start online master switch when one of connections are running long updates on the current master(10.16.24.108(10.16.24.108:3307)). Currently 1 update thread(s) are running.
Details:
{'Time' => '88270','Command' => 'Daemon','db' => undef,'Id' => '2','Info' => undef,'User' => 'event_scheduler','Progress' => '0.000','State' => 'Waiting on empty queue','Host' => 'localhost'}
Fri May 27 10:01:05 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/sh/mha/mha_manager/bin/masterha_master_switch line 53.

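Solution:
MHA is counting the event_scheduler daemon thread (Id 2 in Details above) as a long-running update. Turn off event_scheduler, then confirm that the thread no longer appears in the processlist: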
(product)[email protected] [(none)]> SET GLOBAL event_scheduler =off;
Query OK, 0 rows affected (0.00 sec)

(product)[email protected] [(none)]> Select @@event_scheduler;
+-------------------+
| @@event_scheduler |
+-------------------+
| OFF |
+-------------------+
1 row in set (0.00 sec)

(product)[email protected] [(none)]> show processlist\G
*************************** 1. row ***************************
      Id: 140
    User: repl
    Host: 10.16.24.107:44449
      db: NULL
Command: Binlog Dump
    Time: 13262
   State: Master has sent all binlog to slave; waiting for binlog to be updated
    Info: NULL
Progress: 0.000
*************************** 2. row ***************************
      Id: 141
    User: repl
    Host: 10.16.24.109:23490
      db: NULL
Command: Binlog Dump
    Time: 13254
   State: Master has sent all binlog to slave; waiting for binlog to be updated
    Info: NULL
Progress: 0.000
*************************** 3. row ***************************
      Id: 147
    User: mha
    Host: 10.16.24.108:59213
      db: NULL
Command: Query
    Time: 0
   State: init
    Info: show processlist
Progress: 0.000
3 rows in set (0.00 sec)
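
The check that produced the error in Problem 1 can be approximated offline. The sketch below is a simplified assumption about which threads count as "running updates" (anything that is not Sleep or Binlog Dump and has run for at least the limit); it is not MHA's actual code, and flag_long_threads is a hypothetical helper name. It reads tab-separated processlist rows on stdin.

```shell
#!/bin/sh
# Hypothetical helper: print processlist threads that a long-running-update
# check might flag. Input: tab-separated rows in SHOW PROCESSLIST column
# order (Id, User, Host, db, Command, Time, ...).
# The filtering rule is an assumption for illustration, not MHA's own logic.
flag_long_threads() {
    limit="${1:-1}"    # threshold in seconds (assumed default: 1)
    awk -F '\t' -v limit="$limit" '
        $5 == "Binlog Dump" { next }   # replication dump threads
        $5 == "Sleep"       { next }   # idle client connections
        ($6 + 0) >= limit   { print $1 "\t" $2 "\t" $5 "\t" $6 }
    '
}
```

One possible use is to pipe in the output of the mysql client run in batch mode without column names (mysql -B -N -e 'show processlist'). With event_scheduler on, the Daemon thread with Time 88270 would be printed; with it off, only Binlog Dump threads and the short-lived monitoring query remain, so nothing is printed.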

Problem 2:
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 10.16.24.108(10.16.24.108:3307)? (YES/no): yes
Fri May 27 15:19:14 2016 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [info] Checking MHA is not monitoring or doing failover..
Fri May 27 15:19:14 2016 - [info] Checking replication health on 10.16.24.107..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [info] Checking replication health on 10.16.24.109..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/ServerManager.pm, ln1218] 10.16.24.109 is not alive!
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/MasterRotate.pm, ln232] Failed to get new master!
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/sh/mha/mha_manager/bin/masterha_master_switch line 53.

Solution:
The no_master=1 setting for 10.16.24.109 in /apps/conf/mha/app1.cnf barred that host from being chosen as the new master. After no_master=1 was commented out, the online switch succeeded.
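
Before an online switch it can help to list which servers are excluded from promotion. The sketch below assumes the usual INI-style MHA configuration layout ([server1], [server2], ... sections with key=value lines); list_no_master_servers is a hypothetical helper name, and lines commented out with # are ignored.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every config section that sets
# no_master=1. Assumes an INI-style file with [section] headers.
list_no_master_servers() {
    awk '
        /^[ \t]*\[/ {                      # section header, e.g. [server3]
            section = $0
            sub(/^[ \t]*\[/, "", section)  # strip the leading "["
            sub(/\][ \t]*$/, "", section)  # strip the trailing "]"
            next
        }
        # Active (uncommented) no_master=1 line inside the current section
        /^[ \t]*no_master[ \t]*=[ \t]*1[ \t]*$/ { print section }
    ' "$1"
}
```

For the setup described here, the call would be list_no_master_servers /apps/conf/mha/app1.cnf; any section it prints cannot become the new master until the setting is removed or commented out.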

Problem 3:
Sat May 28 09:35:06 2016 - [info] Master configurations are as below:
Master 10.16.24.109(10.16.24.109:3307), replicating from 10.16.24.108(10.16.24.108:3307)
Master 10.16.24.108(10.16.24.108:3307), replicating from 10.16.24.109(10.16.24.109:3307), read-only

Sat May 28 09:35:06 2016 - [warning] SQL Thread is stopped(no error) on 10.16.24.108(10.16.24.108:3307)
Sat May 28 09:35:06 2016 - [error][/apps/lib/mha/mha_manager/MHA/ServerManager.pm, ln726] Slave 10.16.24.107(10.16.24.107:3307) replicates from 10.16.24.108:3307, but real master is 10.16.24.109(10.16.24.109:3307)!
Sat May 28 09:35:06 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/lib/mha/mha_manager/MHA/MasterRotate.pm line 85.

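Solution:
Set read_only to match the intended roles, writable on the master and read-only everywhere else:
On 10.16.24.108, run: set global read_only=off;
On 10.16.24.109, run: set global read_only=on;
On 10.16.24.107, run: set global read_only=on;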
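
The rule behind these three statements is that exactly one node, the intended master, is writable. A minimal sketch that prints the statement for each host, assuming the first argument is the intended master and the rest is the full host list (print_read_only_plan is a hypothetical helper; it only prints and does not connect to MySQL):

```shell
#!/bin/sh
# Hypothetical helper: print the read_only statement each host should run.
# $1 is the intended master; the remaining arguments are all hosts.
print_read_only_plan() {
    master="$1"
    shift
    for host in "$@"; do
        if [ "$host" = "$master" ]; then
            echo "$host: SET GLOBAL read_only=OFF;"
        else
            echo "$host: SET GLOBAL read_only=ON;"
        fi
    done
}
```

Calling print_read_only_plan 10.16.24.108 10.16.24.108 10.16.24.109 10.16.24.107 lists OFF for 10.16.24.108 and ON for the other two hosts, matching the fix above.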

Problem 4:
Sat May 28 10:00:32 2016 831853 Set read_only=0 on the new master.
Sat May 28 10:00:32 2016 832417Add vip 10.16.24.58 on eth1..
RTNETLINK answers: Operation not permitted
Solution:
As root, run on every node:
chmod u+s /sbin/ip
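
Setting the setuid bit makes ip run with root privileges for every local user, so it is worth confirming on each node that the bit is actually in place. A minimal sketch using the standard shell test -u; has_setuid is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: report whether a file has the setuid bit set.
has_setuid() {
    if [ -u "$1" ]; then
        echo "setuid set: $1"
    else
        echo "setuid not set: $1"
    fi
}
```

Running has_setuid /sbin/ip on each node shows whether the chmod has been applied there.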

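Problem 5:
After a manual online MHA switch, the VIP moved to the new master, but connecting through the VIP from the other hosts still reached the slave running on the local host. What causes this?
Solution:
Run drop_vip.sh on all slaves (this suggests the VIP was still bound on those hosts).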
