After one of the servers was brought back up, the other server would not connect. It showed "Slave_IO_Running: No" and "Seconds_Behind_Master: NULL", which means replication was not playing nicely.
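You can see these fields by checking replication status on the slave (a standard MySQL command; the exact set of output fields varies by version, and this requires a running slave):

```sql
-- Run on the slave. Look at Slave_IO_Running, Slave_SQL_Running,
-- Seconds_Behind_Master and Last_IO_Error in the output.
SHOW SLAVE STATUS\G
```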
First, I went to the MySQL log file, which on this server is found at /var/log/mysql.log, and looked at the last few messages by running "tail /var/log/mysql.log" from the command prompt. This revealed the error number (server_errno=1236). It also had the following info:
Got fatal error 1236: 'Client requested master to start replication from impossible position' from master when reading data from binary log
Just before this entry, the mysql.log indicates the file and position that the slave is trying to read. So, with this data in hand, I headed over to the master server and took a look at the bin logs, which are located in /var/lib/mysql/. There I examined the file in question using the mysqlbinlog utility, with the following command. Obviously, you'll have to replace the position and file name with those indicated in your mysql.log.
mysqlbinlog --offset=128694 mysql-bin.000013
And, this is what I saw here, among other things:
Warning: this binlog was not closed properly. Most probably mysqld crashed writing it.
Well, that explains things! When the server crashed the bin log was not closed properly. This is easy to fix. Going back to the slave server, I stopped the slave, reset the bin log position and started the server.
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_POS = 0;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000014';
START SLAVE;
I simply pointed the slave to the start of the next bin log. It started right up with no problem.
Update 5/21/13 - An anonymous commenter made a good point about potentially losing data using the above technique. If you are in a situation like mine (master-master replication with hundreds of gigs of non-critical data), this is really the only way of getting back up without significant downtime. But if you are in a master-slave configuration with a manageable data set, or it's critical that your slave doesn't miss any data, you should probably dump the master database and re-create the slave database to make sure you didn't miss anything.
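A rebuild along those lines might look like the following. This is only a sketch: it assumes a single database named `mydb` (a placeholder) and a live master and slave; adjust credentials, names, and options for your setup.

```shell
# On the master: dump the data along with the master's binlog
# coordinates (--master-data=2 records them as a commented-out
# CHANGE MASTER TO statement at the top of the dump).
mysqldump -u root -p --single-transaction --master-data=2 mydb > mydb.sql

# On the slave: stop replication and reload the data.
mysql -u root -p -e "STOP SLAVE;"
mysql -u root -p mydb < mydb.sql

# Find the recorded coordinates, point the slave at them with
# CHANGE MASTER TO, then run START SLAVE.
grep "CHANGE MASTER" mydb.sql
```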
15 comments:
Hey there
Thanks so much for this AWESOME article! You saved my life! :-)
nice!
Thank you so much!
Thank you Firelit,
This is a very nice explanation of the concept. I thought of doing the same when I found a difference between the slave status and the master status, but I didn't have the knowledge to perform this.
This doc cleared up the concepts.
Thank you!!
Excellent!
Thank you very much!
Greetings from Argentina!
Excellent perfect!!
Great post, thanks a lot.
Done
It works!
Thanks
Got fatal error 1236 from master when reading data from binary log:
'binlog truncated in the middle of event; consider out of disk space on master;
the first event 'mysql-bin.000792' at 608112413,
the last event read from './mysql-bin.000792' at 608112413,
the last byte read from './mysql-bin.000792' at 608112640.'
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE = 'mysql-bin.000793', MASTER_LOG_POS = 0;
Dunno if it is a possible problem, but when setting MASTER_LOG_POS = 0 before changing MASTER_LOG_FILE, the position seems to be interpreted relative to the old log file. When the log file is then changed afterwards, that adjusted position is used rather than the one initially set.
Thank you! It worked for me.
As a customer it works, but as a DBA I must say that this wouldn't work as it should; you should look in the binlog to see what's wrong.
edulanka, thank you! You made my day...
Error 1236 can have various reasons, but when the reason is 'binlog truncated in the middle of event', solution provided by you works perfectly.
Thanks from Brazil! Worked fine!
Hi, thanks so much for your post!