MySQL replication and failover with tungsten-replicator

Having administered MySQL in production for the last 7 years, I know how painful it is when a master goes down or when replication breaks. It’s also annoying when you rebuild a slave and it takes a really long time for replication to catch up. I decided to give tungsten-replicator a look, as it claims to be high performance and handles promoting a slave to master.

My setup:

—Two rackspace cloud servers running Ubuntu 12.04

—2GB RAM

—4 cores

—Percona server 5.5

—tungsten-replicator 2.0.5 (free version)

Installation

It took me a while (~4 hours) to satisfy all of the prerequisites for the OS and MySQL. This wasn’t helped by the mostly horrible documentation. All of the information was there, but it was scattered all over the place and it took several attempts to find all the pieces. It would have been a huge help to have a script that you can run to check your prerequisites, such as ruby version, java version, etc. Once I had satisfied all the prerequisites it was time to move on to the config.
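The kind of check I was wishing for could start out as small as the sketch below. It only tests whether each named command is on the PATH; it does not know which versions tungsten-replicator actually requires, and the function name check_prereqs is my own invention, not something that ships with the product.

```shell
# check_prereqs CMD...
# Report whether each named command can be found on PATH.
# Returns non-zero if at least one of them is missing.
check_prereqs() {
  local missing=0 cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok: $cmd"
    else
      echo "MISSING: $cmd"
      missing=1
    fi
  done
  return "$missing"
}
```

Running something like check_prereqs ruby java mysql on each host would at least tell you which pieces are absent before you start digging through the docs for version requirements.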

Configuration

The good news here is that tungsten-replicator comes with a scripted installation designed to install and configure all of your nodes with one command. The bad news, once again, is that the documentation is scattered all over the place and really incomplete. Another really annoying problem is getting to the help output of the commands: some require --help, some take help, and others have nothing at all.

In the open source version there are 2 types of configuration:

1. master-slave

2. direct, which supports multi-master

master-slave

To get master slave up I downloaded the release:

wget 'http://code.google.com/p/tungsten-replicator/downloads/detail?name=tungsten-replicator-2.0.5.tar.gz&can=2&q='

untarred it:

tar xzvf tungsten-replicator-2.0.5.tar.gz

I changed into the tungsten-replicator-2.0.5 directory and ran the following command:

./tools/tungsten-installer -v \
    --master-slave \
    --master-host=198.201.208.173 \
    --datasource-user=tungsten \
    --datasource-password=tungsten \
    --datasource-mysql-conf=/etc/mysql/my.cnf \
    --service-name=gabriel \
    --home-directory=/opt/tungsten \
    --cluster-hosts=198.201.208.173,198.201.206.221 \
    --start-and-report

This should install tungsten-replicator to /opt/tungsten, start the service on both nodes, and report back with its status. To check, create a database on the master; it should show up on the slave.
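That smoke test can be scripted. The following is a minimal sketch, assuming the mysql client can log in to both hosts without prompting (for example via credentials in ~/.my.cnf). The function name, the MYSQL and WAIT variables, and the repl_check_ database prefix are all made up for illustration.

```shell
MYSQL=${MYSQL:-mysql}   # client to call; overridable (hypothetical knob)
WAIT=${WAIT:-5}         # seconds to give the slave before checking (arbitrary)

# check_replication MASTER SLAVE
# Creates a throwaway database on MASTER, then looks for it on SLAVE.
check_replication() {
  local master=$1 slave=$2 db="repl_check_$$" rc=1
  "$MYSQL" -h "$master" -e "CREATE DATABASE $db" || return 1
  sleep "$WAIT"
  if "$MYSQL" -h "$slave" -N -e "SHOW DATABASES" | grep -qx "$db"; then
    echo "replicated: $db is visible on $slave"
    rc=0
  else
    echo "NOT replicated: $db is missing on $slave"
  fi
  # Clean up on the master; the drop should replicate as well.
  "$MYSQL" -h "$master" -e "DROP DATABASE $db"
  return "$rc"
}
```

With my hosts this would be called as check_replication 198.201.208.173 198.201.206.221.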

Replication Performance

Now that I had my master-slave setup up and running, I decided to test the replication speed. I took a 40MB dump file and loaded it onto the master. It took 6 seconds to load. What surprised me was how long it took to complete the load on the slave: the slave did not have the complete dump until 43 seconds after the master load completed. To eliminate the network as a bottleneck I scp’d the file from master to slave, and that took 6 seconds. My guess is that the process on the master that reformats the statements with global IDs is really inefficient.
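To put those timings side by side as throughput, here is a tiny helper (the name throughput is mine) that divides size by elapsed time. It uses only the numbers quoted above.

```shell
# throughput SIZE_MB SECONDS
# Print MB/s to two decimal places.
throughput() {
  awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

throughput 40 6    # load on the master (the scp took the same time): 6.67
throughput 40 43   # slave catching up after the master finished: 0.93
```

So the slave was applying the dump at roughly a seventh of the rate the master loaded it.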

Failover

I still haven’t figured out how to fail over automatically (good luck looking in the docs), but I did figure out how to manually promote the slave to master. The first thing you need to do is take replication offline. That doesn’t mean stopping the daemon, just pausing the replication process:

/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel offline

This needs to be run on both hosts.

To verify that both hosts have replication offline run the following:

/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl status
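If you would rather not eyeball the whole status listing, a small filter can pull out a single field. This is a sketch that assumes the status output prints one "name : value" pair per line; the sample text below is my reconstruction of that layout, not captured output, and status_field is a hypothetical helper name.

```shell
# status_field NAME
# Read status output on stdin and print the value of the field called NAME.
status_field() {
  sed -n "s/^$1[[:space:]]*:[[:space:]]*//p" | head -n 1
}

# Sample lines in the assumed layout (name, padding, colon, value).
sample='role                   : slave
state                  : OFFLINE:NORMAL'

printf '%s\n' "$sample" | status_field state
```

In practice you would pipe the real trepctl status output into status_field state or status_field role on each host.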

Now you’re ready. In my environment 198.201.208.173 is the current master and 198.201.206.221 the slave (you can see this in the output of the status command). The first step is to change the master to a slave:

/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel setrole -role slave -uri thl://198.201.206.221/

Run that on the current master.

Then turn the old slave into a master:

/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel setrole -role master -uri thl://198.201.206.221/

Run this on the current slave.

Now check your work with the status command. When it looks good, run this command on both hosts:

/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel online

Again check your work with the status command.
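The whole manual switch can be strung together as a script. This is only a sketch of the steps I ran by hand, not an official tool: the function names and the DRY_RUN switch are my own, it assumes root ssh access to both hosts, and it skips the status checks between steps that I would still do manually. By default it only prints the commands it would run.

```shell
TREPCTL=/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl
SERVICE=gabriel
DRY_RUN=${DRY_RUN:-1}   # hypothetical safety switch: 1 = print only, 0 = really run

# run_on HOST ARGS...
# Run trepctl with ARGS on HOST over ssh, or just print the command in dry-run mode.
run_on() {
  local host=$1
  shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "ssh root@$host $TREPCTL $*"
  else
    ssh "root@$host" "$TREPCTL $*"
  fi
}

# switch_roles OLD_MASTER NEW_MASTER
# Offline both hosts, swap the roles, then bring both back online.
switch_roles() {
  local old_master=$1 new_master=$2 h
  for h in "$old_master" "$new_master"; do
    run_on "$h" -service "$SERVICE" offline || return 1
  done
  run_on "$old_master" -service "$SERVICE" setrole -role slave -uri "thl://$new_master/" || return 1
  run_on "$new_master" -service "$SERVICE" setrole -role master -uri "thl://$new_master/" || return 1
  for h in "$old_master" "$new_master"; do
    run_on "$h" -service "$SERVICE" online || return 1
  done
}
```

Calling switch_roles 198.201.208.173 198.201.206.221 prints the six commands in order. Review them, then set DRY_RUN=0 to actually run them.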

Protip

Running commands on both hosts is easy, since during the host setup process you copied over ssh keys and authorized_keys files. I just used a bash for loop:

for H in 198.201.208.173 198.201.206.221; do ssh root@$H "/root/tungsten-replicator-2.0.5/tungsten-replicator/bin/trepctl status"; done

Conclusion

Given the replication lag, poor documentation, long setup, and the fact that it’s really early, I consider tungsten-replicator to have a lot of potential but to be mostly a science experiment for now. Of course, this is with the free version. Like I said in the beginning, I know how much it sucks when a master fails or when replication breaks. Tungsten-replicator has all of the features needed to solve all of these problems; it just doesn’t give me enough confidence to run it in my production environment... yet.

3 thoughts on “MySQL replication and failover with tungsten-replicator”

  1. Please notice that Tungsten Replicator does not support automatic failover. That is a feature of Tungsten Enterprise. (http://continuent.com/solutions/featurematrix)

     There is a cookbook about installation that should cover most of your needs: http://code.google.com/p/tungsten-replicator/wiki/TungstenReplicatorCookbook

     It's very strange your claim that it takes 43 seconds to transfer 40 MB of data.

     Here is what I did, with a vanilla installation of Tungsten on 4 hosts (called r1, r2, r3, and r4, with r1 as the master). I loaded the employees test database (http://launchpad.net/test-db) which is 160 MB. The salaries table alone, which is the last to be loaded, is 111 MB.

     mysql -h r1 < employees.sql ; sleep 1; for H in r1 r2 r3 r4 ; do mysql -h $H -e 'select count(*) from employees.salaries'; done

     INFO
     CREATING DATABASE STRUCTURE
     INFO
     storage engine: InnoDB
     INFO
     LOADING departments
     INFO
     LOADING employees
     INFO
     LOADING dept_emp
     INFO
     LOADING dept_manager
     INFO
     LOADING titles
     INFO
     LOADING salaries
     +----------+
     | count(*) |
     +----------+
     |  2844047 |
     +----------+
     +----------+
     | count(*) |
     +----------+
     |  2844047 |
     +----------+
     +----------+
     | count(*) |
     +----------+
     |  2844047 |
     +----------+
     +----------+
     | count(*) |
     +----------+
     |  2844047 |
     +----------+

     1 second after I loaded the data, all the records for the salaries table are already in all the servers,

  2. One more thing. You said “It would have been a huge help to have a script that you can run to check your prerequisites such as ruby version, java version etc…” This is what the installer does. And if you just want the validation without the actual installation, you can add this option to the installation command: --validate-only

  3. Thanks for the wonderful article! I have a query. As the number of bin log files increases, I want to know if I can archive the old bin log files. Will this affect Tungsten Replication? Please let me know. Thanks, Vinayak!