Having administered MySQL in production for the last 7 years, I know how painful it is when a master goes down or when replication breaks. It’s also annoying when you rebuild a slave and it takes a really long time for replication to catch up. I decided to give tungsten-replicator a look, as it claims to be high performance and to handle promoting a slave to master.
- Two Rackspace cloud servers running Ubuntu 12.04
- Percona Server 5.5
- tungsten-replicator 2.0.5 (free version)
It took me a while (about 4 hours) to satisfy all of the prerequisites for the OS and MySQL. This wasn’t helped by the mostly horrible documentation. All of the information was there, but it was scattered all over the place, and it took several attempts to find all the pieces. It would have been a huge help to have a script that you can run to check your prerequisites, such as the Ruby version, Java version, etc. Once I had satisfied all the prerequisites, it was time to move on to the config.
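The kind of check script I was wishing for could look roughly like the sketch below. The function names are mine, and the commands you pass in are your own reading of the prerequisites; this is not the official tungsten-replicator prerequisite list. It only tests that each command is on the PATH.

```shell
#!/bin/sh
# Hypothetical prerequisite checker. The command names passed in are examples,
# not the official tungsten-replicator prerequisite list.

# Report whether a single command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 ($(command -v "$1"))"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Check every command given as an argument; fail if any is missing.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        check_cmd "$cmd" || missing=$((missing + 1))
    done
    echo "$missing prerequisite(s) missing"
    [ "$missing" -eq 0 ]
}
```

After sourcing it, something like check_prereqs ruby java mysql ssh gives a quick yes/no per command. It does not compare versions; for those you still have to look at ruby --version and java -version by hand and compare against the docs.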
The good news here is that tungsten-replicator comes with a scripted installation designed to install and configure all of your nodes with one command. The bad news, once again, is that the documentation is scattered all over the place and really incomplete. Another really annoying problem is getting to the help output of the commands: some require --help, some help, and others have nothing at all.
In the open source version there are 2 types of configuration:
1. master-slave
2. direct, which supports multi-master
To get master-slave up, I downloaded the release and unpacked it:
tar xzvf tungsten-replicator-2.0.5.tar.gz
I then changed into tungsten-replicator-2.0.5 and ran the following command:
./tools/tungsten-installer -v --master-slave --master-host=188.8.131.52 --datasource-user=tungsten --datasource-password=tungsten --datasource-mysql-conf=/etc/mysql/my.cnf --service-name=gabriel --home-directory=/opt/tungsten --cluster-hosts=184.108.40.206,220.127.116.11 --start-and-report
This should install tungsten-replicator to /opt/tungsten, start the service on both nodes, and report back with its status. To check, create a database on the master; it should show up on the slave.
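That check can be scripted. The sketch below assumes the mysql client can log in to both hosts without prompting (for example through ~/.my.cnf). The function name, the throwaway database name, and the retry settings are made up for illustration.

```shell
#!/bin/sh
# Sketch of a replication smoke test. Assumes the mysql client can log in to
# both hosts without prompting (for example via ~/.my.cnf); the database name
# and retry settings are made up for this example.

SMOKE_TRIES=${SMOKE_TRIES:-10}   # how many times to poll the slave
SMOKE_DELAY=${SMOKE_DELAY:-1}    # seconds between polls

# replication_smoke_test MASTER SLAVE [DB]
replication_smoke_test() {
    master="$1"
    slave="$2"
    db="${3:-tungsten_smoke_test}"

    # Create a throwaway database on the master.
    mysql -h "$master" -e "CREATE DATABASE IF NOT EXISTS $db" || return 1

    # Poll the slave until the database shows up or we run out of tries.
    try=1
    while [ "$try" -le "$SMOKE_TRIES" ]; do
        if mysql -N -h "$slave" -e "SHOW DATABASES" | grep -qx "$db"; then
            echo "$db is visible on $slave (poll $try)"
            return 0
        fi
        sleep "$SMOKE_DELAY"
        try=$((try + 1))
    done
    echo "$db did not reach $slave after $SMOKE_TRIES polls"
    return 1
}
```

Call it as replication_smoke_test MASTER_IP SLAVE_IP. It leaves the test database behind, so drop it on the master afterwards; the drop should replicate as well.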
Now that I had my master-slave setup up and running, I decided to test the replication speed. I took a 40MB dump file and loaded it onto the master. It took 6 seconds to load. What surprised me was how long it took to complete the load on the slave: 43 seconds, counted from the moment the master load completed. To eliminate the network as a bottleneck, I scp’d the file from master to slave, and that took 6 seconds. My guess is that the process on the master that reformats the statements with global IDs is really inefficient.
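To put those timings side by side, here is the arithmetic as rough throughput numbers. It assumes the slave applied the whole 40MB inside the 43 second window; if it had already started applying while the master was still loading, the real rate is even lower.

```shell
#!/bin/sh
# Back-of-the-envelope throughput from the timings in this post.
# rate MEGABYTES SECONDS -> MB/s with two decimals
rate() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

echo "master load:    $(rate 40 6) MB/s"
echo "scp transfer:   $(rate 40 6) MB/s"
echo "slave catch-up: $(rate 40 43) MB/s"
```

That works out to about 6.67 MB/s for both the master load and the raw copy, against about 0.93 MB/s for replication, roughly 7 times slower.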
I still haven’t figured out how to fail over automatically (good luck looking in the docs), but I did figure out how to manually promote the slave to master. The first thing you need to do is take replication offline. That doesn’t mean stopping the daemon, just pausing the replication process:
/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel offline
This needs to be run on both hosts.
To verify that both hosts have replication offline, run the status command (trepctl status) on each host and confirm that replication is reported as offline.
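One way to do that check without reading the whole status output is to pull out a single field. The helper below is a sketch: status_field is my own name, and it assumes that trepctl status prints "name : value" lines, including one whose name is state.

```shell
#!/bin/sh
# Sketch: extract one field from "trepctl status" output read on stdin.
# ASSUMPTION: the output consists of "name : value" lines, one of which
# is named "state" and holds the replicator state.
status_field() {
    awk -F' *: ' -v key="$1" '$1 == key { print $2; exit }'
}
```

Piping the status output of each host through status_field state should then show one short line per host. I would expect a value starting with OFFLINE at this point, though the exact wording is an assumption on my part.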
Now you’re ready. In my environment 18.104.22.168 is the current master and 22.214.171.124 the slave (you can see this in the output of the status command). The first step is to change the master to a slave:
/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel setrole -role slave -uri thl://126.96.36.199/
Run that on the current master.
Then turn the old slave into a master:
/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel setrole -role master -uri thl://188.8.131.52/
Run this on the current slave.
Now check your work with the status command. When it looks good, run this command on both hosts:
/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl -service gabriel online
Again check your work with the status command.
Running commands on both hosts is easy, since the host setup process has you copy over ssh keys and authorized_keys files. I just used a bash for loop:
for H in 184.108.40.206 220.127.116.11; do ssh root@$H "/root/tungsten-replicator-2.0.5/tungsten-replicator/bin/trepctl status"; done
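The manual promotion steps above can be collected into one small script. This is a sketch, not something I would trust blindly in production. It reuses the trepctl path and service name from this post and assumes root ssh access to both hosts. The function names and the DRY_RUN switch are my own. One more assumption: the post passes -uri in both setrole calls, and this sketch uses the new master’s address in both.

```shell
#!/bin/sh
# Sketch of the manual role switch. Path and service name are the ones used
# in this post; root ssh access to both hosts is assumed.
TREPCTL=/opt/tungsten/tungsten/tungsten-replicator/bin/trepctl
SERVICE=gabriel
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = run them over ssh

# Run (or print) one trepctl subcommand on one host.
remote_trepctl() {
    host="$1"
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "ssh root@$host $TREPCTL -service $SERVICE $*"
    else
        ssh "root@$host" "$TREPCTL -service $SERVICE $*"
    fi
}

# switch_roles OLD_MASTER NEW_MASTER: offline both, swap roles, show status.
switch_roles() {
    old_master="$1"
    new_master="$2"
    for h in "$old_master" "$new_master"; do
        remote_trepctl "$h" offline || return 1
    done
    # ASSUMPTION: both calls point at the new master's THL address.
    remote_trepctl "$old_master" setrole -role slave -uri "thl://$new_master/" || return 1
    remote_trepctl "$new_master" setrole -role master -uri "thl://$new_master/" || return 1
    for h in "$old_master" "$new_master"; do
        remote_trepctl "$h" status || return 1
    done
}

# bring_online HOST...: run this once the status output looks right.
bring_online() {
    for h in "$@"; do
        remote_trepctl "$h" online || return 1
    done
}
```

With DRY_RUN left at 1, switch_roles OLD_MASTER NEW_MASTER only prints the ssh commands it would run, which makes it easy to compare them with the manual steps. Putting online in a separate function keeps the "check your work" pause between the role change and going back online.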
Given the replication lag, poor documentation, long setup, and the fact that it’s really early, I consider tungsten-replicator to have a lot of potential, but it is mostly a science experiment for now. Of course, this is with the free version. Like I said in the beginning, I know how much it sucks when a master fails or when replication breaks. Tungsten-replicator has all of the features needed to solve these problems; it just doesn’t give me enough confidence to run it in my production environment... yet.