centralized custom logging with rsyslog

There are far more robust centralized logging solutions out there, but during a recent hack day I had about an hour to get logs from 24 servers onto a central one for processing.

First you will need to get centralized logging set up. I won’t go into that here, as that step is well documented, but the quick version is:

1. Configure the central server to listen on UDP.

2. Add this line to the bottom of the rsyslog.conf file on each server that sends logs:

*.* @$IP_ADDRESS:514

Okay, so your servers are now sending their system logs to a centralized host. Now the fun part.

First you will need to create a config file for the log and place it in the /etc/rsyslog.d directory.

In my case I am going to ship a log named event.log:

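# Load the file input module (skip this if it is already loaded elsewhere).
$ModLoad imfile
$InputFileName /srv/whi/shared/log/event.log
$InputFileTag eventlog
$InputFileStateFile eventlog
# Per-file settings must come before $InputRunFileMonitor, which activates the monitor.
$InputFilePersistStateInterval 10
$InputRunFileMonitor
# Forward everything to the central host over UDP (a single @ means UDP).
*.* @$IP_ADDRESS:514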

Now add a config on the central server to write the log to its own file:

$template ProxiesTemplate,"%msg%\n"
if $programname == 'eventlog' and $msg contains 'viewed.entr'  then /var/log/eventlog.log;ProxiesTemplate

The first line defines a template that strips the timestamp and the hostname from each log line, since I am only interested in the body.
The second line matches the program name and a particular string in the body of the message, and writes the matching messages to a specific log file.
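
Before deploying the filter, it can help to check which lines of a log would survive it. The sketch below approximates the `$msg contains 'viewed.entr'` test with a fixed-string grep, since rsyslog's `contains` is a plain substring match rather than a regex. The `would_keep` name and the sample lines are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints only the lines the central filter would keep.
# grep -F treats the pattern as a literal string, like rsyslog's "contains".
would_keep() {
    grep -F 'viewed.entr'
}

# Made-up sample lines, not real event.log content.
printf '%s\n' \
    'user=1 action=viewed.entry id=42' \
    'user=2 action=hearted.entry id=42' \
    'user=3 action=viewedXentry id=7' | would_keep
```

Only the first sample line contains the literal string, so it is the only one printed.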

The last thing you need to do is put in an exception for the existing messages and syslog files, otherwise these custom logs will also end up there.

On Ubuntu I had to edit the file /etc/rsyslog.d/50-default.conf.

I basically had to add the string “event.none” to the lines for syslog and messages:

*.*;auth,authpriv.none,event.none    -/var/log/syslog

*.=info;*.=notice;*.=warn;auth,authpriv.none;cron,daemon.none;mail,news.none,event.none    -/var/log/messages

That’s basically it, enjoy.

if you attempt to run sudo pkill you might go insane

<rant>
Lately sidekiq has been leaving a lot of processes around in the stopping state.
My coworker asked me if there was a command to kill them all.
Of course with linux there are a lot of tools to do this.
In this case, however, we had to kill the processes based on their long listing,
i.e. instead of killall ruby we had to match an argument to the process.
Example:

sidekiq 2.17.3 whi [0 of 6 busy] stopping

The key was matching “stopping”, which would leave the other sidekiq processes running.

I used:

pkill -f stopping

and this worked perfectly the first time.
Then I go to another box and it won’t work.
The command sudo pkill -f stopping does nothing: no error, but the processes don’t die.
I upgrade the package, read the man page, search the internet.
Still nothing.
Am I going insane? Did I forget everything I know about linux?

Then I become root and run the command... it works.
The difference was sudo vs. being root.
Of course when you sudo the command there is no warning and the man page doesn’t contain the word sudo.

So word to the wise, when in doubt become root!

</rant>
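
One likely explanation (an assumption, not something verified on the boxes above): `pkill -f` matches against the full command line, and the command line of `sudo pkill -f stopping` itself contains the word "stopping". pkill skips itself but not its parent sudo process, so sudo can end up on the kill list too. A common workaround is to write the pattern as `[s]topping`, which still matches "stopping" but no longer matches its own text. The sketch below only demonstrates the matching with grep -E; the `matches` helper is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the extended regex $2 matches command line $1.
# pkill -f likewise matches an extended regex against the full command line.
matches() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

# The plain pattern also matches the sudo command line itself.
matches 'sudo pkill -f stopping' 'stopping' && echo "plain pattern hits sudo"

# The bracketed pattern still matches the sidekiq process...
matches 'sidekiq 2.17.3 whi [0 of 6 busy] stopping' '[s]topping' && echo "bracket pattern hits sidekiq"

# ...but not a command line that merely contains the pattern text.
matches 'sudo pkill -f [s]topping' '[s]topping' || echo "bracket pattern skips sudo"
```

If this is indeed the cause, `sudo pkill -f '[s]topping'` should behave the same with or without sudo.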

custom ohai plugins for bonded interfaces

We use bonded interfaces on our production hardware.
But only on production hardware; staging and dev just use the ethX interfaces.
So we needed a way for chef to identify the public & private interfaces regardless of whether they are bonded or not.

To start with I pulled down the ohai cookbook and added a few scripts to the plugins directory.
That’s all there is to it.
The two plugins below identify the private interface as bond0 if it exists, otherwise eth0, and the public interface as bond1 if it exists, otherwise eth1.

private_interface.rb

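provides "private_interface"
# Probe for bond0; ifconfig exits non-zero when the interface does not exist.
# Output is discarded so the probe does not clutter the ohai run.
cmd = '/sbin/ifconfig bond0 > /dev/null 2>&1'
if system(cmd)
  private_interface "bond0"
else
  private_interface "eth0"
end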

public_interface.rb

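provides "public_interface"
# Probe for bond1; ifconfig exits non-zero when the interface does not exist.
# Output is discarded so the probe does not clutter the ohai run.
cmd = '/sbin/ifconfig bond1 > /dev/null 2>&1'
if system(cmd)
  public_interface "bond1"
else
  public_interface "eth1"
end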
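
A quick way to sanity-check the fallback logic outside of ohai is a small shell function. This is only a sketch: `pick_iface` is a hypothetical name, and it checks /sys/class/net on Linux rather than calling ifconfig as the plugins do.

```shell
#!/bin/bash
# Hypothetical helper: print the bonded interface if it exists, else the fallback.
# Assumes Linux, where every network interface has an entry in /sys/class/net.
pick_iface() {
    local bonded=$1 fallback=$2
    if [ -e "/sys/class/net/$bonded" ]; then
        echo "$bonded"
    else
        echo "$fallback"
    fi
}

echo "private_interface: $(pick_iface bond0 eth0)"
echo "public_interface:  $(pick_iface bond1 eth1)"
```

On a staging or dev box without bonding this should print eth0 and eth1, matching what the plugins report.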

From the chef ui you can see that public_interface and private_interface are now listed on the top level for a node.

This lets a template or recipe refer to the public or private interface by name, while ohai automatically discovers what the interface actually is.
Example from a recipe for ufw:

firewall_rule "http-internal" do
        port 8098
        action :allow
        interface node['private_interface']
        notifies :enable, "firewall[ufw]"
end

automated MySQL query reports

Back in the day I had automated query reports for MySQL using a perl library.
This worked okay, but it only reported on the slow queries, and I had to install a bunch of icky perl stuff.

Percona’s pt-query-digest is a much better tool, and when you combine it with tcpdump you get an analysis of all your queries, not just the slow ones.

When writing this script I had to solve two problems.

1. How to run tcpdump for a specific amount of time.

I was prepared to write a loop with a sleep statement and then figure out how to kill tcpdump but I didn’t need to.
Instead I just used timeout which was already installed on ubuntu.

2. How to email the resulting report as an attachment.

When sending emails I usually just use mail, but I couldn’t figure out how to send an attachment with it.
Instead I found mutt.
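
For the first problem, timeout (part of GNU coreutils) runs a command and sends it SIGTERM once the given number of seconds has passed. A small sketch, using sleep as a stand-in for tcpdump:

```shell
#!/bin/bash
# sleep stands in for tcpdump here; it would run for 30 seconds if left alone.
timeout 1 sleep 30
status=$?

# GNU timeout exits with status 124 when it had to stop the command.
echo "exit status: $status"
```

The 124 exit status is a handy way to tell "stopped by timeout" apart from a command that failed on its own.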

BTW, for an extra challenge I decided to write this in bash. For the record, loops in bash are really ugly.

The script:

#!/bin/bash
hn=$(hostname)
# mutt won't send the mail from the command line without prompting you
# for the body's content. To work around this I use an empty file as the body.
touch /tmp/blank
queries=( SELECT INSERT UPDATE )
for i in "${queries[@]}"
do
        # Capture MySQL traffic for 180 seconds, keep the lines containing the
        # query type, and strip the packet junk in front of it with sed.
        # The sed expression is double-quoted so that $i is expanded.
        /usr/bin/timeout 180 /usr/sbin/tcpdump -s0 -A -i bond0 dst port 3306 | /usr/bin/strings | /bin/grep "$i" | /bin/sed "s/^.*$i/$i/" > "/tmp/$i.log"
        /usr/bin/pt-query-digest --type rawlog "/tmp/$i.log" > "/tmp/$i.txt"
        /usr/bin/mutt -s "$i report from $hn" xxxx@weheartit.com -a "/tmp/$i.txt" < /tmp/blank
done
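
To see what the sed step does, here is the cleanup applied to a single line. The sample text is invented; real tcpdump output will look different. Note that the sed expression needs double quotes so the shell expands $i.

```shell
#!/bin/bash
i=SELECT
# Invented sample line: packet noise followed by a query.
line='E..4..@.@....SELECT id FROM entries WHERE user_id = 7'

# Drop everything before the query keyword, keeping the keyword itself.
cleaned=$(printf '%s\n' "$line" | sed "s/^.*$i/$i/")
echo "$cleaned"
```

This prints the query starting at SELECT, which is the form pt-query-digest --type rawlog expects to read.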

Speed up your backups by compressing and posting in parallel

It’s pretty typical to have data stores that are several hundred GB in size and need to be posted offsite.

At weheartit our database is ~1/2 TB uncompressed and the old method of compressing and posting to S3 took 9 hours and rarely completed.

I was able to speed up this process and now it completes in < 1 hour.
53 minutes in fact.

For compression I used pbzip2 instead of gzip.

This is how I am using it along with percona’s xtrabackup.

innobackupex --user root --password $PASSWORD --slave-info --safe-slave-backup --stream=tar ./ | pbzip2 -f -p40 -c  > $BACKUPDIR/$FILENAME

The backup and compression together take only 32 minutes and shrink the data from 432GB to 180GB.

Next comes speeding up the transfer to S3.

In November of 2010 Amazon added multipart upload to S3, but for some reason this functionality hasn’t been added to s3cmd.
Instead I am using s3 multipart upload (s3-mp-upload.py).
Thanks David Arther!

This is how I am using it.

/usr/local/bin/s3-mp-upload.py --num-processes 40 -s 250 $BACKUPDIR/$FILENAME $BUCKET/$(hostname)/$TIME/$FILENAME

It only takes 20 minutes to copy 180GB over the internet!
That is crazy fast.
You can play around with the number of threads or processes for both pbzip2 and s3 multipart upload. The numbers I use work for me, but the right values depend on the size of your system.
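
To put those timings in perspective, the average rates work out as follows. This is back-of-the-envelope arithmetic, assuming 1GB = 1000MB and using the rounded sizes and times quoted above; `rate_mb_s` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: average rate in MB/s for a size in GB and a time in minutes.
# Integer arithmetic, assuming 1 GB = 1000 MB.
rate_mb_s() {
    local gb=$1 minutes=$2
    echo $(( gb * 1000 / (minutes * 60) ))
}

echo "backup + compression: $(rate_mb_s 432 32) MB/s of input"
echo "upload to S3:         $(rate_mb_s 180 20) MB/s"
```

That is roughly 225 MB/s read and compressed, and roughly 150 MB/s pushed to S3.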

mysql multi threaded slaves (mts) slower than single threaded

I work @ weheartit.com where we rely on MySQL.
I’ve seen very little published about mts and nothing from outside a lab so I decided to test it out.
The results weren’t good.

Our main database group has 4 active schemas and is running 5.6.12, and when a slave gets out of sync it takes a while to catch back up to the master.

One of the most interesting features of MySQL 5.6 is multi threaded slaves.

Without this feature the sync speed is limited to a single thread running on a single core.

Before I start, let me clear up one point about mts: this feature will only help if you are running more than 1 schema per host, as each thread can only process one schema at a time.

That being said, I went and upgraded one of my slaves to 5.6.12 and restored an xtrabackup to it.

Then I added the following lines to the my.cnf and ran start slave.


binlog-format = STATEMENT
slave_parallel_workers = 4
master_info_repository = TABLE
relay_log_info_repository = TABLE

Now I can just run show slave status\G and watch it catching up.
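
If you only care about the lag, the Seconds_Behind_Master field can be pulled out of that output. A small sketch: `lag_seconds` is a hypothetical helper, and the sample text is a trimmed, made-up stand-in for real slave status output.

```shell
#!/bin/bash
# Hypothetical helper: read "show slave status\G" output on stdin and
# print the value of Seconds_Behind_Master.
lag_seconds() {
    awk -F': ' '/Seconds_Behind_Master/ { print $2 }'
}

# Made-up, trimmed sample; on a real slave you would pipe in the output of
# the mysql client instead.
sample='             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 1200'

printf '%s\n' "$sample" | lag_seconds
```

Running this in a loop with a sleep gives a simple view of how quickly each slave is catching up.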

However, once it had caught up, I stopped replication on both the mts host and a single-threaded slave for 20 minutes.
Then I started the slaves, and it turned out that the single-threaded slave caught up faster.
What?
To rule out disk and RAID configs (they were the same as far as I could tell), the next time I only stopped the sql_thread for 20 minutes.
Same results: the slave running mts is actually slower.

It looks like there is a reason the only posts you find on this topic are from people using it in a lab: although it appears to function, it doesn’t deliver what it ultimately needs to, which is faster replication syncs.

I’ll keep watching the mysql releases and hope this gets fixed soon.

Benchmark: Rackspace’s block storage SATA vs. SSD vs. VM disk

Rackspace offers 2 types of block storage:

Standard (SATA) @ $0.15/GB
High-Performance (SSD) @ $0.70/GB

Seeing that the SSD storage is more than 4x the cost of SATA, I decided to see if the performance is also 4x.

Let’s see.

The setup:

An 8GB (RAM) system running Ubuntu 12.04 and two 100GB volumes with the xfs file system, mounted with the following options:

/dev/xvdb   /fast     xfs noatime,nodiratime,allocsize=512m   0   0
/dev/xvdd   /slow     xfs noatime,nodiratime,allocsize=512m   0   0

Basic test using dd:

I’ve benchmarked lots of storage systems in the past and I always like to start out with dd. I do this because it doesn’t take any time to set up and should give you some idea of how the storage performs.

In this test I create a 20GB file on each mounted filesystem using the following command (20k blocks of 1M each = 20GB):

dd if=/dev/zero of=20GB.file bs=1M count=20k

The results are a little surprising:

Volume write performance:

standard                    105 MB/s
high-performance            103 MB/s
the host's own volume       135 MB/s

Wow, not what we were hoping for. I ran this test several times and the “high-performance” storage was always the slowest. To quote Homer Simpson, “Doh!!”

bonnie++

I ran bonnie++ with the following args. Basically I specified double the amount of RAM as the test size.

bonnie++ -s 16g

For sequential reads and writes the two block storage types were about the same, which is expected since dd already showed this:

Volume                sequential reads            sequential writes
standard              95981/sec                   16564/sec
high-performance      95927/sec                   15633/sec
localVM               108463/sec                  1208/sec

The next results show where the high-performance storage excels, which is random seeks:

Volume                random seeks
standard              473.4/sec
high-performance      1969/sec
localVM               578.6/sec
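
Comparing the ratios directly is simple arithmetic on the figures above (awk is used because bash has no floating point). The price ratio assumes the standard volume costs $0.15/GB; `ratio` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: ratio of two numbers, printed with two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

echo "random seeks, high-performance vs standard: $(ratio 1969 473.4)x"
echo "price per GB, high-performance vs standard:  $(ratio 0.70 0.15)x"
```

So random seeks improve by a bit over 4x (4.16x) for a bit under 5x the price (4.67x).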

Conclusion

The question was: Does the 4x cost of high-performance storage perform 4x?

The answer is yes.

Nice job rackspace.

However, as the sequential numbers above show, it doesn’t always outperform standard or local disk. So before you decide to use the more expensive option, benchmark your application on it.