We use the Mongo NoSQL database for some of our projects and normally deploy it as a replication set for redundancy. In a replication cluster the secondary nodes can begin to lag behind the primary node depending on the workload. We wanted a quick way of checking the current replication lag on a setup.

There are two Mongo shell commands that are useful here. The first is db.printSlaveReplicationInfo() which returns an output like:

MongoDB shell version: 2.4.10
connecting to: test
source:   000.000.000.000:27017
	 syncedTo: Thu May 08 2014 10:20:41 GMT+0000 (UTC)
		 = 1519 secs ago (0.42hrs)
source:   000.000.000.000:27017
	 syncedTo: Thu May 08 2014 10:20:41 GMT+0000 (UTC)
		 = 1519 secs ago (0.42hrs)

That gives us a date and time for each secondary in the replication set when the last command was synchronised. So in my example above nothing has been synchronised for 1519 seconds. Now depending on your application that could be something to worry about. If the last action was that long ago then we’re all up to date, if the last action was ten minutes ago then we’ve got some serious lag to worry about.

So really we need to know when the last event in the oplog occurred. We can do that with the db.printReplicationInfo() mongo shell commmand which returns an output like:

MongoDB shell version: 2.4.10
connecting to: test
configured oplog size:   46479.3125MB
log length start to end: 6890143secs (1913.93hrs)
oplog first event time:  Mon Feb 17 2014 16:24:58 GMT+0000 (UTC)
oplog last event time:   Thu May 08 2014 10:20:41 GMT+0000 (UTC)
now:                     Thu May 08 2014 10:41:30 GMT+0000 (UTC)

So that tells us the date and time of the last oplog command. We can compare the dates and times and see that we don’t have any lag in this example (both date/times are Thu May 08 2014 10:20:41 GMT+0000). All very nice, but a bit of a hassle to do all of that every time.

So I wrote a very quick shell script to run those commands and work out the lag in seconds. Given I was only really interested in the worst lag I only actually output that from the script. So the script itself:

#!/bin/sh
#
# Simple script (FreeBSD only because of date command) to check the replication lag
# on a mongo replication cluster running on the current machine and to report the
# maximum lag (ie node with the most lag) as the exit code of the script.  So with
# zero lag the exit code will be 0, 5 seconds of lag the exit code will be 5, etc
#
last_oplog_date=`/usr/local/bin/mongo --eval "db.printReplicationInfo()" | grep 'oplog last event time' | awk 'BEGIN { FS = "time:" } ; { print $2 }'`
last_oplog_timestamp=`date -j -f ' %a %b %d %Y %H:%M:%S %Z%z (%Z)' "$last_oplog_date" +%s`
lag=0
IFS="
"
for slave_replication_date in `/usr/local/bin/mongo --eval "db.printSlaveReplicationInfo()" | grep 'syncedTo' | awk 'BEGIN { FS = "syncedTo:" } ; { print $2 }'`
do
  slave_replication_timestamp=`date -j -f ' %a %b %d %Y %H:%M:%S %Z%z (%Z)' "$slave_replication_date" +%s`
  lag_seconds=`expr $last_oplog_timestamp - $slave_replication_timestamp`
  if [ $lag_seconds -gt $lag ]; then
    lag=$lag_seconds
  fi
done
exit $lag

Now the first thing to note is the date time parsing is fairly fragile. Converting a date like Thu May 08 2014 10:20:41 GMT+0000 to a UNIX timestamp varies from OS to OS, so if you’re converting this for a system other than FreeBSD check your man date to see what switches to use.

So this script starts by grabbing the last oplog event date/time and converting it into a UNIX timestamp (seconds). It then iterates over the last synchronised date/times for each secondary server converting those date/times to UNIX timestamps. We then simply subtract one from the other to get the seconds of lag and if this is higher than the last iteration we store that. Finally we exit and use the seconds of lag as the exit code.

So I can use this script and just test the exit code to see if I have any replication lag in my system. Given the various primary and secondary mongo instances in my replication sets tend to be connected by fast networks I wouldn’t expect much lag, so this works well for me. If you were expecting some seconds of lag to be normal then either modify in a threshold for the output, or perhaps just echo the seconds back to stdout.