Re-activated my performance monitoring with Rackspace

In my last blog post about my Rackspace monitoring solution I described why I deactivated the performance tracking: I measured the wire and not my blogs performance.

I was totally surprised that my post got an answer to that problem in three comments one and two days later. And those comments came from Rackspace employees. I never thought about contacting them about this. First of all, this was not on my high priority list and second, since I’m only using the cloud monitoring stuff for a dollar and a half per month, I never thought about bothering them with a request about that. I wanted to dig into it myself later, when time was right for it.

I knew that Rackspace is advertising with ‘fanatical’ support – but I never would have thought that they would scan random blogs for potential issues and support there. I am totally suprised and I think I already became a fan.

Now, the idea from Justin was to set the consistency level of the check from the default value QUORUM to ALL. In my case this changed the alerting behaviour from “Oh, two of the three zones are slow, let’s warn him.” to, “Well, both US zones are slow, but the EU zone London still performs good, let’s wait with an alert until that gets slow too”.

The other idea was to consecutiveCount to a larger value, so that an alert is only sent if the check fails for X consecutive times. I have this in mind, but for now the consistency level is totally enough for me to re-activate the check. So this is my new performance check I activated a day later:

:set consistencyLevel=ALL

if (metric['duration'] > 2500) {
  return CRITICAL, "HTTP request took more than 2.5 seconds, it took #{duration} milliseconds."
} 

if (metric['duration'] > 1800) { 
  return WARNING, "HTTP request took more than 1.8 seconds, it took #{duration} milliseconds."
}

return OK, "Overall performance is okay: #{duration} milliseconds."

With this in place I now get alerted whenever London thinks my Blog is slow, and it gives me response times from all three zones in that alerting mail. This check is just running for two weeks now, and alerted me two times. Both times the alert went green the check after, so the consecutive count would have prevented those mailings, but I think this ‘spam’ rate is still okay and I don’t see the need to fix this.

Actually, I’m pretty happy about those mails once in a while, because I feel that my monitoring is in place and working.

So the bottom line is: Rackspace support really, really rocks. And the monitoring can be tweaked to meet my needs.

Author: Sebastian Gingter

Software Engineer at Thinktecture, Fulltime geek, loving father & husband, always learning.