A small update to my Rackspace Cloud Monitoring configuration

3 Replies to “A small update to my Rackspace Cloud Monitoring configuration”

Jordan Evans says:

2013-07-10 at 23:38

Check thresholds can be difficult; we constantly see network blips while monitoring Rackspace Cloud Monitoring itself! If you want some help tuning the performance (or other) checks, feel free to reach out to me by email at the address I entered. We would love to help!

Disclaimer: I work for Rackspace on the Cloud Monitoring Team.

Justin Gallardo says:

2013-07-11 at 01:59

As you’ve discovered, Cloud Monitoring doesn’t provide any way to alert on metrics from a single monitoring zone while the check is being run in multiple zones. There are a couple of things you could try, though, to keep the noise-to-signal ratio down.

Everything I mention can be found in the alarm language reference, if you are interested in reading more.

The first thing I would consider is setting the alarm consistency level. This defaults to QUORUM, which requires that a majority of your monitoring zones agree on the state of the alarm. The majority is calculated as N / 2 + 1 (using integer division), where N is the number of monitoring zones configured on the check. You can set this to ALL instead, which requires *every* monitoring zone to agree on the state. That way you’d only get alerted if London *and* the US monitoring zones detect a slow HTTP response. Be sure to read about the pros and cons of the various consistency levels in the reference guide I mentioned above. See an example of this below:
:set consecutiveLevel=ALL
if (metric['duration'] > 20000) {
  return new AlarmStatus(CRITICAL, "Things are slow!");
}
return new AlarmStatus(OK, "Things are good");
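
To make the QUORUM arithmetic concrete, here is a minimal Python sketch of how the two consistency levels described above differ. It is only an illustration, not Cloud Monitoring's implementation: the function names `zones_required` and `agreed_state` are hypothetical, and returning `None` when the zones disagree is an assumption made for this sketch.

```python
from collections import Counter


def zones_required(n_zones: int, level: str = "QUORUM") -> int:
    """Number of zones that must agree before a state is accepted (illustrative)."""
    if level == "ALL":
        return n_zones
    if level == "QUORUM":
        # N / 2 + 1 with integer division, i.e. a strict majority
        return n_zones // 2 + 1
    raise ValueError(f"unsupported level: {level}")


def agreed_state(zone_states, level="QUORUM"):
    """Return the state that enough zones agree on, or None (assumption) if none does."""
    if not zone_states:
        return None
    state, votes = Counter(zone_states).most_common(1)[0]
    if votes >= zones_required(len(zone_states), level):
        return state
    return None


# Hypothetical check running in three zones: one sees a slow response, two do not.
states = ["CRITICAL", "OK", "OK"]
print(agreed_state(states, "QUORUM"))  # OK
print(agreed_state(states, "ALL"))     # None
```

With three zones, QUORUM needs two matching results, so a single slow zone is outvoted. With ALL, no state is accepted until every zone reports the same thing.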

The other thing you can think about is setting the ‘consecutive count’ on the alarm. This requires the state to be evaluated x times consecutively before the alarm is triggered. The example below would require a QUORUM of monitoring zones to evaluate the same alarm status in 3 consecutive polling windows.
:set consecutiveCount=3
if (metric['duration'] > 20000) {
  return new AlarmStatus(CRITICAL, "Things are slow!");
}
return new AlarmStatus(OK, "Things are good");
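
The effect of a consecutive count can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the service's actual logic: `first_trigger` is a hypothetical helper, and each list entry stands for the state the zones agreed on in one polling window.

```python
def first_trigger(window_states, consecutive_count=3, target="CRITICAL"):
    """Return the index of the polling window where the alarm would fire, or None."""
    run = 0  # length of the current unbroken run of `target`
    for i, state in enumerate(window_states):
        run = run + 1 if state == target else 0
        if run >= consecutive_count:
            return i
    return None


# A single blip in window 0 is ignored; the alarm fires only after
# three CRITICAL windows in a row (windows 2, 3 and 4).
windows = ["CRITICAL", "OK", "CRITICAL", "CRITICAL", "CRITICAL"]
print(first_trigger(windows))  # 4
```

The trade-off is latency: a higher count filters out more one-off blips, but a real outage takes that many polling windows to be reported.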

If you have any other questions, please don’t hesitate to email me (justin.gallardo at rackspace.com).

Happy monitoring!

1. Justin Gallardo says:
  
  2013-07-11 at 02:09
  
  Doh, I just noticed a typo. The first alarm example should look more like:
  :set consistencyLevel=ALL
  if (metric['duration'] > 20000) {
    return new AlarmStatus(CRITICAL, "Things are slow!");
  }
  return new AlarmStatus(OK, "Things are good");
  
  I had swapped ‘consistencyLevel’ for ‘consecutiveLevel’.
  
  Cheers!

Comments are closed.
