Tag-Archive for » monitoring «

Monday, September 27th, 2010 | Author: eth0

Basing more and more stuff on mcollective means relying more and more on one of its underlying components: the ActiveMQ middleware, and more precisely its STOMP connector. I hit a weird bug a few days ago and realized that I was not functionally monitoring this part of the system. The port was bound and accepted connections, and subscriptions were possible, but messages didn't pass through. So I wrote a little plugin that closes this gap: it creates a random string, sends it to a queue, then reads the queue to check that what comes back is the same string.
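To illustrate the idea (this is not the plugin itself), here is a minimal Python sketch of such a round-trip check. It assumes plain STOMP 1.0 over TCP and uses only the standard library; the function names, the Nagios-style exit codes and the timeout are my own choices for the example. It also assumes the queue is dedicated to the check, since a stale message left in the queue would be read first and reported as a failure.

```python
import socket
import uuid

OK, CRITICAL = 0, 2  # Nagios-style exit codes (an assumption of this sketch)


def build_frame(command, headers=None, body=""):
    """Serialize a STOMP 1.0 frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{k}:{v}" for k, v in (headers or {}).items()]
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


def parse_frame(raw):
    """Split one raw frame (without its trailing NUL) into command, headers, body."""
    # Brokers may emit a newline after the NUL of the previous frame; drop it.
    head, _, body = raw.decode("utf-8").lstrip("\n").partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


def read_frame(sock):
    """Read from the socket up to the NUL terminator and parse the frame."""
    buf = b""
    while not buf.endswith(b"\x00"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("broker closed the connection")
        buf += chunk
    return parse_frame(buf[:-1])


def check_roundtrip(host, port, login, passcode, queue, timeout=5.0):
    """Send a random token to a queue and verify the same token comes back."""
    token = uuid.uuid4().hex
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_frame("CONNECT", {"login": login, "passcode": passcode}))
            command, _, _ = read_frame(sock)
            if command != "CONNECTED":
                return CRITICAL, f"unexpected reply to CONNECT: {command}"
            sock.sendall(build_frame("SUBSCRIBE", {"destination": queue, "ack": "auto"}))
            sock.sendall(build_frame("SEND", {"destination": queue}, token))
            # A bound port is not enough: this read times out if nothing is delivered.
            command, _, body = read_frame(sock)
            sock.sendall(build_frame("DISCONNECT"))
    except OSError as exc:  # covers refused connections, resets and timeouts
        return CRITICAL, f"STOMP round trip failed: {exc}"
    if command == "MESSAGE" and body == token:
        return OK, "message made the round trip"
    return CRITICAL, "message came back altered or was not a MESSAGE frame"
```

A monitoring wrapper would call something like `check_roundtrip("broker.example.com", 61613, "nagios", "secret", "/queue/nagios.check")` (host, credentials and queue name are placeholders) and exit with the returned status code.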

This was made possible with the help of @ripienaar. Thanks for explaining the difference between topics and queues!

Category: General
Thursday, May 20th, 2010 | Author: eth0

Readers of this little blog may know that I've spent some time tweaking and optimizing my munin setup. But I've reached a point where my knowledge was not enough to compensate for munin's design problems. This is not a rant; I'm just saying that when your infrastructure reaches a certain size, munin reaches its limits. The pull-based model and the graph generation (don't tell me about the CGI graphs, they never worked as expected) overloaded my management box. Talking with other people brought collectd to my attention, so I gave it a try.

Collectd has many nice features: 10-second resolution (versus 5 minutes), it is written in C for performance, it supports multicast (even if I don't use it), and you can even create collectd relays. Packages exist for many different targets, even for my OpenWRT-based access points! Configuration lives in flat files, which I find superior because it can easily be automated, and that is a must for me.

Of course collectd comes without a UI, but that's no big deal: there are many around. As I said in the previous post, I use Visage.

As for the load generated, a picture is worth a thousand words:

[Image: goodbyemunin]

The next step is to work on the I/O congestion caused by so many RRD updates. The collectd wiki has many tips about this, so it will probably be fixed in a couple of hours.