Archive for the Category » Code «

Friday, March 11th, 2011

This post follows my previous one about reusing chef providers in mcollective. In the comments, Adam Jacob made an interesting remark, and when I wrote my second agent, to manage packages, I saw it would be a piece of cake to write a really generic agent, thanks to the nature of chef resources (and the way they are invoked).

So here it is: a generic chef resource mcollective agent, with the associated example client code. It deserves a little explanation anyway: it is not meant to be used from a command-line invocation. Why? Because I push fairly “complex” data as the resourceactions parameter. The only way I found to make this work from the command line is to use eval on the argument, which is in no way acceptable. Anyway, I hope some people will find this useful.
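To give an idea, here is a minimal sketch of what a client call can look like; the agent name (“chefresource”), the action (“manage”) and the parameter layout are my illustrative assumptions, not the published interface:

#!/usr/bin/env ruby
# Hypothetical client for a generic chef resource agent: every name
# below (agent, action, parameters) is illustrative.
require 'mcollective'
include MCollective::RPC

mc = rpcclient("chefresource")
mc.progress = false

# Nested data like this is exactly what cannot be passed sanely on a
# command line without eval, hence a client script.
printrpc mc.manage(:resource        => "package",
                   :name            => "htop",
                   :resourceactions => [:install])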

Category: BOFH Life, Code, SysAdmin, Tech
Tuesday, March 08th, 2011

It has been quite calm here for a couple of months. I switched jobs, which explains why I had less time to post. I now work at fotolia, and I switched from puppet to chef (no troll intended, I still think puppet is a great tool, please read this).

However, one tool I still have is the awesome mcollective. Unfortunately, the most used agents (package, service) rely on puppet providers to do their work. Fortunately, open source is here, so I wrote a (basic) service agent that uses chef providers to start, stop or restart a service. It still needs some polish for the status part (oh, the ugly hardcoded path) but I was quite excited to share it. Freshly pushed to github!
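To give a rough idea of the shape of such an agent, here is a sketch; the provider instantiation follows the chef 0.9-era pattern, and every name below is an assumption rather than the published agent:

module MCollective
  module Agent
    # Sketch of a service agent backed by chef providers. Passing nil
    # where a node object belongs is a shortcut for this sketch, and
    # Provider.new(node, resource) is a chef 0.9-era assumption.
    class Service < RPC::Agent
      action "restart" do
        require 'chef'
        validate :service, String

        resource = Chef::Resource::Service.new(request[:service])
        provider = Chef::Provider::Service::Init.new(nil, resource)
        provider.load_current_resource
        provider.action_restart
        reply[:status] = "restarted #{request[:service]}"
      end
    end
  end
end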

Thanks to Jordan Sissel for minstrel (an awesome debugging tool), to the opscode team for the help on the provider, and to R.I. Pienaar for mcollective (and the support).

Category: BOFH Life, Code, SysAdmin, Tech
Tuesday, September 14th, 2010

I recently switched from the puppet daemon to the mcollective commander: it kicks the “puppet daemon stuck in outer space” problem out of the way and brings nice features (such as load control). To do so I deployed the puppetd agent on my boxes.

As most sysadmins say, “if it’s not monitored, it doesn’t exist”. I had a cron script, based on the puppetlast script, that reported by mail once a day which hosts had not checked in during the last 30 minutes. This method had two serious flaws: it ran only once a day (I hate being flooded with mail), and test machines keep nagging you until you remove their YAML files on the puppetmaster. Talking with Volcane on #mcollective I discovered that the agent could report when the client last ran, so I decided to use this to check the freshness of my puppet runs with a nagios plugin.

Goodbye cron job, hello nagios.
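For illustration, a stripped-down sketch of such a freshness check; the :lastrun reply key and the exact agent interface are assumptions, the real plugin does a bit more:

#!/usr/bin/env ruby
# Sketch of a nagios check for puppet run freshness via mcollective.
# Assumes the puppetd agent's status reply carries a :lastrun
# timestamp; adjust to the agent you actually deploy.
require 'mcollective'
include MCollective::RPC

WARN_AGE = 30 * 60 # seconds, the same 30 minutes as the old cron job

mc = rpcclient("puppetd")
mc.progress = false

stale = mc.status.select { |r| Time.now.to_i - r[:data][:lastrun].to_i > WARN_AGE }

if stale.empty?
  puts "OK: all nodes ran puppet in the last #{WARN_AGE / 60} minutes"
  exit 0
else
  puts "CRITICAL: stale puppet runs on #{stale.map { |r| r[:sender] }.join(', ')}"
  exit 2
end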

Category: Code, NetAdmin, Puppet, SysAdmin
Wednesday, September 08th, 2010

Most people who work with puppet use a VCS: subversion, git, CVS, mercurial… pick yours. My company uses subversion, and each commit to the repository needs to be pulled by the master. Since I have two masters, I also want them to be synchronized. Once again mcollective comes to the rescue. I wrote a very simple agent (a 5-minute job, to be improved) that can update a specified path. Grab it here. Once it is deployed you can use a post-commit hook that calls it.

Here is mine:

#!/usr/local/bin/ruby
 
require 'mcollective'
include MCollective::RPC
 
# ask the svn agent to update the working copy, only on puppet masters
mc = rpcclient("svnagent")
mc.progress = false
mc.class_filter "puppet::master"
mc.update(:path => "/etc/puppet")

Thanks to the class filter, the agent will only be called on machines that are puppet masters.
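For reference, the agent side can be as small as one action. Here is a sketch of its shape (the class name and reply keys are assumptions; grab the real agent at the link above):

module MCollective
  module Agent
    # Sketch of the svn agent: a single "update" action that runs
    # `svn update` on the requested path via the built-in run helper.
    class Svnagent < RPC::Agent
      action "update" do
        validate :path, String
        reply[:exitcode] = run("svn update #{request[:path]}",
                               :stdout => :output, :stderr => :error)
      end
    end
  end
end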

Category: Code, Puppet, SysAdmin
Wednesday, August 18th, 2010

Quick post: installing a mongo server is good, monitoring it is better. I wrote a very basic check that ensures your mongo DB is healthy. Grab it on my github.
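For the curious, a health probe of this kind boils down to a few lines with the old mongo ruby driver; this is an illustrative sketch, not the published check:

#!/usr/bin/env ruby
# Minimal mongodb health probe (illustrative, not the published check):
# connect, run serverStatus, exit with nagios-style codes.
require 'mongo' # the pre-2.x mongo ruby driver

begin
  admin  = Mongo::Connection.new("localhost", 27017).db("admin")
  status = admin.command("serverStatus" => 1)
  puts "OK: mongod up for #{status['uptime'].to_i}s"
  exit 0
rescue => e
  puts "CRITICAL: #{e.message}"
  exit 2
end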

Category: Code, SysAdmin, Tech
Tuesday, May 18th, 2010

Some friends had been telling me for a while about collectd: why I should look at it, why munin is so painful, and so on. If you’ve been reading my posts you know I have tweaked my $WORK munin install a little to make it faster and lighter. But I finally took the time to explore collectd, and I regret not having done it before. It has so many pros that I decided to run it in parallel with munin (because I can’t afford being blind on metrics). But collectd comes without a UI: it “only” collects data. That’s not a problem, though: there are various web interfaces, and after looking at a bunch of them I fell in love with Lindsay Holmwood‘s Visage.

This piece of software is definitely cool: all graphs are rendered live in your browser as SVG. Yes! Realtime graphs and zooming, no need for crappy flash s***. It is based on sinatra, haml and some JS libraries (I won’t talk about those, my JS fu is buried deeper than the Mariana Trench). But it lacked some features: it’s OK when you have a few hosts, but when the host list starts getting loooong the interface needs some improvements. So I forked it on github and implemented (some parts of) what I needed. My fork adds host grouping and per-host profiles. Check it out and enjoy Visage!

Now working on sets of graphs :)

PS : <3 Guigui2

Category: BOFH Life, Code, NetAdmin, SysAdmin, Tech
Wednesday, April 14th, 2010

I already blogged about my experiments with mcollective & xen, but I had something a little bigger in mind. A friend had sent me a video showing some neat vmware features (DRS mainly), with VMs migrating between hypervisors automatically.

So I wrote a “proof of concept” of what you can do with an awesome tool like mcollective. The setup of this funny game is the following:

  • 1 box used as an iSCSI target, serving volumes to the world
  • 2 xen hypervisors (lenny packages) using the open-iscsi initiator to connect to the target. VMs are stored in LVM, nothing fancy

The 3 boxen are connected on a 100Mb network, and the hypervisors have an additional gigabit network card with a crossover cable linking them (yes, this is a lab setup). You can find a live migration howto here.

For the mcollective part I used my xen agent (slightly modified from the previous post to support migration), which is based on my xen gem. The client is the largest part of the work, but it’s still less than 200 lines of code. It can (and will) be improved, because all the config is hardcoded. It also deserves a little DSL to handle more logic than “if load is greater than foo”, but as I said, it’s a proof of concept.
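The heart of that logic fits in a few lines. This toy sketch shows the decision step; every name, field and constant here is illustrative, not the real client’s code:

# Toy sketch of the balancer's decision step; names, fields and
# constants are illustrative.
MAX_VM_PER_HOST    = 10
MAX_LOAD_CANDIDATE = 0.5

# hosts: [{:name => "hv3", :load => 1.33, :overloads => 3,
#          :vms => [{:name => "test3", :cpu_delta => 1.7}]}, ...]
def pick_migration(hosts, max_over)
  source = hosts.find { |h| h[:overloads] >= max_over }
  return nil unless source

  candidates = hosts.reject { |h| h.equal?(source) }
  candidates = candidates.select { |h| h[:vms].size < MAX_VM_PER_HOST }  # step 1: max VMs
  candidates = candidates.select { |h| h[:load] < MAX_LOAD_CANDIDATE }   # step 2: max load
  target = candidates.min_by { |h| h[:load] }
  return nil unless target

  # migrate the VM that ate the most CPU time during the last interval
  [source[:vms].max_by { |vm| vm[:cpu_delta] }, source, target]
end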

Let’s see it in action:

hypervisor2:~# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   233     2     r-----    873.5
hypervisor3:~# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   232     2     r-----  78838.0
test1                                        6   256     1     -b----     18.4
test2                                        4   256     1     -b----     19.3
test3                                       20   256     1     r-----     11.9

test3 is a VM that is “artificially” loaded, as is the machine hypervisor3 (to trigger the migration).

[mordor:~] ./mc-xen-balancer
[+] hypervisor2 : 0.0 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.11 load and 3 slice(s) running
[+] added test1 on hypervisor3 with 0 CPU time (registered 18.4 as a reference)
[+] added test2 on hypervisor3 with 0 CPU time (registered 19.4 as a reference)
[+] added test3 on hypervisor3 with 0 CPU time (registered 18.3 as a reference)
[+] sleeping for 30 seconds

[+] hypervisor2 : 0.0 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.33 load and 3 slice(s) running
[+] updated test1 on hypervisor3 with 0.0 CPU time eaten (registered 18.4 as a reference)
[+] updated test2 on hypervisor3 with 0.0 CPU time eaten (registered 19.4 as a reference)
[+] updated test3 on hypervisor3 with 1.5 CPU time eaten (registered 19.8 as a reference)
[+] sleeping for 30 seconds

[+] hypervisor2 : 0.16 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.33 load and 3 slice(s) running
[+] updated test1 on hypervisor3 with 0.0 CPU time eaten (registered 18.4 as a reference)
[+] updated test2 on hypervisor3 with 0.0 CPU time eaten (registered 19.4 as a reference)
[+] updated test3 on hypervisor3 with 1.7 CPU time eaten (registered 21.5 as a reference)
[+] hypervisor3 has 3 threshold overload
[+] Time to see if we can migrate a VM from hypervisor3
[+] VM key : hypervisor3-test3
[+] Time consumed in a run (interval is 30s) : 1.7
[+] hypervisor2 is a candidate for being a host (step 1 : max VMs)
[+] hypervisor2 is a candidate for being a host (step 2 : max load)
trying to migrate test3 from hypervisor3 to hypervisor2 (10.0.0.2)
Successfully migrated test3 !

Let’s see our hypervisors:

hypervisor2:~# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   233     2     r-----    878.9
test3                                       25   256     1     -b----      1.1
hypervisor3:~# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   232     2     r-----  79079.3
test1                                        6   256     1     -b----     18.4
test2                                        4   256     1     -b----     19.4

A little word about the configuration options:

  • interval: the poll time in seconds. It should not be too low; give the machine some time, and avoid load peaks distorting the logic
  • load_threshold: the load above which you consider the machine overloaded and it is time to move some stuff away (works together with max_over, see below)
  • daemonize: not used yet
  • max_over: the maximum time (in minutes) the load may stay above the threshold; when it is reached, it is really time to migrate. Don’t set it too low, and keep it at least 2*interval or the sampling will not be efficient
  • debug: well…
  • max_vm_per_host: the maximum number of VMs a host can handle. A host that has already hit this limit will not be a candidate for receiving a VM
  • max_load_candidate: same thing as above, but for the load
  • host_mapping: a simple CSV file to handle non-DNS destinations (typically, my crossover-cable addresses have no DNS entries)
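Put together, a hypothetical set of values for these options could look like this (the real client currently hardcodes its configuration):

# Hypothetical values for the options above; the real client
# hardcodes its configuration.
CONFIG = {
  :interval           => 30,     # seconds between polls
  :load_threshold     => 1.0,    # load above this counts as overloaded
  :max_over           => 1,      # minutes over the threshold before acting (>= 2*interval)
  :daemonize          => false,  # not used yet
  :debug              => true,
  :max_vm_per_host    => 10,     # a host at this limit receives no VM
  :max_load_candidate => 0.5,    # a host above this load receives no VM
  :host_mapping       => "/etc/mc-xen/hosts.csv"  # non-DNS destinations
}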

What is left to do:

  • Add some safeguards to avoid migration madness: let the load settle after a migration, and avoid migrating the same VM back and forth
  • Add a DSL to allow more logic to be plugged in
  • Write a real client, not a big fat loop

Enjoy the tool !

Files:

Tuesday, April 06th, 2010

Another cool project I have been keeping an eye on for some weeks is “the marionette collective“, aka mcollective. This project is led & developed by R.I. Pienaar, one of the most active people in the puppet world too.

mcollective is a framework for distributed sysadmin. It relies on a messaging layer and has many nice properties: flexibility, speed, and it is easy to understand.

Some time ago, I wrote a tool called “whosyourdaddy” to help me (and my goldfish-sized memory) find on which xen dom0 a xen domU was living. It worked fine, except that it was not dynamic: if a VM was migrated from one dom0 to another, I had to update the CMDB. Not really reliable (if an update fails, the CMDB is no longer accurate), and I didn’t want to embed this constraint in the xen logic. So I decided to try writing my own mcollective agent, and here it is! It is built on top of a (very) small ruby module for xen and has its own client.
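The find action is conceptually tiny. Here is a sketch of its shape; parsing xm output like this is a shortcut for illustration, the real agent goes through the ruby xen module:

module MCollective
  module Agent
    # Sketch of the "find" action: report whether a given domU is
    # running on this dom0. The real agent uses the xen gem instead
    # of parsing xm output.
    class Xen < RPC::Agent
      action "find" do
        validate :domu, String
        running = %x{xm list}.lines.any? do |line|
          line.split.first == request[:domu]
        end
        reply[:state] = running ? "Present" : "Absent"
      end
    end
  end
end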

You can find out on which dom0 a domU resides:

master1:~# ./mc-xen -a find --domu test
hypervisor2              : Absent
hypervisor1              : Absent
master1:~# ./mc-xen -a find --domu domu2
hypervisor2              : Present
hypervisor1              : Absent

Or list your domUs:

master1:~# ./mc-xen -a list
hypervisor2              
 domu2

hypervisor1              
 no domU running

Download the agent & the client

Tuesday, March 02nd, 2010

I’ve been lazy about maintaining my servers recently, so I decided to start playing with puppet reports. I started with something simple that helps me find on which machines my manifests have failures.

So here’s a quick and dirty script that goes through Puppet’s reportdir and points out neglected machines.

#!/usr/bin/env ruby
 
require 'puppet'
require 'find'
require 'yaml'
require 'optparse'
 
Puppet[:config] = "/etc/puppet/puppet.conf"
Puppet.parse_config
 
# return the most recent YAML report in a host's report directory
def most_recent_file(path)
	reports = []
	Find.find(path) { |file|
		if File.file? file
			reports << File.basename(file,".yaml")
		end
	}
	reports.sort!.reverse!
	return path+"/"+reports[0].to_s+".yaml"
end
 
 
# walk the report dir (one subdirectory per host) and check each host's latest report
def scan_dir(path, debug=false)
	Find.find(path) { |entry|
		if entry != path # don't scan the basedir
			if File.directory? entry
				report = most_recent_file(entry)
				scan_file(report, debug)
			end
		end
	}
end
 
 
# print a line for every host whose latest report contains failed resources
def scan_file(filename, debug=false)
	notify_on_field = [:failed]
 
	# debug
	if debug then  puts "scanning " + filename end
 
	fp=open(filename,"r")
	YAML::load_documents(fp) { |report|
		report.metrics["resources"].values.each { |value|
			if (notify_on_field.include? value[0]) and (value[2] > 0) then
				puts "#{report.host} has #{value[2]} #{value[0]} resource(s)"
				if debug then
					puts "log message(s) :"
					report.logs.each { |log|
						puts log.message
					}
				end
			end
		}
	}	
end
 
options = {}
 
optparse = OptionParser.new { |opts|
	opts.banner = "Usage : report_check.rb"
 
	options[:debug]=false
	opts.on("-d", "--debug", "runs in debug mode") do |debug|
		options[:debug]=true
	end
 
	opts.on("-h", "--help", "Displays this help") do
		puts opts
		exit
	end
 
}
 
optparse.parse!
 
scan_dir(Puppet[:reportdir], options[:debug])
Friday, February 26th, 2010

On my Solaris machines at $WORK I use iMil‘s pkgin to install additional software. But until today, I had to do it by hand, on every machine… not really what I like doing after a little more than a year of using puppet. So I wrote a provider to manage packages with pkgin. It was very informative about puppet internals, and I learned more about my favorite config management system.

Enough talking, here is the file: pkgin.rb
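For the curious, the skeleton of such a provider is small. This is an illustrative sketch (the pkgin path and the output parsing are assumptions); the real code is in pkgin.rb above:

# Illustrative skeleton of a pkgin package provider; see pkgin.rb
# above for the real one. Binary path and list parsing are assumptions.
Puppet::Type.type(:package).provide(:pkgin) do
  desc "Package management using pkgin"

  commands :pkgin => "/usr/pkg/bin/pkgin"

  def install
    pkgin "-y", "install", @resource[:name]
  end

  def uninstall
    pkgin "-y", "remove", @resource[:name]
  end

  # return the package's properties if installed, nil otherwise
  def query
    pkgin("list").each_line do |line|
      if line =~ /^#{Regexp.escape(@resource[:name])}-(\S+)/
        return { :name => @resource[:name], :ensure => $1 }
      end
    end
    nil
  end
end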

Example of use in a manifest :

class foo {
    package { "bla":
        ensure => installed,
        provider => pkgin
    }
}