Handling extant resources in Terraform

Terraform is a HashiCorp tool that embraces the Infrastructure as Code model to manage a variety of platforms and services in today’s modern, cloud-based Internet.  It’s still in development, but it already provides a wealth of useful functionality, notably with regard to Amazon and DigitalOcean interactions.  The one thing it doesn’t do very well, however, is manage pre-existing infrastructure.  In this blog post we’ll explore a way to integrate extant infra into a basic Terraform instance.

Note that this post is current as of Terraform v0.3.6.  HashiCorp has hinted that future versions of Terraform will handle this problem in a more graceful way, so be sure to check those changelogs regularly. 🙂

summary

A full example and walk-through will follow; however, for those familiar with Terraform and just looking for the tl;dr, I got you covered.

  • Declare a new, temporary resource in your Terraform plan that is nearly identical to the extant resource.
  • Apply the plan, thus instantiating the temporary “twinned” resource and building a state file.
  • Alter the appropriate id fields to be the same as the extant resource in both the state and config files.
  • Perform a refresh which will populate the state file with the correct data for the declared extant resource.
  • Remove the temporary resource from AWS manually.
  • Voilà.

faster and more dangerous, please.

Walking through the process and meticulously checking every step? Ain’t nobody got time for that!

  • Edit the state file and insert the resource directly – it’s just JSON, after all.

examples

In the examples below, the notation [...] is used to indicate truncated output or data.

Also note that the AWS cli tool is assumed to be configured and functional.

S3

The extant resource in this case is an S3 bucket called phrawzty-tftest-1422290325. This resource is unknown to Terraform.

$ aws s3 ls | grep tftest
2015-01-26 17:39:07 phrawzty-tftest-1422290325

Declare the temporary twin in the Terraform config:

resource "aws_s3_bucket" "phrawzty-tftest" {
    bucket = "phrawzty-tftest-1422353583"
}

Verify and prepare the plan:

$ terraform plan -out=terratest.plan
    [...]
Path: terratest.plan

+ aws_s3_bucket.phrawzty-tftest
    acl:    "" => "private"
    bucket: "" => "phrawzty-tftest-1422353583"

Apply the plan (this will create the twin):

$ terraform apply ./terratest.plan
    [...]
aws_s3_bucket.phrawzty-tftest: Creation complete

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
    [...]
State path: terraform.tfstate

Verify that both the extant and temporary resources exist:

$ aws s3 ls | grep phrawzty-tftest
2015-01-26 17:39:07 phrawzty-tftest-1422290325
2015-01-27 11:14:09 phrawzty-tftest-1422353583

Verify that Terraform is aware of the temporary resource:

$ terraform show
aws_s3_bucket.phrawzty-tftest:
  id = phrawzty-tftest-1422353583
  acl = private
  bucket = phrawzty-tftest-1422353583

Alter the config file:

  • Insert the name of the extant resource in place of the temporary.
  • Strictly speaking this is not necessary, but it helps to keep things tidy.

resource "aws_s3_bucket" "phrawzty-tftest" {
    bucket = "phrawzty-tftest-1422290325"
}

Alter the state file:

  • Insert the name (id) of the extant resource in place of the temporary.

            "resources": {
                "aws_s3_bucket.phrawzty-tftest": {
                    "type": "aws_s3_bucket",
                    "primary": {
                        "id": "phrawzty-tftest-1422290325",
                        "attributes": {
                            "acl": "private",
                            "bucket": "phrawzty-tftest-1422290325",
                            "id": "phrawzty-tftest-1422290325"
                        }
                    }
                }
            }
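The same edit can be scripted rather than made by hand. What follows is a minimal sketch in Python, assuming the resource block has exactly the shape shown above. The helper name retarget_bucket is made up for illustration and is not part of Terraform, and finding the "resources" object inside the full state file is left out, since the surrounding structure is truncated here.

```python
import copy


def retarget_bucket(resources, key, extant_bucket):
    """Return a copy of a state-file "resources" mapping in which the
    S3 bucket stored under `key` points at `extant_bucket` instead."""
    updated = copy.deepcopy(resources)
    primary = updated[key]["primary"]
    # The bucket name appears three times: the primary id, plus the
    # "bucket" and "id" attributes.
    primary["id"] = extant_bucket
    primary["attributes"]["bucket"] = extant_bucket
    primary["attributes"]["id"] = extant_bucket
    return updated


# Resource block as Terraform wrote it for the temporary twin.
twin = {
    "aws_s3_bucket.phrawzty-tftest": {
        "type": "aws_s3_bucket",
        "primary": {
            "id": "phrawzty-tftest-1422353583",
            "attributes": {
                "acl": "private",
                "bucket": "phrawzty-tftest-1422353583",
                "id": "phrawzty-tftest-1422353583",
            },
        },
    }
}

extant = retarget_bucket(twin, "aws_s3_bucket.phrawzty-tftest",
                         "phrawzty-tftest-1422290325")
```

Working on a copy keeps the original data intact, which is handy if the refresh does not go as planned.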

Refresh the Terraform state (note the ID):

$ terraform refresh
aws_s3_bucket.phrawzty-tftest: Refreshing state... (ID: phrawzty-tftest-1422290325)

Verify that Terraform is satisfied with the state:

$ terraform plan
Refreshing Terraform state prior to plan...

aws_s3_bucket.phrawzty-tftest: Refreshing state... (ID: phrawzty-tftest-1422290325)

No changes. Infrastructure is up-to-date. This means that Terraform
could not detect any differences between your configuration and
the real physical resources that exist. As a result, Terraform
doesn't need to do anything.

Remove the temporary resource:

$ aws s3 rb s3://phrawzty-tftest-1422353583/
remove_bucket: s3://phrawzty-tftest-1422353583/

S3, faster.

For the sake of this example, the state file already contains an S3 resource called phrawzty-tftest-blah.

Add the “extant” resource directly to the state file.

            "resources": {
                [...]
                },
                "aws_s3_bucket.phrawzty-tftest": {
                    "type": "aws_s3_bucket",
                    "primary": {
                        "id": "phrawzty-tftest-1422290325",
                        "attributes": {
                            "acl": "private",
                            "bucket": "phrawzty-tftest-1422290325",
                            "id": "phrawzty-tftest-1422290325"
                        }
                    }
                }
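If there are several buckets to add, the block can be generated instead of typed. This is only a sketch in Python that mirrors the fields shown in the fragment above; the function name s3_bucket_entry is hypothetical, and the default "private" ACL is an assumption taken from the example output, not something read back from AWS.

```python
import json


def s3_bucket_entry(bucket, acl="private"):
    """Build a state-file resource block for an S3 bucket, mirroring
    the fields shown in the fragment above."""
    return {
        "type": "aws_s3_bucket",
        "primary": {
            "id": bucket,
            "attributes": {"acl": acl, "bucket": bucket, "id": bucket},
        },
    }


entry = {"aws_s3_bucket.phrawzty-tftest":
         s3_bucket_entry("phrawzty-tftest-1422290325")}
# Pretty-print so the block can be pasted into the "resources" object.
print(json.dumps(entry, indent=4, sort_keys=True))
```

The refresh that follows should then correct any attribute that was guessed wrong.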

Refresh:

$ terraform refresh
aws_s3_bucket.phrawzty-tftest: Refreshing state... (ID: phrawzty-tftest-1422290325)
aws_s3_bucket.phrawzty-tftest-blah: Refreshing state... (ID: phrawzty-tftest-blah)

Verify:

$ terraform show
aws_s3_bucket.phrawzty-tftest:
  id = phrawzty-tftest-1422290325
  acl = private
  bucket = phrawzty-tftest-1422290325
aws_s3_bucket.phrawzty-tftest-blah:
  id = phrawzty-tftest-blah
  acl = private
  bucket = phrawzty-tftest-blah

That’s that.

HAProxy Puppet module (phrawzty remix)

As part of a big Logstash project at Mozilla (more on that to come), I was looking for an HAProxy module for Puppet, stumbling across the official Puppetlabs module in the process.  I’m told that this module works fairly well, with the caveat that it sometimes outputs poorly-formatted configuration files (due to a manifestly buggy implementation of concat).  Furthermore, the module more or less requires storeconfigs, which we do not use in our primary Puppet system.

Long story short, while I never ended up using HAProxy as part of the project, I did remix the official module to solve both of the aforementioned issues.  From the README :

This module is based on Puppetlabs’ official HAProxy module; however, it has been “remixed” for use at Mozilla. There are two major areas where the original module has been changed :

  • Storeconfigs, while present, are no longer required.
  • The “listen” stanza format has been abandoned in favour of a frontend / backend style configuration.

A very simple configuration to proxy unrelated Redis nodes :

  class { 'haproxy': }

  haproxy::frontend { 'in_redis':
    ipaddress       => $::ipaddress,
    ports           => '6379',
    default_backend => 'out_redis',
    options         => { 'balance' => 'roundrobin' }
  }

  haproxy::backend { 'out_redis':
    listening_service => 'redis',
    server_names      => ['node01', 'node02'],
    ipaddresses       => ['node01.redis.server.foo', 'node02.redis.server.foo'],
    ports             => '6379',
    options           => 'check'
  }

If that sounds interesting to you, the module is available on my puppetlabs-haproxy repo on GitHub. Pull requests welcome !

quickly generate an encrypted password

Hi everybody !  Here’s a quick method for generating encrypted passwords that are suitable for things like /etc/passwd .  I realise that this isn’t terribly complex, but honestly, I always forget how to do this until I actually need to do it – so here’s a reminder for all of us. 🙂

#!/bin/bash

if [ "x$1" == 'x' ]; then
    echo "USAGE: $0 'password'"
    exit 1
fi

# Get an md5sum of the password string; this is used for the SHA salt.
# Quote "$1" so that whitespace and glob characters survive intact.
md5=$( echo "$1" | md5sum )
extract="${md5:2:8}"

# Calculate the SHA hash of the password string using the extracted salt.
mkpasswd -m SHA-512 "$1" "$extract"
exit $?
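For anyone curious about what ends up in the salt, the derivation can be traced in Python. This is a minimal sketch that assumes UTF-8 input; derive_salt is a name invented for illustration, and it only reproduces the salt, not the mkpasswd call itself.

```python
import hashlib


def derive_salt(password):
    """Mirror `echo "$1" | md5sum` followed by ${md5:2:8}."""
    # echo appends a newline, so it is part of the hashed input.
    digest = hashlib.md5((password + "\n").encode()).hexdigest()
    # Bash's ${md5:2:8} takes 8 characters starting at offset 2.
    return digest[2:10]


salt = derive_salt("example-password")
```

Since the salt comes from the password itself, the same password always produces the same hash.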

Elasticsearch backup strategies

Update: This is an old blog post and is no longer relevant as of version 1.x of Elasticsearch. Now we can just use the snapshot feature.

Hello again! Today we’re going to talk about backup strategies for Elasticsearch. One popular way to make backups of ES requires the use of a separate ES node, while another relies entirely on the underlying file system of a given set of ES nodes.

The ES-based approach:

  • Bring up an independent (receiving) ES node on a machine that has network access to the actual ES cluster.
  • Trigger a script to perform a full index import from the ES cluster to the receiving node.
  • Since the receiving node is unique, every shard will be represented on said node.
  • Shut down the receiving node.
  • Preserve the /data/ directory from the receiving node.

The file system-based approach:

  • Identify a quorum of nodes in the ES cluster.
  • Quorum is necessary in order to ensure that all of the shards are represented.
  • Trigger a script that will preserve the /data/ directory of each selected node.

At first glance the file system-based approach appears simpler – and it is – but it comes with some drawbacks, notably the fact that coherency is impossible to guarantee due to the amount of time required to preserve /data/ on each node. In other words, if data changes on a node between the start and end times of the preservation mechanism, those changes may or may not be backed up. Furthermore, from an operational perspective, restoring nodes from individual shards may be problematic.

The ES-based approach does not have the coherency problem; however, beyond the fact that it is more complex to implement and maintain, it is also more costly in terms of service delivery. The actual import process itself requires a large number of requests to be made to the cluster, and the resulting resource consumption on both the cluster nodes and the receiving node is non-trivial. On the other hand, having a single, coherent representation of every shard in one place may pay dividends during a restoration scenario.

As is often the case, there is no one solution that is going to work for everybody all of the time – different environments have different needs, which call for different answers.  That said, if your primary goal is a consistent, coherent, and complete backup that can be easily restored when necessary (and overhead be damned!), then the ES-based approach is clearly the superior of the two.

import it !

Regarding the ES-based approach, it may be helpful to take a look at a simple import script as an example.  How about a quick and dirty Perl script (straight from the docs) ?

use ElasticSearch;

my $local = ElasticSearch->new(
    servers => 'localhost:9200'
);
my $remote = ElasticSearch->new(
    servers    => 'cluster_member:9200',
    no_refresh => 1
);

my $source = $remote->scrolled_search(
    index => 'content',
    search_type => 'scan',
    scroll      => '5m'
);
$local->reindex(source=>$source);

You’ll want to replace the relevant elements with something sane for your environment, of course.

As for preserving the resulting /data/ directory (in either method), I will leave that as an exercise to the reader, since there are simply too many equally relevant ways to go about it.  It’s worth noting that the import method doesn’t need to be complex at all – in fact, it really shouldn’t be, since complex backup schemes tend to have more chances for failure than is necessary.
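Purely as an illustration of how little is needed, here is one possible way to preserve a data directory, sketched in Python with the standard tarfile module. The function name, the archive naming scheme, and any paths passed in are assumptions; for the ES-based approach the receiving node should already be shut down when this runs.

```python
import os
import tarfile
import time


def preserve_data_dir(data_dir, dest_dir):
    """Write a gzipped tarball of data_dir into dest_dir and return
    the path of the archive."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(dest_dir, "es-data-%s.tar.gz" % stamp)
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the directory name, not the full path.
        tar.add(data_dir, arcname=os.path.basename(os.path.normpath(data_dir)))
    return archive
```

Where the archive goes afterwards (another host, object storage, tape) is a separate decision.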

Happy indexing!

Nagios plugin to parse JSON from an HTTP response

Update 2015-10-07: This plugin has evolved – please check the latest README for up to date details.

Hello all !  I wrote a plugin for Nagios that will parse JSON from an HTTP response.  If that sounds interesting to you, feel free to check out my check_http_json repo on GitHub.  The plugin has been tested with Ruby 1.8.7 and 1.9.3.  Pull requests welcome !

Usage: ./check_http_json.rb -u <URI> -e <ELEMENT> -w <VALUE> -c <VALUE>
-h, --help                       Help info.
-v, --verbose                    Additional human output.
-u, --uri URI                    Target URI. Incompatible with -f.
    --user USERNAME              HTTP basic authentication username.
    --pass PASSWORD              HTTP basic authentication password.
-f, --file PATH                  Target file. Incompatible with -u.
-e, --element ELEMENT            Desired element (ex. foo=>bar=>ish is foo.bar.ish).
-E, --element_regex REGEX        Desired element expressed as regular expression.
-d, --delimiter CHARACTER        Element delimiter (default is period).
-w, --warn VALUE                 Warning threshold (integer).
-c, --crit VALUE                 Critical threshold (integer).
-r, --result STRING              Expected string result. No need for -w or -c.
-R, --result_regex REGEX         Expected string result expressed as regular expression. No need for -w or -c.
-W, --result_warn STRING         Warning if element is [string]. -C is required.
-C, --result_crit STRING         Critical if element is [string]. -W is required.
-t, --timeout SECONDS            Wait before HTTP timeout.

The --warn and --crit arguments conform to the Nagios threshold format guidelines.

If a simple result of either string or regular expression (-r or -R) is specified :

  • A match is OK and anything else is CRIT.
  • The warn / crit thresholds will be ignored.

If the warn and crit results (-W and -C) are specified :

  • A match is WARN or CRIT and anything else is OK.
  • The warn / crit thresholds will be ignored.

Note that (-r or -R) and (-W and -C) are mutually exclusive.

Note also that the response must be pure JSON. Bad things happen if this isn’t the case.
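To make the element and result options more concrete, here is a rough sketch of the logic in Python. It is not the plugin's code (the plugin is written in Ruby); the function names and the sample JSON are invented, and it only covers the simple result mode (-r / -R) with an element path that walks nested objects.

```python
import json
import re


def lookup(document, element, delimiter="."):
    """Walk a parsed JSON document along a delimited element path."""
    value = document
    for key in element.split(delimiter):
        value = value[key]
    return value


def check_result(value, expected=None, expected_regex=None):
    """Simple result mode: a match is OK, anything else is CRIT."""
    text = str(value)
    if expected is not None:
        return "OK" if text == expected else "CRIT"
    return "OK" if re.search(expected_regex, text) else "CRIT"


body = json.loads('{"foo": {"bar": {"ish": "green"}}}')
status = check_result(lookup(body, "foo.bar.ish"), expected="green")
print(status)
```

A different delimiter (the -d option) would simply change how the element string is split.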

How you choose to implement the plugin is, of course, up to you.  Here’s one suggestion:

# check json from http
define command{
 command_name    check_http_json-string
 command_line    /etc/nagios3/plugins/check_http_json.rb -u 'http://$HOSTNAME$:$ARG1$/$ARG2$' -e '$ARG3$' -r '$ARG4$'
}
define command{
 command_name    check_http_json-int
 command_line    /etc/nagios3/plugins/check_http_json.rb -u 'http://$HOSTNAME$:$ARG1$/$ARG2$' -e '$ARG3$' -w '$ARG4$' -c '$ARG5$'
}

# make use of http json check
define service{
 service_description     elasticsearch-cluster-status
 check_command           check_http_json-string!9200!_cluster/health!status!green
}
define service{
 service_description     elasticsearch-cluster-nodes
 check_command           check_http_json-int!9200!_cluster/health!number_of_nodes!4:!3:
}
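The thresholds 4: and 3: in the last service may look odd at first. Below is a simplified sketch, in Python, of how a Nagios-style range of the form [@]start:end can be evaluated; it is an illustration of the threshold format and not the plugin's implementation, and the function names are made up.

```python
def alerts(value, spec):
    """Return True when value should raise an alert for the given
    Nagios-style range spec ([@]start:end)."""
    inside_alert = spec.startswith("@")
    if inside_alert:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        # A bare number N means the range 0..N.
        start_text, end_text = "0", spec
    if start_text == "~":
        start = float("-inf")
    elif start_text == "":
        start = 0.0
    else:
        start = float(start_text)
    end = float(end_text) if end_text else float("inf")
    in_range = start <= value <= end
    # Without "@" the alert fires outside the range; with "@", inside it.
    return in_range if inside_alert else not in_range


def evaluate(value, warn, crit):
    """Critical takes precedence over warning."""
    if alerts(value, crit):
        return "CRIT"
    if alerts(value, warn):
        return "WARN"
    return "OK"


for nodes in (4, 3, 2):
    print(nodes, evaluate(nodes, "4:", "3:"))
```

With these thresholds, four or more nodes is OK, exactly three is a warning, and fewer than three is critical.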

RabbitMQ plugin for Collectd

Hello all,

I wrote a rudimentary RabbitMQ plugin for Collectd.  If that sounds interesting to you, feel free to take a look at my GitHub.  The plugin itself is written in Python and makes use of the Python plugin for Collectd.

It will accept four options from the Collectd plugin configuration :

Locations of binaries :

RmqcBin = /usr/sbin/rabbitmqctl
PmapBin = /usr/bin/pmap
PidofBin = /bin/pidof

Logging :

Verbose = false

It will attempt to gather the following information :

From « rabbitmqctl list_queues » :

messages
memory
consumers

From « pmap » of « beam.smp » :

memory mapped
memory writeable/private (used)
memory shared

Props to Garret Heaton for inspiration and conceptual guidance from his « redis-collectd-plugin ».
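To give an idea of the queue-parsing step, here is a small sketch in Python. It is not lifted from the plugin: it assumes rabbitmqctl was asked for the columns name, messages, memory, and consumers, that the rows are tab-separated, and the sample output is invented for the example.

```python
def parse_list_queues(output):
    """Parse tab-separated `name messages memory consumers` rows into a
    dict keyed by queue name; rows that do not fit are skipped."""
    stats = {}
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # banner or footer line
        name, messages, memory, consumers = fields
        try:
            stats[name] = {
                "messages": int(messages),
                "memory": int(memory),
                "consumers": int(consumers),
            }
        except ValueError:
            continue  # header row or other non-numeric line
    return stats


# Hypothetical output, for illustration only.
sample = "Listing queues ...\nlogs\t12\t34567\t2\n...done.\n"
queues = parse_list_queues(sample)
```

Each value could then be dispatched to Collectd under the queue's name.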

CPAN RPMs in RHEL / CentOS : generation, conflict, and solutions

Hello all !  Today we’re going to take a look at a somewhat obscure problem that – once encountered – can cause nothing but headaches for a system administrator.  The problem relates to conflicts in CPAN RPM packages, and what can be done to work around the issue.  If you’ve made it this far, i’m going to assume a couple of things : you’re comfortable with RPMs and repositories, have worked with a .spec file before, and you know what Perl modules are.  Good ?  Ok, let’s go.

Edit : About a week after i posted this article, the pastebin i uploaded the examples to disappeared.  Maybe it will come back – i don’t know – but if not, sorry for the broken links…

CPAN is an enormous collection of Perl modules.  If you’ve ever written a Perl script, there’s a good chance you’ve used a module that – at one point or another – came from this archive.  One of the really neat features of CPAN is the interactive manner in which modules can be downloaded and installed from the archive using Perl right from the command line (frankly, if you’re reading this post, there’s a good chance you’ve used this feature, too).  This is a fairly common way to install new modules and add functionality to your system, especially if you’re coding for local use (i.e. on your personal box).

It’s useful, but it’s not perfect, and one of the key areas where it starts to fail is scalability : if you’ve got a bunch of machines, and you need to SSH into each one to interactively install a CPAN module or two, it’s going to be a hassle.  Likewise, CPAN doesn’t often find its way into the hearts and minds of enterprise Red Hat or CentOS environments, where the official policy is often to install software via RPM only (for support, administration, and sanity reasons, this is often the case).

Luckily, some of the most commonly used CPAN modules exist as RPMs in the default repositories.  Some, but not all (and not even « many ») – for this, there are other repositories available.  Some examples :

That last one – Magnum – is particularly interesting given the subject of our post today.  From their info page :

At Magnum we have a firm rule that all CPAN modules on our machines are installed from RPMs. The Fedora and Centos projects build RPMs for many CPAN modules, but there are always ones missing and the ones that are available often lag behind the most up to date versions.  For that reason, we build a lot of RPMs of CPAN modules. And we don’t want to keep that work to ourselves, so on these pages we make them available for anyone to download.

Their RPMs are generated automagically using a great tool called « cpanspec », which does exactly what you think it does : given a CPAN tarball, it will generate a .spec file suitable for building an installable RPM.  It is available in the standard repositories, and can be installed easily via YUM as normal, so go ahead and do that now.  Ok, example time : say you needed HTML::Laundry, but after a quick peek through your repositories, it becomes readily apparent that an RPM is not available.  Thanks to cpanspec, all is not lost :

[build@host-119 ~]$ wget http://search.cpan.org/CPAN/authors/id/S/ST/STEVECOOK/HTML-Laundry-0.0103.tar.gz
[build@host-119 ~]$ cpanspec --packager "build <build@domain.ext>" HTML-Laundry-0.0103.tar.gz

We just downloaded the tarball right from the CPAN website, and ran cpanspec against it.  The « --packager » argument simply defines the person who’s generating the .spec, and doesn’t necessarily have to be anything accurate.  Go ahead and try it for yourself.  Now take a look at the resulting .spec file (or on the pastebin here).  As you can see, it fills in all the fields, including the critical (and often tricky-to-determine) « BuildRequires » and « Requires » items.  Frankly, it’s solid gold, and it has made the lives of CentOS / RHEL admins all over the world much easier.

That said, it’s not perfect, and there are times when you might run into problems.  Actually, you may run into two problems in particular.  The first is conflicts over ownership, which arise when multiple RPMs claim to be responsible for the same file (or files, or directories, or features, or whatever).  The second is more nefarious : an RPM that writes files to the system without declaring ownership for them – a condition often referred to as « clobbering ».  The former is irritating, but at least it’s not destructive, unlike the latter, which can cause all manner of headaches.  To illustrate these two problems, let’s take a look at another example (this one being decidedly more real-world than that of Laundry above) : CGI.pm.

The .spec file that is generated from this tarball is functional and correct, and we can build an installable RPM out of it, so at first all appears well.  Again, go ahead and try for yourself – i’ll wait.  You may wish to capture the build output for review – otherwise, check the pastebin.  I’d like to draw your attention to the « Installing » lines.  By trimming the « Installing /var/tmp/perl-CGI.pm.3.49-1-root-root » element from each of those lines, we can see the actual paths and files that this RPM will install to.  Examples :

/usr/lib/perl5/vendor_perl/5.8.8/CGI.pm
/usr/lib/perl5/vendor_perl/5.8.8/CGI/Cookie.pm
/usr/lib/perl5/vendor_perl/5.8.8/CGI/Util.pm
/usr/share/man/man3/CGI.3pm
/usr/share/man/man3/CGI::Pretty.3pm
/usr/share/man/man3/CGI::Cookie.3pm
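If the build log is long, the trimming can be automated. The following Python sketch assumes each relevant line has the form « Installing » followed by the build root and then the final path, as described above; the function name and the two sample log lines are reconstructed for illustration rather than copied from a real build.

```python
def installed_paths(build_output, build_root):
    """Extract the on-system paths from `Installing <root><path>` lines."""
    prefix = "Installing " + build_root
    paths = []
    for line in build_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            paths.append(line[len(prefix):])
    return paths


root = "/var/tmp/perl-CGI.pm.3.49-1-root-root"
log = "\n".join([
    "Installing " + root + "/usr/lib/perl5/vendor_perl/5.8.8/CGI.pm",
    "Installing " + root + "/usr/share/man/man3/CGI.3pm",
])
print(installed_paths(log, root))
```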

At first glance this looks perfectly acceptable.  But look what happens when we try to install the resulting RPM (clipped for brevity) :

[root@host-119 build]# rpm -iv /usr/src/redhat/RPMS/noarch/perl-CGI.pm-3.49-1.noarch.rpm
Preparing packages for installation...
file /usr/share/man/man3/CGI.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64
file /usr/share/man/man3/CGI::Cookie.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64
file /usr/share/man/man3/CGI::Pretty.3pm.gz from install of perl-CGI.pm-3.49-1.noarch conflicts with file from package perl-5.8.8-27.el5.x86_64

As it turns out, the Perl package that comes with RHEL / CentOS already contains CGI.pm.  This is normal, since it’s so popular, and is included as a convenience.  Thus, RPM – in an attempt to preserve the coherence of the package management system – refuses to install overtop of the existing owned files.  This is a fine illustration of the first of the two problems previously noted : conflicts over ownership.  As i mentioned above, it’s aggravating, but it’s not a bug – it’s a feature, and it’s doing exactly what it’s designed to do.  Irritating, but not ultimately dire.

If you look carefully, though, it’s also an illustration of the second problem.  Note the list of files that are conflicting.  Look back to the list of files that the package contains – notice anything missing from the conflicts list ?  That’s right – the actual module files (*.pm) are not showing conflicts, which means they’d get overwritten without complaint by RPM.  You might be thinking « who cares ? that’s what i want » right now, but trust me, it’s not what you want.  Imagine that this CGI package, with this version of CGI.pm, gets installed, and then later you upgrade the Perl package – your CGI.pm files will get overwritten by the Perl package, because as far as RPM is concerned, Perl owns those files.  All of a sudden, things break because you had scripts that relied on your particular version, but since you just upgraded Perl, you think (quite naturally) that the problem could be anywhere – where do you even start looking ?

Imagine the headache if there are multiple administrators, multiple servers, multiple data centres, and multiple clients paying multiple dollars.  No fun at all.

So how can we upgrade CGI.pm, using an RPM, without running into these problems ?  As is often the case, the answer is deceptively simple, but not immediately obvious.  Ultimately what we want to accomplish is twofold :

  • Avoid the man conflicts.
  • Ensure that the existing owned module files are not clobbered by our new package.

Concerning the man pages – and i’m going to be perfectly blunt here – the solution is to simply not install them, since, of course, they’re already there.  As for avoiding a clobbering condition, this requires a little bit of investigation into how Perl modules and libraries are stored on an RHEL / CentOS machine.  Consider the following output :

[root@host-119 ~]# ls -d /usr/lib64/perl5/*
/usr/lib64/perl5/5.8.8  /usr/lib64/perl5/site_perl  /usr/lib64/perl5/vendor_perl

What’s it all mean ?  Well, the « 5.8.8 » directory is the default directory as defined by the Perl architecture, and is system and platform-agnostic, which is to say that it’s (supposed to be) the same on every system.  The « vendor_perl » directory contains everything that is specific to RHEL / CentOS (the « vendor » of the distribution).  As you may recall from the rpmbuild output above, this is where the RPM wants to install the modules (thus creating the clobbering condition).

There’s a third directory there, promisingly named « site_perl » ; as the name implies, this is where site-specific files are stored, which is to say items that are neither part of the default Perl architecture, nor part of the RHEL / CentOS distribution.  As you’ve no doubt guessed by now, site_perl is where we’re going to put our new modules.

Luckily for us, the only thing that needs to be changed is the .spec file – and we even get a headstart, since cpanspec does most of the heavy lifting for us.  Examining the .spec file once more, we see the following lines of note (again, cut for brevity) :

%build
%{__perl} Makefile.PL INSTALLDIRS=vendor
%files
%{perl_vendorlib}/*

These indicate that the target installation directory is that of the vendor, which is normally the case, and thus the default setting.  Since we want to install to the site directory, we make the following changes :

%build
%{__perl} Makefile.PL INSTALLDIRS=site
%files
%{perl_sitelib}/*

That solves our clobbering problem quite nicely, but what about the man files ?  As i mentioned above, the idea is to simply avoid installing them altogether, but since they’re generated automatically during the build process, how can we exclude them ?  What i’m about to present is a bit of a hack, but it’s absolutely effective, and ultimately quite clean : we delete them after they’ve been generated, and then don’t declare them in the file list.  Some items are already being potentially deleted by default, so let’s go ahead and add our own line into the mix :

find $RPM_BUILD_ROOT -depth -type d -exec rmdir {} 2>/dev/null \;
# destroy manified man, man.
find $RPM_BUILD_ROOT -type f -name '*.3pm' -exec rm -f {} \;

This will look for all of the « manified » man files and just remove them from the build tree.  All that’s left now is to remove them from the file list.  This is as simple as deleting (or commenting out) their sole declaration :

#%{_mandir}/man3/*
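When there are many modules to package, the textual changes to the .spec file can be applied by a script. The sketch below, in Python, only performs the three substitutions described so far (the two vendor-to-site changes and commenting out the man page declaration); it does not add the extra « find » line, and the function name and sample spec fragment are made up for the example.

```python
def retarget_spec(spec_text):
    """Switch a cpanspec-generated spec from vendor to site install
    directories and comment out the man page declaration."""
    out = []
    for line in spec_text.splitlines():
        line = line.replace("INSTALLDIRS=vendor", "INSTALLDIRS=site")
        line = line.replace("%{perl_vendorlib}", "%{perl_sitelib}")
        if line.strip() == "%{_mandir}/man3/*":
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


# Abbreviated fragment, for illustration only.
sample_spec = "\n".join([
    "%build",
    "%{__perl} Makefile.PL INSTALLDIRS=vendor",
    "%files",
    "%{perl_vendorlib}/*",
    "%{_mandir}/man3/*",
])
print(retarget_spec(sample_spec))
```

It is still worth reading the resulting file before building, since generated specs vary from module to module.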

Another option is to simply use the « --excludedocs » argument when installing the RPM.  I opted to remove the docs altogether in order to ensure that the package can be installed without errors by anyone else without needing to know about the argument requirement ahead of time (and to facilitate automated rollouts).

What you’ll end up with is a .spec file that looks like this.  Go ahead and build your RPM – it’ll install without conflicts and without danger.  This is a technique that can be used for other CPAN packages as well, so go ahead and install everything you’ve always wanted.

(complex) partitioning in kickstart

UPDATE: This article was written back in 2009. According to a commenter below, Busybox has been replaced by Bash in RHEL 6; perhaps Fedora as well?

Bonjour my geeky friends ! 🙂  As you are likely aware, it is now summer-time here in the northern hemisphere, and thus, i’ve been spending as much time away from the computer as possible.  That said, it’s been a long time, i shouldn’t have left you, without a strong beat to step to.

Now, if you’re not familiar with kickstarting, it’s basically just a way to automate the installation of an operating environment on a machine – think hands-free installation.  Anaconda is the OS installation tool used in Fedora, Red Hat, and some other Linux OSes, and it can be used in a kickstart capacity.  For those of you looking for an intro, i heavily suggest reading over the excellent documentation at the Fedora project website.  The kickstart configuration process could very easily be a couple of blog entries on its own (which i’ll no doubt get around to in the future), but for now i want to touch on one particular aspect of it : complex partition schemes.

how it is

The current method for declaring partitions is relatively powerful, in that all manner of basic partitions, LVM components, and even RAID devices can be specified – but where it fails is in the creating of the actual partitions on the disk itself.  The options that can be supplied to the partition keywords can make this clunky at best (and impossible at worst).

A basic example of a partitioning scheme that requires nothing outside of the available functions :

DEVICE                 MOUNTPOINT               SIZE
/dev/sda               (total)                  500,000 MB
/dev/sda1              /boot/                       128 MB
/dev/sda2              /                         20,000 MB
/dev/sda3              /var/log/                 20,000 MB
/dev/sda5              /home/                   400,000 MB
/dev/sda6              /opt/                     51,680 MB
/dev/sda7              swap                       8,192 MB

Great, no problem – we can easily define that in the kickstart :

part  /boot     --asprimary  --size=128
part  /         --asprimary  --size=20000
part  /var/log  --asprimary  --size=20000
part  /home                  --size=400000
part  /opt                   --size=51680
part  swap                   --size=8192

But what happens if we want to use this same kickstart on another machine (or, indeed, many other machines) with a different disk size ?  One of the options that can be used with the « part » keyword is « --grow », which tells Anaconda to create as large a partition as possible.  This can be used along with « --maxsize= », which does exactly what you think it does.

Continuing with the example, we can modify the « /home » partition to be of a variable size, which should do us nicely on disks which may be smaller or larger than our original 500GB unit.

part  /home  --size=1024  --grow

Here we’ve stated that we’d like the partition to be at least a gig, but that it should otherwise be as large as possible given the constraints of both the other partitions, as well as the total space available on the device.  But what if you also want « /opt » to be variable in size ?  One way would be to grow both of them :

part  /home  --size=1024  --grow
part  /opt   --size=1024  --grow

Now, what do you think that will do ? If you guessed « grow both of them to half the total available size each », you’d be correct.  Maybe this is what you wanted – but then again, maybe it wasn’t.  Of course, we could always specify a maximum ceiling on how far /opt will grow :

part  /opt  --size=1024  --maxsize=200000  --grow

That works, but only at the potential expense of /home.  Consider what would happen if this was run against a 250GB disk ; the other (static) partitions would eat up some 48GB, /opt would grow to the maximum specified size of 200GB, and /home would be left with the remaining 2GB of available space.
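The numbers in that scenario are easy to check. This Python snippet is just arithmetic that follows the description above, treating 250GB as 250,000 MB; it is not a model of how Anaconda actually allocates space, and the names are invented for the example.

```python
# Static partitions from the example scheme, in MB.
STATIC_MB = {"/boot": 128, "/": 20000, "/var/log": 20000, "swap": 8192}


def remaining_for_home(disk_mb, opt_max_mb):
    """Space left for /home once the static partitions are placed and
    /opt has grown to its --maxsize ceiling."""
    static_total = sum(STATIC_MB.values())
    return disk_mb - static_total - opt_max_mb


print(sum(STATIC_MB.values()))              # 48320
print(remaining_for_home(250000, 200000))   # 1680
```

So /home ends up with a little under 2GB, which is the squeeze described above.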

If we were to add more partitions into the mix, the whole thing would become an imprecise mess rather quickly.  Furthermore, we haven’t even begun to look at scenarios where there may (or may not) be more than one disk, nor any fun tricks like automatically setting the swap size to be the same as the actual amount of RAM (for example).  For these sorts of things we need a different approach.

the magic of pre, the power of parted

The kickstart configuration contains a section called « %pre », which should be familiar to anybody who’s dealt with RPM packaging.  Basically, the pre section contains text which will be parsed by the shell during the installation process – in other words, you can write a shell script here.  Fairly be thee warned, however, as the shell spawned by Anaconda is « BusyBox », not « bash », and it lacks some of the functionality that you might expect.  We can use the %pre section to our advantage in many ways – including partitioning.  Instead of using the built-in functions to set up the partitions, we can do it ourselves (in a manner of speaking) using « parted ».

Parted is, as you might expect, a tool for editing partition data.  Generally speaking it’s an interactive tool, but one of the nifty features is the « scripted mode », wherein partitioning commands can be passed to Parted on the command-line and executed immediately without further intervention.  This is very handy in any sort of automated scenario, including during a kickstart.

We can use Parted to lay the groundwork for the basic example above, wherein /home is dynamically sized.  Initially this will appear inefficient, since we won’t be doing anything that can’t be accomplished by using the existing Kickstart functionality, but it provides an excellent base from which to do more interesting things.  What follows (until otherwise noted) are text blocks that can be inserted directly into the %pre section of the kickstart config :

# clear the MBR and partition table
dd if=/dev/zero of=/dev/sda bs=512 count=1
parted -s /dev/sda mklabel msdos

This ensures that the disk is clean, so that we don’t run into any existing partition data that might cause trouble.  The « dd » command overwrites the first bit of the disk, so that any basic partition information is destroyed, then Parted is used to create a new disk label.

TOTAL=`parted -s /dev/sda unit mb print free | grep Free | awk '{print $3}' | cut -d "M" -f1`

That little line gives us the total size of the disk, and assigns it to a variable named « TOTAL ».  There are other ways to obtain this value, but in keeping with the spirit of using Parted to solve our problems, this works.  In this instance, « awk » and « cut » are used to extract the string we’re interested in.  Continuing on…
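As one of those « other ways », the grep / awk / cut chain can be folded into a single awk call.  What follows is a sketch, not a drop-in replacement : the function name « free_mb » is invented here, the sample row is a made-up illustration of what a « Free Space » line might look like, and unlike the original it stops at the first such row.

```shell
# Extract the size (third field) of the first "Free Space" row and strip the unit.
# Expects the output of `parted -s <device> unit mb print free` on stdin.
free_mb() {
  awk '/Free/ { sub(/MB$/, "", $3); print $3; exit }'
}

# On a real system this would be:
#   TOTAL=$(parted -s /dev/sda unit mb print free | free_mb)
# Hypothetical sample row, for illustration only:
TOTAL=$(echo "        0.03MB  250059MB  250059MB  Free Space" | free_mb)
echo "$TOTAL"
```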

# calculate start points
let SWAP_START=$TOTAL-8192
let OPT_START=$SWAP_START-51680

Here we determine the starting position for the swap and /opt partitions.  Since we know the total size, we can subtract 8GB from it, and that gives us where the swap partition starts.  Likewise, we can calculate the starting position of /opt based on the start point of swap (and so forth, were there other partitions to calculate).
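To make the subtraction concrete, here is a sketch that runs the same calculation against a hypothetical 250000 MB disk.  The disk size is invented for illustration, the arithmetic is written with « $(( )) » rather than « let », and the guard at the end is an addition that does not appear in the original.

```shell
TOTAL=250000   # hypothetical disk size in MB, for illustration only

# Same subtraction as above, written with POSIX $(( )) arithmetic.
SWAP_START=$((TOTAL - 8192))        # swap occupies the last 8192 MB
OPT_START=$((SWAP_START - 51680))   # /opt sits directly before swap

# Guard (an addition, not in the original): /home begins at 40256 MB,
# so /opt has to start somewhere after that point.
if [ "$OPT_START" -le 40256 ]; then
  echo "disk too small for this layout" >&2
else
  echo "swap starts at ${SWAP_START} MB, /opt starts at ${OPT_START} MB"
fi
```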

# partitions IN ORDER
parted -s /dev/sda mkpart primary ext3 0 128
parted -s /dev/sda mkpart primary ext3 128 20128
parted -s /dev/sda mkpart primary ext3 20128 40256
parted -s /dev/sda mkpart extended 40256 $TOTAL
parted -s /dev/sda mkpart logical ext3 40256 $OPT_START
parted -s /dev/sda mkpart logical ext3 $OPT_START $SWAP_START
parted -s /dev/sda mkpart logical $SWAP_START $TOTAL

The variables we populated above are used here in order to create the partitions on the disk.  The syntax is very simple :

  • « parted -s »  : run Parted in scripted (non-interactive) mode.
  • « /dev/sda » : the device (later, we’ll see how to determine this dynamically).
  • « mkpart » : the action to take (make partition).
  • « primary | extended | logical » : the type of partition.
  • « ext3 » : the type of filesystem (there are a number of possible options, but ext3 is pretty standard).
    • Notice that the « extended » and « swap » definitions do not contain a filesystem type – it is not necessary.
  • « start# end# » : the start and end points, expressed in MB.
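Since these commands rewrite the partition table, it can be handy to see them before running them.  The following is a minimal sketch of a dry-run wrapper ; « DRY_RUN », « DEVICE » and the « mkpart » function are names invented for this example, not features of Parted itself.

```shell
# Hypothetical dry-run switch: with DRY_RUN=1 the commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}
DEVICE=${DEVICE:-/dev/sda}

mkpart() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "parted -s $DEVICE mkpart $*"
  else
    parted -s "$DEVICE" mkpart "$@"
  fi
}

# The first two partitions from the layout above.
mkpart primary ext3 0 128
mkpart primary ext3 128 20128
```

Setting DRY_RUN=0 would pass the same arguments straight through to Parted.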

Finally, we must still declare the partitions in the usual way.  Take note that this does not occur in the %pre section – this goes in the normal portion of the configuration for defining partitions :

part  /boot     --onpart=/dev/sda1
part  /         --onpart=/dev/sda2
part  /var/log  --onpart=/dev/sda3
part  /home     --onpart=/dev/sda5
part  /opt      --onpart=/dev/sda6
part  swap      --onpart=/dev/sda7

As i mentioned when we began this section, yes, this is (so far) a remarkably inefficient way to set this particular basic configuration up.  But, to reiterate, this exercise is about putting the groundwork in place for much more interesting applications of the technique.

mo’ drives, mo’ better

Perhaps some of your machines have more than one drive, and some don’t.  These sorts of things can be determined, and then reacted upon dynamically using the described technique.  Back to the %pre section :

# Determine number of drives (one or two in this case)
set $(list-harddrives)
let numd=$#/2
d1=$1
d2=$3

In this case, we’re using a built-in function called « list-harddrives » to help us determine which drive or drives are present, and then assign their device identifiers to variables.  In other words, if you have an « sda » and an « sdb », those identifiers will be assigned to « $d1 » and « $d2 », and if you just have an sda, then $d2 will be empty.
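Because « list-harddrives » only exists inside the installer, the snippet is hard to try out elsewhere.  The sketch below swaps in a stand-in function so the parsing logic can be exercised on any machine ; the stub name, the device names and the sizes are all invented, and the « name size » output format is assumed from the way the original reads fields 1 and 3.

```shell
# Stand-in for Anaconda's list-harddrives, which only exists in the installer.
# The device names and sizes below are made up for illustration.
list_harddrives_stub() {
  printf '%s\n' "sda 250059.0" "sdb 500107.0"
}

detect_drives() {
  # Word-splitting is intentional: each line contributes a name and a size.
  set -- $(list_harddrives_stub)
  numd=$(($# / 2))
  d1=$1
  d2=$3
}

detect_drives
echo "drives: $numd, first: $d1, second: ${d2:-none}"
```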

This gives us some interesting new options ; for example, if we wanted to put /home on to the second drive, we could write up some simple logic to make that happen :

# if $d2 has a value, it's that of the second device.
if [ -n "$d2" ]
then
  HOMEDEVICE=$d2
else
  HOMEDEVICE=$d1
fi

# snip...
part  /home  --size=1024  --ondisk=/dev/$HOMEDEVICE  --grow

That, of course, assumes that the other partitions are defined, and that /home is the only entity which should be grown dynamically – but you get the idea.  There’s nothing stopping us from writing a normal shell script that could determine the number of drives, their total size, and where the partition start points should be based on that information.  In fact, let’s examine this idea a little further.
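One caveat : shell variables set in %pre are not expanded in the command section of the kickstart file, so a « part » line that mentions $HOMEDEVICE has to be generated by the script itself.  A common way to do this is to write the line to a file from %pre and pull it in with « %include ».  The sketch below assumes that approach ; the file path and the function name are choices made for this example.

```shell
# Where the generated snippet is written; /tmp/part-include is a conventional choice.
INCLUDE_FILE=${INCLUDE_FILE:-/tmp/part-include}

# Pick the second drive for /home when there is one, otherwise the first,
# then write the resulting part line out for the command section to include.
write_home_part() {
  if [ -n "$d2" ]; then
    HOMEDEVICE=$d2
  else
    HOMEDEVICE=$d1
  fi
  echo "part  /home  --size=1024  --ondisk=/dev/$HOMEDEVICE  --grow" > "$INCLUDE_FILE"
}

# d1 and d2 would come from the list-harddrives step; defaults here are for illustration.
d1=${d1:-sda}
d2=${d2:-}
write_home_part
cat "$INCLUDE_FILE"
```

The command section then carries a single line, « %include /tmp/part-include », in place of the literal part line for /home.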

the size, she is dynamic !

Instead of trying to wrangle the partition sizes together with the default options, we can get as complex (or as simple) as we like with a few if statements, and some basic maths.  Thinking about our layout then, we can express something like the following quite easily :

  • If there is one drive that is at least 500 GB in size, then /opt should be 200 GB, and /home should consume the rest.
  • If there is one drive that is less than 500 GB, but at least 250 GB, then /opt and /home should each take half.
  • If there is one drive that is less than 250 GB, then /home should take two-thirds, and /opt gets the rest.

# $TOTAL and $SWAP_START from above...
if [ $TOTAL -ge 512000 ]
then
  let OPT_START=$SWAP_START-204800
elif [ $TOTAL -lt 512000 ] && [ $TOTAL -ge 256000 ]
then
  # get the dynamic space total, which is between where /var/log ends, and swap begins
  let DYN_TOTAL=$SWAP_START-40256
  # /opt starts halfway through the dynamic space, measured from where /var/log ends
  let OPT_START=40256+$DYN_TOTAL/2
elif [ $TOTAL -lt 256000 ]
then
  let DYN_TOTAL=$SWAP_START-40256
  # /home takes two-thirds of the dynamic space ; /opt starts right after it
  let OPT_START=$DYN_TOTAL/3
  let OPT_START=40256+$OPT_START+$OPT_START
fi
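To sanity-check the three rules, they can be wrapped in a function and fed a few disk sizes.  This is a sketch under the layout used throughout this post (swap in the last 8192 MB, the dynamic area beginning at 40256 MB where /var/log ends) ; the function name and the test sizes are invented, and the start point of /opt is measured from the beginning of the disk, so the 40256 MB offset is added back in.

```shell
# Start of the dynamic area: where /var/log ends in the layout above.
DYN_START=40256

# Print the /opt start point (in MB) for a disk of the given total size.
opt_start_for() {
  total=$1
  swap_start=$((total - 8192))
  dyn_total=$((swap_start - DYN_START))
  if [ "$total" -ge 512000 ]; then
    echo $((swap_start - 204800))             # /opt is a fixed 200 GB
  elif [ "$total" -ge 256000 ]; then
    echo $((DYN_START + dyn_total / 2))       # /home and /opt take half each
  else
    echo $((DYN_START + dyn_total * 2 / 3))   # /home takes two-thirds
  fi
}

for size in 600000 300000 200000; do
  echo "$size MB disk: /opt starts at $(opt_start_for "$size") MB"
done
```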

Now, instead of having to create three different kickstart files, each describing a different scenario, we’ve covered it with one – nice !

other possibilities

At the end of the day, the possibilities are nearly endless, with the only restriction being that whatever you’d like to do has to be do-able in BusyBox – which, at this level, provides a lot of great functionality.

Stay tuned for more entries related to kickstarting, PXE-based installations, and so forth, all to come here on dan’s linux blog.  Cheers !

how to be properly lazy, with perl !

One of the wonderful things about Perl is that it enables the busy System Administrator to be lazy – and that’s a good thing ! Of course, i don’t mean lazy as in unmotivated, or possessed of a poor work ethic, i mean it in the sense that Perl lets us do as little work as possible in a wide variety of situations. Let’s examine this idea, shall we ?

In the computer world, one often finds themselves doing the same sorts of things over and over again, such as adding a new user to the network, or verifying that the backups executed properly. Usually, these are relatively simple processes which are less about problem solving, and more about following the same set of steps over and over until the desired goal is attained. It is in these situations that the (properly) lazy admin identifies a way to automate these processes as much as possible, so that he or she can get back to more brain-intensive work (this has the net effect of improving overall efficiency and value – see how laziness pays off in the end ? 🙂 )

There are, of course, as many scripting and programming languages as there are grains of sand on a beach, but despite the many competitors and alternatives out there, Perl remains the language of choice for many Linux admins around the world. This is in no small part due to Perl’s ability to manipulate data in a rapid, logical, and easily deployable manner – the most obvious example of this being the vaunted « Perl One-Liner ».

example !

There comes a time in every admin’s life when they must take a bunch of text files, and systematically swap some of the text within with new data – commonly known as searching and replacing.  You could certainly do this by hand using an editor or by using a relatively straightforward C program if you were so inclined.  But there is another way – a better, smarter, lazier way : the Perl search & replace one-liner !  Let’s take a look at the code, then break down each component.

$ perl -p -i -e 's/oldandbusted/newhotness/' *.txt

That’s it, you’re done – take a lap and hit the showers.  So, what exactly just happened there ?  We employed a classic and very common usage method in command-line Perl which can easily be remembered as « pie » :

  • « -p » : In a nutshell, this tells Perl to loop through each line of input, then perform the desired action (in this case, the search & replace) against each of those lines.
  • « -i » : This instructs Perl to actually edit the input files directly (or « in place »), instead of just displaying the changes on the screen.
  • « -e » : This tells Perl that the next argument is the code to execute – in this case, the search and replace regular expression…
  • « 's/old/new/' » : This is the regular expression (or « regex ») which Perl will use to perform the search & replace.  (What’s a regex ?  Wikipedia has the answers you seek !)
  • « *.txt » : The target filename – in this case, a simple glob.  (What’s a glob ?  Wikipedia has the answer !)

The key to this whole operation was the fourth bullet point – the regex.  Don’t worry if your regex-fu is not yet strong – this is just an example, and it could have been anything – the point is that Perl can be used to rapidly execute regular expressions on data in simple, easy to execute ways, such as the search & replace one-liner above.  This sort of thing comes in handy on a daily basis, and thus, the perl one-liner is a powerful tool in the System Administrator’s toolbox.

For more one-liners, use the Google : http://www.google.fr/search?q=perl+one-liners