Category Archives: SysAdmin

A Lack of Organisation on Your Part Does Not Constitute an Emergency on Mine

When we discuss a piece of monitoring software that you would like written, and I tell you that we can collect any data you want but that you have to tell me up-front what you want reports on so that the data can be collected, then sending me those requirements 24 hours before you need the report generated will not get the response “Right away, Sir. I’ll magically collect all those data you didn’t tell me you needed.” My response will be pithy and Anglo-Saxon.

Love and Hugs,
Huw

Dealing With Stupid Programs That Think They Need X

The new compute cluster is beginning to feel like a production system. I’m currently run off my feet installing software for the stream of new users. Mostly this is fine, but occasionally I run into software that makes me want to bang my head repeatedly on my desk until the pain goes away; or, more accurately, makes me want to bang the programmer’s head on the desk.

Just today we received a Linux port of a code that has been running on the Windows Condor pool for a while now. Everything seemed fine except for its stubborn refusal to run if it couldn’t find a windowing system. Bear in mind that it doesn’t actually produce any graphical output; it just dies if it can’t connect to X. After a bit of futzing around we discovered that the people who normally run this code do something like:

Xvfb :1 -screen 0 1024x1024x8 &
export DISPLAY=:1
./stupid_code_that_wants_X

Xvfb is the X virtual framebuffer. It runs an X server that renders into memory rather than onto real graphics hardware, giving programs a DISPLAY to connect to without any actual display being present.

Which works just great locally, but if you want to do that from a script submitted to the job scheduling system (we use PBS Pro) then you need to be a bit more careful. What happens if two of these jobs land on the same machine? Obviously one of them will fail, because display 1 is already allocated. What I really needed was a script that tries to launch Xvfb and increments DISPLAY on failure until it finds a display that is free. For your edification, here it is:

get_xvfb_pid () {
	# PID of the most recently started Xvfb belonging to this user
	XVFB_PID=`ps -efww | grep -v grep | grep Xvfb |\
		grep $USERNAME | tail -n 1 | awk '{print $2}'`
	}

create_xvfb () {
	USERNAME=`whoami`
	xvfb_success=""
	DISPLAYNO=1
	while [ -z "$xvfb_success" ]
		do
		get_xvfb_pid
		old_XVFB_PID=$XVFB_PID
		XVFB_PID=""
		# try to start Xvfb on the current display number
		Xvfb :${DISPLAYNO} -screen 0 1024x1024x8 >& /dev/null &
		sleep 1
		get_xvfb_pid
		if [ -n "$old_XVFB_PID" ]
			then
			# an Xvfb was already running, so success means a new,
			# different PID has appeared
			if [ -n "$XVFB_PID" ] && [ "$XVFB_PID" != "$old_XVFB_PID" ]
				then
				echo "Started XVFB on display $DISPLAYNO process $XVFB_PID"
				xvfb_success=1
			else
				DISPLAYNO=$(($DISPLAYNO + 1))
				XVFB_PID=""
			fi
		else
			# nothing was running before, so any PID at all means success
			if [ -n "$XVFB_PID" ]
				then
				echo "Started XVFB on display $DISPLAYNO process $XVFB_PID"
				xvfb_success=1
			else
				echo "Failed to start Xvfb on display $DISPLAYNO"
				DISPLAYNO=$(($DISPLAYNO + 1))
			fi
		fi
		done
	export XVFB_PID
	export DISPLAY=:${DISPLAYNO}
	}

kill_xvfb () {
	kill $XVFB_PID
	}

Which you can call from a script like this:

[arccacluster8]$ . ./xvfb_helper
[arccacluster8]$ create_xvfb
Started XVFB on display 1 process 9563
[arccacluster8 ~]$ echo $DISPLAY
:1
[arccacluster8 ~]$ echo $XVFB_PID
9563
[arccacluster8 ~]$ ps -efw | grep Xvfb
username    9563  9498  0 19:31 pts/8    00:00:00 Xvfb :1 -screen 0 1024x1024x8
[arccacluster8 ~]$ kill_xvfb
[arccacluster8 ~]$ ps -efw | grep Xvfb
[arccacluster8 ~]$
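
In a batch job the same functions can be wrapped around the offending program. Here’s a rough sketch of what a PBS job script might look like (the helper filename, resource request and program name are just placeholders for whatever you actually use):

#!/bin/bash
#PBS -N needs_xvfb
#PBS -l select=1:ncpus=1

cd $PBS_O_WORKDIR

# pull in the helper functions and start a virtual framebuffer
. ./xvfb_helper
create_xvfb

# run the program that refuses to start without a DISPLAY
./stupid_code_that_wants_X

# clean up the Xvfb we started
kill_xvfb

Calling kill_xvfb at the end matters: the Xvfb was started in the background, so without it every job leaves an orphaned Xvfb running on the compute node.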

I submit that this is a disgraceful hack, but it might come in handy for someone else.

Moving ZFS filesystems between pools

When I originally set up ZFS on my development V880 I added the internal disks as a raidz together with two volumes off the external fibre-channel array. As is the way with these things, the development box has gradually become a production box, and I now realise that if the server goes pop I can’t just move the fibre-channel over to another server, because the ZFS pool contains that set of internal SCSI disks.

To my horror I now discover that you can’t remove a top-level device (a vdev, in ZFS parlance) from a pool. Fortunately I have two spare volumes on the array, so I can create a new pool and transfer the existing ZFS filesystems to it. Here is a quick recipe for transferring ZFS filesystems whilst keeping downtime to a minimum.

zfs snapshot oldpool/myfilesystem@snapshot1

zfs send oldpool/myfilesystem@snapshot1 | zfs receive newpool/myfilesystem

This will take a while, but the filesystem can stay in use while you are doing it. Once it finishes you need to shut down any services that rely on the filesystem and unmount it.

zfs unmount oldpool/myfilesystem

And take a new snapshot.

zfs snapshot oldpool/myfilesystem@snapshot2

You can now do an incremental send of the difference between the two snapshots, which should be very quick.

zfs send -i oldpool/myfilesystem@snapshot1 \
             oldpool/myfilesystem@snapshot2 | zfs receive newpool/myfilesystem

Now you can point the services at the new filesystem and repeat the process until all the filesystems on the original pool have been transferred.
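
If the services refer to the mount point rather than to the pool name, one way to make the switchover painless is to swap the mount points over before restarting them. Something along these lines, assuming the old filesystem was mounted at /export/myfilesystem (the path here is just an example):

zfs set mountpoint=none oldpool/myfilesystem
zfs set mountpoint=/export/myfilesystem newpool/myfilesystem

Once everything has been moved and checked, the old pool can eventually be destroyed with zpool destroy oldpool, which frees up the internal disks again.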

One of those days

Due to a scarcity of meetings (a rare thing these days) I thought I might actually be able to get some work done today. What I actually ended up doing was:

  1. trying to work out why my Solaris 10 V880 decided to reboot
  2. cobbling together enough spare parts to build a workstation after mine went pop.

On the plus side Fedora 6 runs surprisingly well on a Pentium III with 256MB of RAM. Getting the BIOS to believe that it really did have a 120GB hard drive took a bit of work though.

Here’s hoping for a better day tomorrow.

Big Disks?

We’re probably going to need a large amount of disk space shortly. It’s basically somewhere to back things up, so it doesn’t need to be terribly fast. I’ve been having a look around and I’ve come up with two possibilities.

Sun X4500

  • 24TB SATA
  • 4U
  • Software RAID (ZFS)
  • well engineered
  • Sun support
  • 20k with academic pricing

DNUK Teravault

  • 27TB SATA
  • 6U
  • Hardware RAID (Areca 110)
  • DNUK rails are usually horrid
  • 13k full price

The X4500 is smaller and I know it will be less hassle to physically install. But the DNUK box is a lot cheaper and has more storage. From looking at the hardware specs I think that the X4500 is the superior product, but I’ve no reason to believe the Teravault won’t get the job done.

If anyone has had hands-on experience of either box I would really like to hear about it.

It’s 2am: do you know where your ZFS pools are?

No, no I don’t.

wesc21-comsc# zpool list
no pools available
wesc21-comsc# df -h
Abort (core dumped)

Bollocks. I’m beginning to think this machine is cursed. The mounted ZFS filesystems are still there and appear to be functioning, so I guess I can fix this tomorrow. I don’t really want to interrupt the MySQL database that’s indexing 40GB of data.

Strangely, googling for “Where the hell did my ZFS go?” doesn’t return any useful results.