How Many Ways Can We Fail Today?

The SGI cluster here at WeSC is beginning to get me down. One of its nodes has been down since I started work here, and I’ve finally gotten around to looking at it.

Step 1 was to try to get access to the serial console. After hunting around for a cable I then had to fight with minicom to get it running, a process that would have been significantly easier if the terminals weren’t all running at a non-standard baud rate. Anyway, a quick reboot of the machine showed that it was finding its internal disk but failing to find its OS. Given the number of times the power has failed recently, I wouldn’t be at all surprised if the partition table is corrupted. So it seemed like a re-install was worth trying before getting a replacement disk.

After finding a set of Irix install instructions that I could actually understand, I hooked up the ancient external SCSI CD drive and put in the disc that contains the install tools. It took a couple of attempts to convince the drive that it should close, but after that it made all the right kinds of whirring noises and I was quietly hopeful.

So boot to the command monitor and:

boot -f cdrom(1,1,7)sash64

And the monitor helpfully responds with a ‘no media found’ message. After trying several other CDs I realized that I couldn’t even ls them, never mind run them. The conclusion? Knackered CD drive. Arse.

For my next trick: installing over bootp using an SGI Fuel workstation as a server.