Minimalist’s guide to NetBoot

I recall my fourth grade teacher giving a lesson on issuing directions. He asked every student to write down, with exacting detail, the steps involved in making a peanut butter and jelly sandwich. Then he gathered all the ingredients and attempted to execute the directions. Of all the kids in the class, I came closest to directing him towards making an edible sandwich… but unfortunately I forgot to instruct him to remove the second piece of bread from the bag. The result was as you’d expect: he smeared the PB&J onto the slice of bread while it was still in the bag, going forward as if nothing was wrong. Hilarious at the time, but the point was made.

Turns out that lesson had an application that neither I nor the teacher realized at the time: it’s a metaphor for various types of programming, and for computers in general, the most literal embodiment of “do what I say, not what I mean”. Many of my classmates might have gotten farther had they remembered to include their dependencies in the beginning (you can’t apply jelly if you don’t get jelly first).

And so it was with configuring a generic NetBoot server. There are many ‘NetBoot from a Linux box’ type documents out there, but it seems they all fail to mention at least one key thing you wouldn’t know if you didn’t have a machine running OS X Server to refer to. It was for that reason that I’d been shying away from configuring NetBoot on my network for quite some time.

Recently I came upon a situation where I needed to image a bunch of Macs in a hurry, and NetBoot was the only tool for the job. So I waded through the myriad of tools and documentation, and whittled it all down to the few steps below. Turns out it’s not so bad once you dig down to the right details.

All the most crucial, NetBoot-specific details are outlined below… no bells and whistles, just the things you need to get a Mac booting from your *NIX box. If you’re comfortable with the process of using PXE to feed bootloaders and kernels to PCs (as I am), you’ll have no problem getting NetBoot to work. If not, just follow the directions carefully.

 

tl;dr – There’s no magic in OS X Server that makes NetBoot work… just incorporate the config snippets below and you’re golden.

 

What you’ll need:

  • A DHCP server. I used ISC dhcpd; if you prefer something else, chances are you’ll be able to translate.
  • A TFTP server. From what I can tell, any one will do.
  • An NFS server. Again, nothing fancy is required. (HTTP can also be used, but isn’t covered here)
  • Mac OS X install media. I chose to use the DVD for 10.5, since it supports booting both PowerPC and Intel Macs.
  • Server Admin Tools. Freely available from Apple’s website – download the version which matches your install media (above).
  • A Mac. Used to gather the needed pieces and create images. (In the future I’d like to reverse-engineer the process, eliminating the need for a running Mac and making the whole thing more flexible.)

How to do it:

I’ll spare you all the details of prepping your *NIX box and installing the services above. You’re no idiot, and chances are you’re already running most of them anyways. The idea here is to integrate NetBoot into your (most likely) existing network, so we won’t waste time reinventing the wheel.

Make the NetBoot set. Download and install the Server Admin Tools. Once that’s done, look inside Applications -> Server for the System Image Utility. Mount your OS X install media, and run the utility.

Once the utility starts, you’re just a few clicks away from creating a NetBoot set – which will include the bootloader, kernel, driver cache, and disk image you need to boot a Mac over the network. So get clicking!

After a while, you’ll be left with a directory like this:

In my case, there’s one disk image and two sets of bootloader/kernel/drivers – one for PowerPC, and one for Intel. There’s also an XML property list, which I assume would provide the set description that an OS X Server would expect to find. (Since we have no such thing, this file is useless and can be ignored.)

TFTP prep. The contents of the “i386” and/or “ppc” folders are retrieved by the Mac via TFTP. I made a folder called “mac” within my TFTP root, then placed the “i386” and “ppc” folders within it. Make sure the permissions are correct, and you’re good to go.

NFS prep. Once the Mac has its bootloader, kernel, and drivers, it’s going to need to mount the disk image and finish booting from it. This image can be accessed via NFS – so we’ll make an NFS share and place the image within it. Edit your /etc/exports to add a line like this:

/export/nbi 10.22.0.0/24(async,ro,no_root_squash,insecure,no_subtree_check)

In my case, I wanted to limit access to just the local network. You can adjust yours to your liking.

Needless to say, this directory should contain the disk image (“NetInstall.dmg” in this example), and both it and its contents should be world readable.
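If you’d rather not eyeball the permissions, here is a minimal Python sketch that walks a directory and lists anything that isn’t world readable. The function name find_unreadable is made up for this example, and the check only looks at the classic mode bits (it knows nothing about ACLs or how your NFS server maps users).

```python
import os
import stat


def find_unreadable(root):
    """Return paths under root that lack world-read permission.

    Directories must also carry the world-execute (search) bit,
    otherwise clients cannot descend into them.
    """
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        mode = os.stat(dirpath).st_mode
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            problems.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems


# /export/nbi is the share used in this post; substitute your own path.
for path in find_unreadable("/export/nbi"):
    print("not world readable:", path)
```

The same function can be pointed at the “mac” folder inside your TFTP root, since those files need to be readable as well.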

DHCP changes. Below you’ll find the relevant portions of my dhcpd.conf – look it over, and modify yours as such.

ignore client-updates;
allow booting;
authoritative;

class "AppleNBI-i386" {
    match if substring (option vendor-class-identifier, 0, 14) = "AAPLBSDPC/i386";
    option dhcp-parameter-request-list 1,3,17,43,60;
    if (option dhcp-message-type = 1) { option vendor-class-identifier "AAPLBSDPC/i386"; }
    if (option dhcp-message-type = 1) { option vendor-encapsulated-options 08:04:81:00:00:67; }
    filename "mac/i386/booter";
    option root-path "nfs:10.22.0.2:/export/nbi:NetInstall.dmg";
}

class "AppleNBI-ppc" {
    match if substring (option vendor-class-identifier, 0, 13) = "AAPLBSDPC/ppc";
    option dhcp-parameter-request-list 1,3,6,12,15,17,43,53,54,60;
    option vendor-class-identifier "AAPLBSDPC";
    if (option dhcp-message-type = 1) { option vendor-encapsulated-options 08:04:81:00:00:67; }
    elsif (option dhcp-message-type = 8) { option vendor-encapsulated-options 01:01:02:08:04:81:00:00:09; }
    else { option vendor-encapsulated-options 00:01:02:03:04:05:06:07; }
    filename "mac/ppc/booter";
    option root-path "nfs:10.22.0.2:/export/nbi:NetInstall.dmg";
}
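
The vendor-encapsulated-options values look like line noise, but they appear to follow the usual DHCP layout of code, length, value. Here is a small Python sketch that splits them apart; decode_vendor_options is a name invented for this example. As far as I understand Apple’s BSDP, code 1 is the message type and code 8 is the selected boot image ID, but treat those labels as my assumption rather than gospel.

```python
def decode_vendor_options(hex_string):
    """Split a colon-separated hex string into (code, value) pairs.

    Assumes each option is laid out as: 1 byte code, 1 byte length,
    then 'length' bytes of value.
    """
    data = bytes.fromhex(hex_string.replace(":", ""))
    options = []
    i = 0
    while i + 2 <= len(data):
        code, length = data[i], data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) < length:
            raise ValueError("option %d is truncated" % code)
        options.append((code, value))
        i += 2 + length
    if i != len(data):
        raise ValueError("trailing byte after last option")
    return options


# Two of the strings from the dhcpd.conf classes above.
for raw in ("08:04:81:00:00:67", "01:01:02:08:04:81:00:00:09"):
    for code, value in decode_vendor_options(raw):
        print(raw, "-> code", code, "value", value.hex())
```

Read this way, the first string is a single option 8 carrying the four bytes 81 00 00 67, and the second is an option 1 with value 02 followed by an option 8. The string in the else branch does not split cleanly under this scheme, which suggests it is just filler.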

Be sure to change the “filename” and “root-path” lines to reflect your configuration.
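
Judging from the config above, the root-path value is just four colon-separated pieces: the literal “nfs”, the server address, the exported directory, and the image’s path within that export. A rough Python sketch for building and sanity-checking the string follows; both function names are hypothetical, and the format is inferred from my own config rather than from any Apple documentation.

```python
def build_root_path(server, export, image):
    """Assemble an NFS root-path string in the form used in this post."""
    return "nfs:%s:%s:%s" % (server, export, image)


def parse_root_path(root_path):
    """Split a root-path string back into (server, export, image)."""
    parts = root_path.split(":", 3)
    if len(parts) != 4 or parts[0] != "nfs":
        raise ValueError("expected nfs:<server>:<export>:<image>")
    return parts[1], parts[2], parts[3]


# Values taken from the dhcpd.conf in this post; change them to match yours.
print(build_root_path("10.22.0.2", "/export/nbi", "NetInstall.dmg"))
```

The export piece should match the directory in /etc/exports, and the image piece is relative to that directory.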

Ready to run. Once you’ve got everything set, plug a Mac into the network (NetBoot doesn’t work over AirPort for obvious reasons) and turn it on while holding the “N” key. If all goes well, it’ll boot. If not, double-check your configuration. You may also want to hold down Command-V while booting – if the issues are occurring after the kernel is loaded, you’ll be able to see the machine’s console output and possibly diagnose the problem.

 

An icon you’ve never seen before

This would be the floppy disk icon on the Mac OS X desktop, as displayed on a brand-spankin’-new MacBook running 10.6.4.

We’ve seen this icon a few times back in the early days of OS X – you know, when beige Macs were still a common sight. (Yes, kids, Macs weren’t always made out of aluminum.)

Back then, customers would routinely try and talk us into “tricking” the latest build of, say, 10.1.x, into running on their Power Mac 7300 with 128MB of RAM and a Sonnet Presto G3 card – and sometimes we would. Such machines actually had floppy drives built into them, so on the rare occasion when someone would insert a floppy disk, said icon would appear on their desktop. Even then we were a bit surprised not to see the generic “white drive” icon instead.

But today, as we connected a USB SuperDisk drive to a thoroughly modern Intel Mac, we were utterly shocked to see it. No AppleTalk, no Rosetta… but yet somehow the floppy icon lives on.

Now if only the Happy Mac would once again show its smiling face, or we could be lulled to the beat of a thousand flying toaster wings on these new Macs, all would be right in the world again.

Or maybe not. (But we still want to see After Dark get ported… hint, hint!)

End of an era: raqpaq and scooby halted

The above picture was what you’d see if you entered raqpaq.kanabecsystems.com into your web browser’s address bar. That is, until 11pm last night.

It was at that time that our two remaining servers at NCIS – raqpaq, our former shared hosting server; and scooby.mnkids.net, the last remaining piece of the once-glorious Kidsnet system – were officially turned off for good.

This marks the end of an era… one which started nearly a decade ago with a humble Linux box serving pages for the Lone Pine community center, and which grew to consume a whole corner of the NCIS “datacenter” at its peak some five years ago.

Times change, though. NCIS isn’t what it used to be, and neither are we. Our needs have changed, and our customers’ expectations aren’t the same as they were a decade ago. Now sites are bigger, more dynamic, and relied on more than ever. That’s why we put dala online… and that’s why we’ll probably keep expanding in the years to come.


scooby and the console, in its former home

But for today, give a nod to the tired iron that gave dozens of area establishments their Web presence for so many years. It’ll be torn apart and hauled back to our office next week for a much-deserved retirement on a shelf downstairs. Or perhaps to be reused again someday…

Kidsnet and the Onvoy situation

Kidsnet has not been immune to our recent connection woes (see previous posts). Our magilla.mnkids.net is still hosted in NCIS’s datacenter, and as such was cut off from the Internet on Friday along with our other equipment.

As you may know, Magilla is the Web, mail, primary DNS, and LDAP server for Kidsnet. We have been in the process of moving the LDAP service to Scooby (a Family Pathways-owned machine fed by Sherbtel lines), but the outage cut our efforts short.

When the outage began, our secondary nameserver (puck.nether.net) began taking on the network’s DNS load. But with the LDAP database still on the now-disconnected Magilla, there was no chance for users to carry on as normal (unless they already had a session open – and once they logged out, they were out for good).

We ended up doing two things to try and resolve the situation: moving the LDAP database to Scooby by physically going to Magilla’s console and copying it to disk, and giving Magilla an address on the temporary connection (see previous posts). But it was all in vain.

Turns out that our terminal servers and secondary nameserver are configured in such a way as to create gridlock in this situation.

Whenever a user tries to log in, the LDAP client attempts to connect to “ldap.mnkids.net” (an alias for Magilla) and authenticate. Each attempt sends out a DNS query, which is answered by our secondary nameserver and returns the Onvoy-routed IP address of Magilla, which is no longer reachable.

“No big deal”, you say, “just change the A-record on your secondary nameserver and you’re good to go.” If only it were that easy!

You see, our secondary nameservice is not provided by a machine we control. It’s a free service. And it only accepts updates from… wait for it… the primary nameserver. Which is Magilla. Which is disconnected.

“Okay, then,” you ask, “why not just SSH into the terminal servers and make an entry in /etc/hosts for ldap.mnkids.net, or put Magilla’s new IP into resolv.conf, or ldap.conf?” Because the terminal servers don’t allow root to log in via SSH (security, of course!), and there are no other users in /etc/passwd that can log in at all.

So, we’ve got ourselves a pickle. There were only two solutions to the problem:

1) Visit each site, log in as root, change the settings. For free. While dozens of other (paying) customers are having issues. Not gonna happen.

2) Change the A-records on Magilla, call up the registrar of mnkids.net and ask them to change the IP they have listed for Magilla, wait 24 hours for the changes to propagate, then wait another 12 or so hours for our secondary nameserver to notice and refresh (we can’t force it into a zone transfer – the admin doesn’t seem to allow it).

Needless to say, we had to take Choice #2. Nobody here’s happy about it, and I’m sure the children aren’t exactly thrilled either, but it’s all we can reasonably do right now – especially considering that Kidsnet is unofficially Not My Problem as of 12/22/2009 (it’s only official once Magilla is retired… and I was soooo close!).

The registrar was contacted this morning. Now we wait.

UPDATE 1: 2/3/10 9:00a – the changes have propagated to most of the Internet’s nameservers. About the only one that doesn’t seem to have noticed is Sherbtel’s (208.38.65.35). Unfortunately, most Kidsnet equipment is configured to use Magilla (at its old IP) and Sherbtel for their nameservers… so things won’t be back to normal until the changes are reflected there. Needless to say, we won’t be using Sherbtel’s nameserver for lookups anymore after this – its shortcomings have been perhaps the biggest hurdle in this entire situation (even bigger than getting the temporary connection for Magilla!). Google’s 8.8.8.8 will likely be substituted – once I can log in to the many hosts involved, that is. Kidtime is 6 hours away… and the clock is ticking.

UPDATE 2: 2/3/10 6:00p – our new DNS settings have (partially) propagated to Sherbtel’s nameserver… several records are still cached incorrectly, but fortunately ldap.mnkids.net is not among them. Logged into each and every Kidsnet host and changed the primary nameserver to 8.8.8.8. Things now appear to be working correctly. The other Kidsnet-related domains (warehouse214.org, stacyteencenter.com, etc) are not yet working, but that’ll be a project for tomorrow.