Fighting NFS Mounts at Boot Time

Pulled from the archives, this is a repost of a previous blog post, this time from a guest spot at The Nubby Admin, a fantastic blog from a fellow tech nerd.

Let's face it. NFS is a magical thing. It lets you centralize your storage and share volumes across systems, all while maintaining sane permissions and ownership. Unfortunately, it can also be a bit of a fickle beast. Let's say you just had your volume configured and you set up the mounts. You go and run this command:

mount -t nfs 10.10.10.1:/vol1/fs1 /data

Works like a champ. You now have your data partition mounted over NFS, so you add this line to your /etc/fstab to make it mount automagically:

10.10.10.1:/vol1/fs1  /data             nfs     defaults    0 0

A few weeks go by and a kernel update comes out. No big deal: you install it and reboot during your next maintenance window to pick up the new kernel. Then you start to see applications failing and notice the volume isn't actually mounted. This is an unfortunate result of how filesystems get mounted at boot, a step I'll loosely call the automounter here (not to be confused with the autofs service).

It's like this. At boot time the root partition gets mounted, automounter reads the /etc/fstab file, and mounts any filesystem that doesn't have noauto as a mount option. Since we're still very early in the boot process the network isn't up yet, so naturally any network filesystems fail. The real problem here is that at no point does automounter go back and retry those filesystems. So your NFS mounts fail because there is no network, and done is done.
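To see which entries that boot-time pass would try, you can filter /etc/fstab for anything not marked noauto. This is only a rough sketch of the selection described above, not what the init scripts actually run. The function name boot_mount_candidates is made up for illustration, and it takes the fstab path as an argument so you can point it at a copy.

```shell
#!/bin/sh
# Hypothetical helper: print "mountpoint fstype" for every fstab entry
# that lacks the noauto option, i.e. the ones tried automatically at boot.
boot_mount_candidates() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }   # skip comments and blank/short lines
        $3 == "swap"         { next }   # swap is enabled by swapon, not mount
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noauto") next
            print $2, $3
        }
    ' "${1:-/etc/fstab}"
}
```

Running boot_mount_candidates with no argument reads /etc/fstab. If /data shows up with type nfs, it is in line to be mounted before you can count on the network.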

The developers were nice enough to provide a fix for this: a mount option called _netdev. Quoting directly from the mount man page (sourced from RHEL 6.4):

_netdev
        The filesystem resides on a device that requires network access
        (used to prevent the system from attempting to mount these
        filesystems until the network has been enabled on the system).

This is awesome, and exactly what we want. So you modify your entry in fstab to look like this:

10.10.10.1:/vol1/fs1  /data             nfs     defaults,_netdev    0 0
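If you have more than one network mount, it is easy to miss an entry. Here is a small sketch that lists NFS and CIFS mount points in fstab whose options do not include _netdev. The function name missing_netdev and the choice of filesystem types to check are my own assumptions, so adjust them to your environment.

```shell
#!/bin/sh
# Hypothetical helper: print mount points of nfs/nfs4/cifs entries in fstab
# whose option list does not include _netdev.
missing_netdev() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }   # skip comments and blank/short lines
        $3 ~ /^(nfs|nfs4|cifs)$/ {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "_netdev") next
            print $2
        }
    ' "${1:-/etc/fstab}"
}
```

An empty result from missing_netdev means every network entry it knows about carries the option.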

You've been bitten by NFS mounting in the past, so you throw this in your test environment and reboot immediately. After the system comes up you notice a problem: your NFS volumes are still unmounted.

You see, there's a bit of a hitch. Automounter followed the same procedure it did before, except this time it didn't even attempt to mount /data. The _netdev option doesn't tell the system to mount the filesystem when the network comes up; it says don't attempt to mount it at all if the network isn't up.

There is still a missing piece to the puzzle. If you look at your init scripts, there is a service called netfs. If you read the script, you can see this description in the chkconfig header:

# description: Mounts and unmounts all Network File System (NFS), \
#              CIFS (Lan Manager/Windows), and NCP (NetWare) mount points.

This is exactly what you need. It is a service whose sole purpose is to read your /etc/fstab and mount network filesystems. All you have to do is enable it

chkconfig netfs on

and watch the magic happen. Now your boot-time mount process should look something like this:

  1. Automounter reads /etc/fstab
  2. Ignores /data since it has the _netdev option set
  3. Mounts all other filesystems
  4. Finishes mount jobs and allows the system to continue booting
  5. Network comes up
  6. The netfs service starts
  7. netfs reads /etc/fstab and finds an NFS filesystem
  8. netfs mounts /data
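To confirm on a test box that both halves of that sequence happened, you can check that netfs is switched on for your runlevel and that the NFS mount points really did get mounted after the reboot. The sketch below is one way to do it. Both function names are hypothetical, the first assumes the usual "N:on" / "N:off" columns printed by chkconfig --list, and the second does not account for entries you deliberately marked noauto.

```shell
#!/bin/sh
# Hypothetical helper: read one line of `chkconfig --list <service>` output
# on stdin and succeed only if the service is "on" in the given runlevel.
enabled_in_runlevel() {
    level=$1
    # Put each whitespace-separated column on its own line, then look for
    # an exact "N:on" entry.
    tr -s ' \t' '\n\n' | grep -qx "${level}:on"
}

# Hypothetical helper: print every NFS mount point listed in fstab ($1)
# that does not appear in the mount table ($2, /proc/mounts by default).
unmounted_nfs() {
    fstab=${1:-/etc/fstab}
    mounts=${2:-/proc/mounts}
    awk '
        # First file (the mount table): remember what is currently mounted.
        FILENAME == ARGV[1] { mounted[$2] = 1; next }
        /^[ \t]*#/ || NF < 4 { next }   # skip comments and blank/short lines
        $3 ~ /^nfs4?$/ && !($2 in mounted) { print $2 }
    ' "$mounts" "$fstab"
}

# Usage on a RHEL 6 style system (commented out so the sketch is safe to source):
# chkconfig --list netfs | enabled_in_runlevel 3 || echo "netfs is off in runlevel 3"
# unmounted_nfs    # prints nothing when every NFS entry is mounted
```

If unmounted_nfs prints /data after a reboot, the sequence above broke somewhere, and netfs being disabled is the first thing to rule out.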

What's funny is that while I was researching this problem I never stumbled across netfs as a service. I had even gone so far as to start planning out my own custom init script that would do exactly this, except specifically for my mount points instead of generalizing. It's nice to see that I was on the right track, but even better that the tools already existed.