cat /dev/brain

CacheFS + OpenLDAP

A Bit of Background

I haven't mentioned it before, but after I graduated from Stevens Institute of Technology, the Computer Science department hired me to work in the Scientific and Research Computing Information Technology (SRCIT) "department". We have approximately 120 desktop client machines, 9 servers, and 4,000 users. All of the machines are *nix machines (Ubuntu, Debian, Solaris).

Currently we run Ubuntu 10.04.3 LTS on the client machines and are planning an upgrade to Ubuntu 12.04 LTS. (We are also planning an upgrade of our Solaris server to Solaris 10 and our Debian servers from Debian 4 to Debian 6.) I'm sure other people would cringe at doing this, but I've enjoyed it immensely.

Currently, we use NFS to auto-mount users' home directories from our Solaris server. The automount information is stored in our OpenLDAP database and the LDAP entry typically looks like:

dn: cn=icordasc,ou=auto.home,dc=***,dc=stevens-tech,dc=edu
objectClass: automount
cn: icordasc
automountInformation: -rw,sync,intr,vers=3,rsize=32768,wsize=32768
    (server):/export/home/icordasc

The Problem

One of our clients (a professor) filed a support ticket saying that her machine was "freezing" on occasion. We checked her machine but found nothing that could be causing the slow-down, even though her computer was running at a high load. We narrowed it down to the network, specifically the auto-mounting. Whoever set up the original configuration seems to have found that those rsize and wsize values were optimal for NFS (and must not have been using NFS v3, which is when the ability to negotiate optimal settings was introduced). Ever since, the script that creates LDAP entries has used those settings, and no one checked (or questioned) whether they were still optimal.

Our research found that those settings were, in fact, no longer optimal (surprised? Probably not, since I'd already hinted at it). Using python-ldap, I wrote a quick script to remove these settings, which sped up the auto-mounting on all of our machines. Awesome, right? Well, it gets better.
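The string handling behind that kind of clean-up can be sketched as follows. This is not the script I actually ran, just a minimal illustration: the function name strip_size_options is made up for this example, it assumes the automountInformation value has the form "-options location", and the LDAP read/modify calls are left out entirely.

```python
def strip_size_options(info):
    """Remove rsize=/wsize= from the options of an automountInformation value."""
    # Assumed layout: "-opt1,opt2,... location". The option list comes first
    # and is separated from the location by whitespace.
    options, location = info.split(None, 1)
    kept = [opt for opt in options.lstrip("-").split(",")
            if not opt.startswith(("rsize=", "wsize="))]
    return "-" + ",".join(kept) + " " + location


old = ("-rw,sync,intr,vers=3,rsize=32768,wsize=32768 "
       "(server):/export/home/icordasc")
print(strip_size_options(old))
```

Each entry's new value would then be written back to LDAP in place of the old one.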

CacheFS

SRCIT shares Stevens' network with everybody else, so at times, even with the improvement made above, auto-mounting could slow down to a crawl during a normal semester. Wouldn't it be nice if the machine the user is sitting at cached their files? Yes, it would, which is why we turned to CacheFS.

So far, since we're planning an upgrade, we have only deployed CacheFS to the Ubuntu 12.04 machines we're testing on, but even on those we've seen a huge improvement. Because of our seemingly unique set-up, configuring NFS to use CacheFS wasn't obvious to me, and the answer was not easy to find through a Google search. (Keep in mind that I'm learning a lot of this as I go along; I had no experience with NFS, ZFS, LDAP, Kerberos, etc. before I took this job.) The changes are very simple, though. Starting from our old entry, drop the rsize and wsize options to get:

dn: cn=icordasc,ou=auto.home,dc=***,dc=stevens-tech,dc=edu
objectClass: automount
cn: icordasc
automountInformation: -rw,sync,intr,vers=3 (server):/export/home/icordasc

Most of the examples I found where people set up CacheFS to work with NFS/autofs had the mount information in /etc/fstab or /etc/vfstab (Solaris). As you may have guessed, our solution was to apply similar settings to the automount entry in LDAP. Our new entry looks like this:

dn: cn=icordasc,ou=auto.home,dc=***,dc=stevens-tech,dc=edu
objectClass: automount
cn: icordasc
automountInformation: -rw,fsc,sync,intr,vers=3
    (server):/export/home/icordasc
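If you have many entries to update, the same change can be scripted. Here is a rough sketch of just the string manipulation; add_fsc is a hypothetical helper name, the "-options location" layout is assumed, and pushing the result back into LDAP is not shown.

```python
def add_fsc(info):
    """Add the fsc mount option to an automountInformation value if missing."""
    # Assumed layout: "-opt1,opt2,... location".
    options, location = info.split(None, 1)
    opts = options.lstrip("-").split(",")
    if "fsc" not in opts:
        # Place fsc right after the first option, matching the entry above.
        opts.insert(1, "fsc")
    return "-" + ",".join(opts) + " " + location


print(add_fsc("-rw,sync,intr,vers=3 (server):/export/home/icordasc"))
```

The check for an existing fsc option means running it twice over the same entry leaves the entry unchanged.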

For an Ubuntu machine to use this, you just need to:

# apt-get install cachefilesd
# $EDITOR /etc/default/cachefilesd

Then uncomment the RUN=yes line (line 7 on our installations).
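With 120 clients you probably don't want to do that edit by hand. A minimal sketch of the text change, assuming the line ships as "#RUN=yes"; the enable_run name and the sample file contents are invented for illustration, and reading and writing /etc/default/cachefilesd (as root) is left to you.

```python
import re


def enable_run(text):
    """Uncomment a commented-out RUN=yes line in the given file contents."""
    # Only a line consisting of "#RUN=yes" (optionally padded with spaces or
    # tabs) is touched; every other line is returned unchanged.
    return re.sub(r"^[ \t]*#[ \t]*(RUN=yes)[ \t]*$", r"\1", text,
                  flags=re.MULTILINE)


# Made-up sample contents, not a copy of the real file.
sample = "# Defaults for cachefilesd\n#RUN=yes\n"
print(enable_run(sample))
```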

After that, make sure you start the daemon, and you can then check /var/cache/fscache/cache/ to make sure it is properly caching your auto-mounted directories. Feel free to enjoy your vastly improved customer service.
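If you would rather script that last check, something like the following could work. It only tests whether the cache directory has anything in it, which is a crude signal; cache_has_entries is a hypothetical helper, and the default path is simply the one mentioned above.

```python
import os


def cache_has_entries(path="/var/cache/fscache/cache"):
    """Return True if the directory exists and contains at least one entry."""
    try:
        return len(os.listdir(path)) > 0
    except OSError:
        # Missing directory or insufficient permissions: report "no cache".
        return False


print(cache_has_entries())
```

The directory may only be readable by root, in which case this prints False for an unprivileged user even when caching is working.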