Notes from the AFS Workshop, held at LISA 2002 in Philadelphia, PA

Garry Zacheiss, MIT


The workshop began with status reports from representatives of both Arla and OpenAFS.

Love Hornquist-Astrand of the Arla development team presented the Arla status report. The current released version of Arla is 0.35.10, which supports Linux and all *BSD Unix variants, including MacOS. According to Love, the Arla 0.35 branch is starting to show its age, and a 0.36 release is expected to branch before the end of the year. This release will include Themis, their replacement for the traditional AFS "package" utility, which includes features and extensions not found in the original. Themis should be a drop-in replacement for "package".

Improvements in Arla 0.36 will include support for incremental open and support for UUID based callbacks (via the WhoAreYou RPC). Additionally, the afs3-callback port used by Arla will change from 7111/udp to 7001/udp, and XFS will be renamed to NNPFS. Windows support will also be present in this release, along with a GUI ACL manager for MacOS X that integrates with the Finder. The MacOS X ACL manager will also work with the OpenAFS MacOS X client.

Future goals include implementation of a cleaner and faster kernel/userland interface, and the addition of IPv6 support for AFS.

Love reports that work on integrating Kerberos 5 and GSSAPI into Rx continues. rxkad 2b, which will add Kerberos 5 support to Rx while still using fcrypt for encryption, will appear in a future OpenAFS release, most likely OpenAFS 1.2.8.

Derrick Brashear presented the OpenAFS status report. OpenAFS recently celebrated its two year anniversary. Recent progress in OpenAFS includes the addition of fakestat; with this feature enabled, the AFS client will provide stat information for volume mountpoints not yet traversed without contacting remote fileservers. This allows the use of graphical file managers to browse /afs without causing excessive hangs and timeouts. This feature is present in OpenAFS 1.2.7; OpenAFS 1.2.8 will include a further refinement to only present this behavior for mountpoints to volumes in foreign cells. Other recent features include ports to MacOS X 10.2 and an experimental port to FreeBSD, further Linux client tuning, and modifications to the fileserver to use Rx pings to determine if clients are reachable before allocating threads to them; this prevents asymmetric clients from consuming all available fileserver threads.

Issues that OpenAFS is currently facing include recent RedHat Linux kernels, which break the OpenAFS client by no longer exporting the symbol sys_call_table; this symbol is also not exported in any Linux 2.5.x kernels, and will not be exported in the 2.6 Linux series. A workaround is currently in place in the Linux RPMs distributed from openafs.org, and will also appear in source form in the OpenAFS 1.2.8 release. In the long term, communication with the Linux kernel developers indicates that generic PAG support and an AFS system call hook will appear in the Linux kernel, and we will no longer need to rely on the presence of sys_call_table. There are also plans to modify the Linux OpenAFS client in the future to make use of procfs for client configuration and tuning.

The AFS client written by RedHat, which appeared in the RedHat 8.0 beta kernels, was also discussed. This client is currently very minimal, and does not implement caching, writing, PAGs, pioctl, or authentication; it is currently only useful for unauthenticated read-only access to AFS. There are plans to implement the missing features, but for now, using this client doesn't seem interesting or feasible for any sites that make heavy use of AFS.

Coming soon in OpenAFS is an HPUX 11 port, which was made possible by HP releasing the kernel header file needed for the AFS client to build; this will appear in OpenAFS 1.2.8. Also appearing in OpenAFS 1.2.8 is rxkad 2b, which was discussed briefly in the Arla status report. This modification to rxkad removes AFS's dependency on Kerberos 4. It involves running a modified krb524d (present in the MIT Kerberos 1.2.6 release) which will respond to AFS service requests with the encrypted portion of a Kerberos 5 ticket, not a Kerberos 4 ticket. This code is currently being tested and is known to work with UNIX clients and servers; testing with the Windows client is not yet complete.

The OpenAFS status report concluded with a summary of OpenAFS issues and ways to get in touch with the OpenAFS project. Bug reports can be submitted to openafs-bugs@openafs.org; openafs-elders@openafs.org reaches the OpenAFS Council of Elders, which can help find resources for implementing desired new features. OpenAFS continues to have no full-time developers, and is still operating entirely as a volunteer effort. Windows client development has been lagging recently due to lack of resources, but there are several hopeful possibilities for this changing in the foreseeable future.

Interest was expressed in OpenAFS ports to HPUX for the Itanium, and to AIX 5.1 and later. There is no timeframe for the HPUX/Itanium port due to lack of resources and access to hardware/software. A partial AIX 5.1 client was recently contributed, but requires further work to be stable. There was also interest in the status of the ptserver supporting groups being members of groups; OpenAFS has code to do this from UMich, but it has not yet been integrated. Disconnected AFS will also be contributed by UMich, once they have finished integrating their OpenBSD client.

Rudy Maceyko of CERT briefly discussed CERT's ongoing transition from Transarc AFS to OpenAFS and MIT Kerberos 5. Their cell is approximately 120 GB of data; in the process of migrating to OpenAFS/MIT Kerberos they are also changing server platforms from Solaris to Linux, and have seen significant performance improvements. The transition thus far has been very smooth, and they expect it to be complete by the middle of November.

Brian Sebby of ANL discussed how ANL is using AFS with their firewall. UDP ports 7000-7009 must be open in the firewall to use AFS; ports 7020, 7021, and 7025-7032 must also be open if the AFS backup software (backup/butc/buserver) needs to work through the firewall. Ports 88 and 750 are necessary for authentication from Windows AFS clients. UDP port 111 and all ports above 1024 must be open for the AFS/NFS translator to be used; TCP port 2049 is all that appears to be necessary for using the AFS/NFS translator via Sun's WebNFS. Brian stressed the importance of using firewall logs to determine which ports are required.
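
The port list above can be captured as data, which makes it easier to compare against firewall logs. The following Python is a minimal sketch, not a tool presented at the workshop; the names (required_udp_ports, to_ranges) are hypothetical, and it assumes that all of the listed ports are UDP, which the notes do not state for the backup ports or for 88 and 750.

```python
# Port numbers are taken from the notes above. Treating every one of them
# as UDP is an assumption; the notes only say "UDP" for 7000-7009.
AFS_CORE_UDP = set(range(7000, 7010))                    # 7000-7009
AFS_BACKUP_UDP = {7020, 7021} | set(range(7025, 7033))   # 7020, 7021, 7025-7032
WINDOWS_AUTH = {88, 750}                                 # Windows client authentication


def required_udp_ports(need_backup=False, need_windows_auth=False):
    """Return the set of ports to open for the selected AFS features."""
    ports = set(AFS_CORE_UDP)
    if need_backup:
        ports |= AFS_BACKUP_UDP
    if need_windows_auth:
        ports |= WINDOWS_AUTH
    return ports


def to_ranges(ports):
    """Collapse a set of ports into sorted (first, last) ranges."""
    ranges = []
    for port in sorted(ports):
        if ranges and port == ranges[-1][1] + 1:
            # Extend the current range by one port.
            ranges[-1] = (ranges[-1][0], port)
        else:
            ranges.append((port, port))
    return ranges


if __name__ == "__main__":
    for first, last in to_ranges(required_udp_ports(need_backup=True)):
        print(first if first == last else f"{first}-{last}")
```

As Brian noted, a list like this is only a starting point; the firewall logs remain the authority on what a given site actually needs.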

Ken Hornstein of NRL gave a talk on using Kerberos 5 with AFS. He summarized the current state of things: AFS supports krb4 only; the Kerberos 5 aklog that exists talks to krb524d to convert a krb5 ticket to a krb4 one, and places the krb4 ticket in the kernel as a token. NRL has been using the Kerberos 5 aklog for 3 years without problems; the problems that Ken has seen with other sites are mostly a result of the AFS service key in the fileserver's keytab not matching the AFS service key in the KDC database.

Additionally, sites that need to support klog can use various approaches to handle kaserver requests out of their KDC database. Heimdal has the nicest implementation for this approach; the Heimdal KDC can handle kaserver requests natively. For sites using MIT Kerberos 5, the program "fakeka" can be used to handle kaserver requests out of the KDC; additionally, "kaforwarder" must be run on the AFS database servers if the Kerberos 5 KDC runs on a machine other than the AFS database servers. An additional complication to this approach is the Windows client, which uses native krb4 for authentication instead of an Rx based protocol. The kaserver implements a krb4 KDC, so this normally just works, but won't if you use fakeka. NRL solved this problem by migrating all Windows clients to use aklog.

Ken also discussed his AFS Kerberos 5 migration kit. Development on it was stalled for some time because NRL was running MIT krb5 1.0.6. NRL is in the process of updating to MIT krb5 1.2.6, and Ken is updating his migration kit to build against newer releases of MIT Kerberos. At this point, some parts of the migration kit might go into OpenAFS and/or MIT Kerberos. Ken may end up doing another release of the migration kit before this happens.

Love Hornquist-Astrand discussed some performance benchmarks he had compiled for AFS fileservers on various platforms, including Solaris, FreeBSD x86, Linux x86, and Tru64 UNIX. According to his numbers, the current implementation of fcrypt in Transarc and OpenAFS causes a 2/3 performance degradation on Solaris. A faster fcrypt implementation will appear in a future OpenAFS release. The benchmarks also show that unencrypted traffic to a Linux fileserver is very fast, but that adding encryption causes a larger performance penalty than on Solaris. One conclusion that was drawn from the Solaris benchmarks is that Sun's software RAID product causes fileserver operations that update metadata to become very slow, and should be avoided.

AFS performance when the client and the server are the same machine was briefly discussed; there doesn't seem to be any significant performance gain from accessing the AFS server via the loopback interface in this case. This was experimented with at Sunsite Germany; the consensus was that Sunsite Germany had higher performance requirements than AFS could provide.

Heidi Hornstein from NRL discussed the problem of users making the top level of their home directory world readable, and inadvertently allowing things like PGP and SSH private keys to be world readable as a result. Common solutions to this problem include creating a symlink farm for dotfiles to a subdirectory of the user's home directory, and making the top level only be world listable, although this doesn't prevent users from changing the default permissions themselves.

Brian Sebby asked for reactions to IBM's End Of Life announcement of AFS. Most sites are indifferent to this announcement or happy about it, and almost everyone is in the process of migrating to OpenAFS or has already completed a migration, with some exceptions: modern versions of HPUX, AIX, and Tru64 UNIX are all still dependent on IBM AFS clients, and almost all sites are still using IBM clients for their Windows machines.

Mitch Collinsworth from Cornell provided a status update on work in progress to use Amanda for AFS backups. The code is complete and is being used to test dumps, but restores still need testing; the system is not yet in production. We also heard an update on AFSFree, a graphical tool for monitoring free space on all vice partitions in a cell. AFSFree is a Tcl/Tk script, and is available from: /afs/msc.cornell.edu/common/ftp/pub/AFS/afsfree/ It was observed that AFSFree uses "vos listaddrs" to determine what servers comprise a cell. Many older AFS sites still have IP entries in their VLDB for former fileservers, so this approach will not work for them.

There is interest in using AFS clients on MacOS X 10.2, and in setting up MacOS X machines to log in users with AFS tokens, a PAG, and an AFS home directory. Using the KerberosLoginAuthenticator will get users tickets at login time. Users don't currently get a PAG at login time, and implementing this is expected to be somewhat challenging. Sites with deployed clusters of MacOS X machines using AFS indicate that the machines are being used in single user settings, and the lack of a PAG isn't causing them significant problems. There are still bad interactions between the Finder and AFS; in particular, the Finder caches and enforces UFS mode permissions on all files, including files in AFS. Arla partially gets around this problem by calculating UFS mode bits based on the user's credentials and the directory ACL.
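
To illustrate the general idea of deriving UFS mode bits from an ACL, here is a minimal Python sketch. It is not Arla's actual algorithm, which was not described in detail at the workshop; the mapping rules, the function names (effective_rights, synthesize_mode), and the dictionary representation of an ACL are all assumptions made for illustration. It also ignores negative ACL entries.

```python
import stat


def effective_rights(acl, identities):
    """Union the AFS rights granted to any identity the user holds.

    acl maps an ACL entry name to a rights string such as "rlidwka";
    identities is the user's name plus the groups they belong to.
    Negative rights are ignored in this sketch.
    """
    rights = set()
    for name, granted in acl.items():
        if name in identities:
            rights |= set(granted)
    return rights


def synthesize_mode(rights, is_dir, stored_mode=0o644):
    """Map AFS rights to owner mode bits (an assumed, simplified mapping)."""
    mode = 0
    if is_dir:
        if "l" in rights:                     # lookup: list and traverse
            mode |= stat.S_IRUSR | stat.S_IXUSR
        if "i" in rights or "d" in rights:    # insert or delete entries
            mode |= stat.S_IWUSR
    else:
        if "r" in rights:
            mode |= stat.S_IRUSR
        if "w" in rights:
            mode |= stat.S_IWUSR
        # Keep the execute bit only if the file has it and is readable.
        if "r" in rights and stored_mode & stat.S_IXUSR:
            mode |= stat.S_IXUSR
    return mode


if __name__ == "__main__":
    acl = {"alice": "rlidwka", "system:anyuser": "l"}
    rights = effective_rights(acl, {"bob", "system:anyuser"})
    print(oct(synthesize_mode(rights, is_dir=True)))
```

The point of the sketch is that the mode bits shown to the Finder become a function of who is asking, rather than a fixed property of the file.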

Ways of preventing data corruption when an AFS partition fills were discussed. It is currently possible to cause some corrupt data on an AFS partition when metadata update attempts fail, leaving the data in an inconsistent state. One idea that was proposed for avoiding this was a fileserver modification to create a thread that would periodically check partition fullness and refuse writes beyond a certain fullness threshold.
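
The proposed watcher thread can be sketched in Python as follows. This only illustrates the idea; the real fileserver is written in C, and the class name PartitionGuard, the 95% threshold, the 30 second interval, and the example path are all hypothetical choices rather than anything proposed at the workshop.

```python
import shutil
import threading


class PartitionGuard:
    """Periodically check partition fullness and gate writes on it."""

    def __init__(self, path, threshold=0.95, interval=30.0,
                 usage_fn=shutil.disk_usage):
        self.path = path
        self.threshold = threshold      # fraction of capacity, assumed value
        self.interval = interval        # seconds between checks, assumed value
        self._usage_fn = usage_fn       # injectable so the check can be tested
        self._writable = threading.Event()
        self._writable.set()
        self._stop = threading.Event()
        self._thread = None

    def check_once(self):
        """Measure fullness and update the write gate; return the fraction used."""
        usage = self._usage_fn(self.path)
        fullness = usage.used / usage.total
        if fullness >= self.threshold:
            self._writable.clear()
        else:
            self._writable.set()
        return fullness

    def _run(self):
        while not self._stop.is_set():
            self.check_once()
            self._stop.wait(self.interval)   # sleeps, but wakes early on stop()

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def allow_write(self):
        """Write paths would call this before accepting new data."""
        return self._writable.is_set()


if __name__ == "__main__":
    guard = PartitionGuard("/", threshold=0.95)
    print(guard.check_once(), guard.allow_write())
```

Because the check is periodic, a burst of writes between two checks could still fill the partition, so the threshold has to leave enough headroom for whatever can arrive within one interval.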

The workshop closed with a roundtable discussion on what AFS needs to do to gain more market share. Support for files larger than 2GB, byte-range file locking, better support for Windows clients, and more training opportunities and documentation were all cited as being desirable for AFS to gain additional market share.