Notes from the AFS Workshop, held at Usenix 2002 in Monterey, CA

Garry Zacheiss, MIT


The workshop began with status reports from representatives of both Arla and OpenAFS.

Love Hornquist-Astrand of the Arla development team presented the Arla status report. There had been no Arla release since the previous AFS workshop at LISA 2001, but one was promised within a few days; Arla 0.35.8 has since been released. Improvements scheduled for release soon include better support for Tru64 UNIX, MacOS X, and FreeBSD, improved volume handling, and implementation of more of the vos/pts subcommands. It was stressed that MacOS X is considered an important platform, and that a GUI configuration manager for the cache manager, a GUI ACL manager for the Finder, and a graphical login that obtains Kerberos tickets and AFS tokens at login time are all under development.

Future goals for the underlying AFS protocols include GSSAPI/SPNEGO support for Rx, performance improvements to Rx, an enhanced disconnected mode, and IPv6 support for Rx; an experimental patch is already available for IPv6 support. Future Arla-specific goals include improved performance, partial file reading, and increased stability on several platforms. Work is also in progress on the RXGSS protocol extensions (integrating GSSAPI into Rx). A partial implementation exists, and work continues as developers find time.

Derrick Brashear of the OpenAFS development team presented the OpenAFS status report. OpenAFS 1.2.5 was released immediately prior to the conference; it fixed a remotely exploitable denial-of-service vulnerability on several OpenAFS platforms, most notably IRIX and AIX. Future work planned for OpenAFS includes better support for MacOS X, including working around Finder interaction issues. Better support for the BSDs is also planned: FreeBSD has a partially implemented client, while NetBSD and OpenBSD have only server programs available right now, with client support not yet implemented. Support for AIX 5, Tru64 5.1A, and MacOS X 10.2 (aka Jaguar) is also planned. Other planned enhancements include nested groups in the ptserver (code donated by the University of Michigan awaits integration), disconnected AFS, and further work on a native Windows client. Derrick stated that the guiding principles of OpenAFS are to maintain compatibility with IBM AFS, support new platforms, ease administrative burden, and add new functionality.

AFS performance was discussed. The openafs.org cell consists of two fileservers, one in Stockholm, Sweden, and one in Pittsburgh, PA. AFS works reasonably well over trans-Atlantic links, although OpenAFS clients don't determine which fileserver to talk to very effectively. Arla clients use RTTs to the server to determine the optimal fileserver to fetch replicated data from. Modifications to OpenAFS to support this behavior in the future are desired.
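The RTT-based selection described above can be illustrated with a minimal sketch. This is not Arla's actual code: the function names, the smoothing factor, and the host names are all assumptions made for illustration. It only shows the general idea of keeping a smoothed round-trip time per replica and preferring the server with the lowest value.

```python
from typing import Dict


def update_rtt(smoothed_ms: float, sample_ms: float, alpha: float = 0.125) -> float:
    """Fold a new RTT sample into a running average (exponential smoothing).

    alpha is a hypothetical smoothing factor, not a value taken from Arla.
    """
    return (1 - alpha) * smoothed_ms + alpha * sample_ms


def pick_fileserver(rtts_ms: Dict[str, float]) -> str:
    """Return the replica with the lowest smoothed RTT.

    Ties are broken by host name so the choice is deterministic.
    """
    if not rtts_ms:
        raise ValueError("no fileservers to choose from")
    return min(sorted(rtts_ms), key=lambda host: rtts_ms[host])


if __name__ == "__main__":
    # Hypothetical host names and RTT values, for illustration only.
    rtts = {
        "fs-stockholm.example.org": 120.0,
        "fs-pittsburgh.example.org": 15.0,
    }
    print(pick_fileserver(rtts))
```

A client near Pittsburgh would pick the Pittsburgh replica in this example; a real cache manager would also need to handle servers that stop responding, which this sketch ignores.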

Jimmy Engelbrecht and Harald Barth of KTH discussed their AFSCrawler script, which they wrote to determine how many AFS cells and clients exist in the world, what implementations and versions they run (Arla vs IBM AFS vs OpenAFS), and how much data is stored in AFS. The script unfortunately triggered a bug in code derived from IBM AFS 3.6, causing some clients to panic while handling a specific RPC. This has since been fixed in OpenAFS 1.2.5 and the most recent IBM AFS patch level, and all AFS users are strongly encouraged to upgrade. No release of Arla is vulnerable to this particular denial-of-service attack. There was an extended discussion of the usefulness of this exploration. Many sites believed the information was useful and that such scanning should continue in the future, but only on an opt-in basis.

Many sites face the problem of managing Kerberos/AFS credentials for batch scheduled jobs. Specifically, most batch processing software needs to be modified to forward tickets as part of the batch submission process, renew tickets and tokens while the job is in the queue and for the lifetime of the job, and properly destroy credentials when the job completes. Ken Hornstein of NRL was able to pay a commercial vendor to support Kerberos 4/5 credential management in their product, although they did not implement AFS token management. MIT has implemented some of the desired functionality in OpenPBS, and might be able to make it available to other interested sites.
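The credential lifecycle described above (obtain, renew periodically, destroy) might be planned roughly as follows. This is a sketch under stated assumptions, not what any of the sites or vendors mentioned actually implemented: it assumes the forwarded Kerberos 5 ticket is renewable for at least the whole period, and the function names and the safety margin are hypothetical. It only builds a schedule of the standard commands (`kinit -R`, `aklog`, `unlog`, `kdestroy`) and does not run them.

```python
from typing import List, Tuple

Step = Tuple[int, List[str]]  # (seconds after submission, command to run)


def renewal_times(total_seconds: int, ticket_lifetime: int, margin: int = 300) -> List[int]:
    """Offsets at which credentials should be renewed.

    Each renewal happens `margin` seconds before the current ticket expires.
    """
    if ticket_lifetime <= margin:
        raise ValueError("ticket lifetime must exceed the safety margin")
    interval = ticket_lifetime - margin
    return list(range(interval, total_seconds, interval))


def credential_plan(total_seconds: int, ticket_lifetime: int, margin: int = 300) -> List[Step]:
    """Schedule covering queue time plus run time of a single job."""
    steps: List[Step] = [(0, ["aklog"])]  # AFS token from the forwarded ticket
    for t in renewal_times(total_seconds, ticket_lifetime, margin):
        steps.append((t, ["kinit", "-R"]))  # renew the Kerberos ticket
        steps.append((t, ["aklog"]))        # refresh the AFS token
    steps.append((total_seconds, ["unlog"]))     # discard the AFS token
    steps.append((total_seconds, ["kdestroy"]))  # destroy the ticket cache
    return steps


if __name__ == "__main__":
    for offset, command in credential_plan(10000, 3600, margin=600):
        print(offset, " ".join(command))
```

In practice the total time is not known in advance, so a batch system would more likely renew in a loop until the job exits; the fixed schedule here just makes the three phases explicit.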

Tools to simplify AFS administration were discussed, including:

- AFS Balancer.  A tool written by CMU to automate the process of
  balancing disk usage across all servers in a cell.  Available from:
 
  ftp://ftp.andrew.cmu.edu/pub/AFS-Tools/balance-1.1b.tar.gz

- Themis.  Themis is KTH's enhanced version of the AFS tool "package",
  for updating files on local disk from a central AFS image.  KTH's
  enhancements over traditional "package" include allowing the deletion
  of files, simplifying the process of adding a file, and allowing the
  merging of multiple rule sets for determining what files are updated.
  Themis is available from the Arla CVS repository.
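To make the idea of balancing disk usage across servers concrete, here is a minimal greedy sketch. It is not the algorithm used by CMU's AFS Balancer, which these notes do not describe; the function name, data layout, and sample volume names are hypothetical. It repeatedly moves one volume from the fullest server to the emptiest one, choosing the volume that narrows the gap the most, and returns the list of moves (each of which would correspond to something like a `vos move` in a real cell).

```python
from typing import Dict, List, Tuple

Move = Tuple[str, str, str]  # (volume, source server, destination server)


def plan_moves(servers: Dict[str, Dict[str, int]], tolerance: int = 0) -> List[Move]:
    """Plan volume moves that even out usage.

    `servers` maps server name -> {volume name: size}; sizes are integers
    in any consistent unit. The input is not modified.
    """
    if len(servers) < 2:
        return []
    placement = {s: dict(vols) for s, vols in servers.items()}
    usage = {s: sum(vols.values()) for s, vols in placement.items()}
    moves: List[Move] = []
    while True:
        src = max(sorted(usage), key=lambda s: usage[s])  # fullest server
        dst = min(sorted(usage), key=lambda s: usage[s])  # emptiest server
        gap = usage[src] - usage[dst]
        # Moving a volume of size n leaves a gap of |gap - 2n|,
        # so only volumes smaller than the gap can improve the balance.
        candidates = [
            (abs(gap - 2 * size), name)
            for name, size in placement[src].items()
            if 0 < size < gap
        ]
        if gap <= tolerance or not candidates:
            return moves
        _, name = min(candidates)
        size = placement[src].pop(name)
        placement[dst][name] = size
        usage[src] -= size
        usage[dst] += size
        moves.append((name, src, dst))


if __name__ == "__main__":
    # Hypothetical servers and volumes, for illustration only.
    cell = {"fs1": {"user.a": 60, "user.b": 30, "user.c": 10}, "fs2": {}}
    print(plan_moves(cell))
```

A production tool would also have to account for partition capacity, read-only replicas, and the cost of each move, none of which this sketch models.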

Stanford was presented as an example of a large AFS site. Stanford stores approximately 1.4 TB of data in AFS, spread across approximately 100,000 volumes. 3.3 TB of storage is available in their primary cell, ir.stanford.edu. Their fileservers consist entirely of Solaris machines running a combination of Transarc 3.6 patchlevel 3 and OpenAFS 1.2.x, while their database servers run OpenAFS 1.2.2. Their cell consists of 25 fileservers, using a combination of EMC and Sun StorEdge hardware. Stanford continues to use the kaserver for their authentication infrastructure, with future plans to migrate entirely to an MIT Kerberos 5 KDC.

Stanford has approximately 3400 clients on their campus, not including SLAC (the Stanford Linear Accelerator Center); approximately 2100 AFS clients from outside Stanford contact their cell every month. Their supported clients are almost entirely IBM AFS 3.6 clients, although they plan to release OpenAFS clients soon. Stanford currently supports only UNIX clients. There is some on-campus presence of Windows clients, but Stanford has never publicly released or supported a Windows client. They do intend to release and support the MacOS X client in the near future.

All Stanford students, faculty, and staff are assigned AFS home directories with a default quota of 50 MB, for a total of approximately 550 GB of user home directories. Other significant uses of AFS storage are data volumes for workstation software (400 GB) and volumes for course web sites and assignments (100 GB).

AFS usage at Intel was also presented. Intel has been an AFS site since 1994. They had bad experiences with the IBM 3.5 Linux client; their experience with OpenAFS on Linux 2.4.x kernels has been much better. They use and are satisfied with the OpenAFS IA64 Linux port. Intel has hundreds of OpenAFS 1.2.3 and 1.2.4 clients in many production cells, accessing data stored on IBM AFS fileservers. So far, they have had no interoperability issues. Intel has some concerns about OpenAFS: they would like to purchase commercial support for OpenAFS, and would like to see OpenAFS support for HP-UX on both PA-RISC and Itanium hardware. HP-UX support is currently unavailable because a specific HP-UX header file is not available from HP; it may become available soon. Intel has not yet committed to migrating their fileservers to OpenAFS, and is unsure whether it will do so without commercial support.

Backups are a traditional topic of discussion at AFS workshops, and this time was no exception. Many users complain that the traditional AFS backup tools ("backup" and "butc") are complex and difficult to automate, and require many home-grown scripts and user intervention for error recovery. An additional complaint was that the traditional AFS tools do not support file- or directory-level backups and restores; data must be backed up and restored at the volume level.

Mitch Collinsworth of Cornell presented work done at Cornell to make Amanda, the free backup software from the University of Maryland, suitable for AFS backups. Using Amanda for AFS backups allows one to share backup tapes between AFS and non-AFS backups, run multiple backups in parallel easily, and automate error recovery; it also provides a robust degraded mode that prevents tape errors from stopping backups altogether. Their implementation allows for full volume restores as well as individual directory and file restores. They have finished coding this work, and are in the process of testing and documenting it.

Peter Honeyman of CITI at the University of Michigan spoke about work he has proposed to replace Rx with RPCSEC_GSS in OpenAFS; this would allow AFS to use a TCP-based transport mechanism, rather than the UDP-based Rx, and possibly gain better congestion control, dynamic adaptation, and fragmentation avoidance as a result. RPCSEC_GSS uses the GSSAPI to authenticate Sun ONC RPC. RPCSEC_GSS is transport agnostic, provides strong security, is a developing Internet standard, and has multiple open source implementations. Backward compatibility with existing AFS servers and clients is an important goal for this project.
