Running GAMESS on a Rocks Cluster
February 27, 2014
One of the early barriers to any project involving computers seems to be the issue of getting everything working. Our BLESS cluster at Biola University was no exception. This cluster consists of 40 Dell desktop computers connected by ethernet switches. My colleagues Dr. John Bloom and Dr. Xidong Chen had assembled the hardware and configured the Rocks cluster, respectively. All was running, but how was I to install the GAMESS quantum chemistry software in the Rocks environment?
The Rocks package seeks to make the management of a Linux cluster simple by automating the setup of the multiple nodes, and by handling package installation. The "installers" for particular applications are called "Rolls" in the Rocks parlance. I obtained several different GAMESS "Rolls," and some of these even seemed to install correctly. But jobs would not run, despite my familiarity with building and running GAMESS on a single dedicated Linux box. Unfortunately, the very "prepackaged" nature of "Rolls" made them (for me) very opaque with respect to troubleshooting.
What follows is my own attempt to document, partly for my own satisfaction, how the GAMESS software actually runs on (our) Rocks cluster. All the usual caveats apply: the GAMESS package can be configured in many ways, and YMMV. But (as is usual in most of these situations) things were actually simpler than they appeared at first, and so this is intended to serve not as an "instruction manual" but rather a high-level overview to help you puzzle through your own GAMESS installation, assuming that you (as I was initially) are unclear as to exactly how GAMESS is supposed to run on a Rocks cluster.
I will assume that you are able to build, configure, and run GAMESS on a single workstation, utilizing multiple cores. How to do that is fairly well documented in the GAMESS package instructions, and elsewhere on the net. BTW, I run GAMESS in the "Sockets" configuration, not MPI. Our cluster's private LAN is not fast, and the "Sockets" configuration is robust in this case. If you are running with MPI these suggestions will probably not help.
All of the initial problems I had with running GAMESS on our Rocks cluster boiled down to three issues:
Job scheduler mismatch. The "stock" GAMESS installation produced by the "Roll" assumed that the cluster used the "Portable Batch System" (PBS) as its job scheduler. Our Rocks cluster was configured with the Sun Grid Engine (SGE) scheduler instead, and several of the environment variables in the "default" shell scripts that launch GAMESS jobs had to be modified to make things work.
I modified the GAMESS shell script so that compatibility with both the PBS and SGE schedulers is maintained. Which scheduler is present on a cluster can be determined by testing the existence of $PBS_JOBID (for the PBS) or just $JOB_ID for the SGE. See the script 'gamess' in the tarball attached below for the SGE-specific changes. Note that among the mods for SGE are some changes that correctly parse the list of nodes assigned by SGE so that GAMESS knows where to run itself.
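To make the scheduler test concrete, here is a minimal sketch of the idea. It is not the code from my 'gamess' script. The function names detect_scheduler and list_hosts are made up for illustration, and I am assuming the usual file formats: PBS lists one hostname per assigned slot in $PBS_NODEFILE, while SGE writes "host slots queue processor-range" lines to $PE_HOSTFILE.

```shell
#!/bin/bash
# Illustrative sketch only; function names are hypothetical.

# Print "pbs", "sge", or "none" depending on which job ID variable is set.
detect_scheduler() {
    if [ -n "${PBS_JOBID:-}" ]; then
        echo pbs
    elif [ -n "${JOB_ID:-}" ]; then
        echo sge
    else
        echo none
    fi
}

# Print one "host slots" pair per line for the nodes assigned to this job.
list_hosts() {
    case "$(detect_scheduler)" in
        pbs)
            # Assumed format: $PBS_NODEFILE repeats a hostname once per slot.
            sort "$PBS_NODEFILE" | uniq -c | awk '{print $2, $1}'
            ;;
        sge)
            # Assumed format: host slots queue processor-range
            awk '{print $1, $2}' "$PE_HOSTFILE"
            ;;
        *)
            # No scheduler found: fall back to the local machine, one slot.
            echo "$(hostname) 1"
            ;;
    esac
}
```

The "host slots" pairs are the raw material from which a launch script can build whatever node list GAMESS needs.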
Management of the compute nodes' scratch directories. When GAMESS runs on our cluster, the various "scratch" files used by each individual node must have a place to live that is rapidly accessible by each compute node. This aspect of the cluster had me quite confused: which of the GAMESS files were to be located on the "frontend" node of the cluster, and which were to be located on the nodes themselves?
I also didn't realize, initially, that in a GAMESS run, not all compute nodes are created equal. When a GAMESS job is launched, its input file is passed to a "master node" that acts as the "clearinghouse" (sorry for the nontechnical expressions here, I'm just a chemist) for the work being done by the other "slave" nodes. Some files need to be kept on the master node (because they need not be accessed by the slave nodes), but most must be maintained locally on each slave node.
The "stock" GAMESS script will try to point the SCR (scratch directory) to the current directory ($PWD), which will resolve to somewhere that will be inaccessible to the slave nodes. On the Rocks Cluster, each individual node has its own largest local volume named /state/partition1, and so we want to set SCR to point to that directory. See my 'gamess' shell script. Beyond changing where SCR points, I included a loop in the shell script that creates the GAMESS-expected "scr" directory on each assigned node, and (if necessary) purges any existing scratch files from the node to avoid filling the disks with 'orphaned' files (e.g. from aborted runs). A copy of the input file is placed on EACH node.
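The per-node loop can be sketched roughly as follows. Again, this is a simplified illustration and not the contents of my 'gamess' script. The names prepare_scratch, RSH and RCP are hypothetical, the scratch path /state/partition1/scr is an assumed layout, and naming the copied input $JOB.F05 follows what I believe the stock launch script does; check your own script before relying on it.

```shell
#!/bin/bash
# Illustrative sketch only; names and paths are assumptions.
RSH="${RSH:-ssh}"                      # remote shell command (hypothetical knob)
RCP="${RCP:-scp}"                      # remote copy command (hypothetical knob)
SCR="${SCR:-/state/partition1/scr}"    # assumed node-local scratch directory

# Usage: prepare_scratch JOB INPUT_FILE HOST [HOST...]
prepare_scratch() {
    local job="$1" input="$2" host
    shift 2
    for host in "$@"; do
        # Create the scratch directory on the node if it is missing.
        $RSH "$host" "mkdir -p '$SCR'" || return 1
        # Purge orphaned files left by earlier (e.g. aborted) runs of this job.
        $RSH "$host" "rm -f '$SCR/$job'.*" || return 1
        # Place a copy of the input file on EACH node.
        $RCP "$input" "$host:$SCR/$job.F05" || return 1
    done
}
```

A call such as prepare_scratch myfile myfile.inp compute-0-1 compute-0-2 (hostnames invented) would then be made once, before GAMESS itself is started.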
GAMESS Run Settings. Because our cluster does not have an extraordinarily fast private LAN, my jobs run fastest when I avoid inter-node communication as much as possible. For this reason (and because it requires memory configuration of the whole cluster, of which I am but one user) I do NOT use the GAMESS DDI memory capability. I specify only local memory using $SYSTEM PARALL=.TRUE. MWORDS = 300 $END in my calculations.
With larger molecules, even having "scratch" files on the local nodes' disks is not fast enough, and things run faster if I use the DIRSCF=.TRUE. flag in the $SCF section of the input deck. This means all SCF integrals are recalculated on the fly, which increases CPU effort, but it's faster to recalculate than to recover the previous results from disk. In benchmarks with my personal linux_64 box, even an Intel SSD was too slow with DIRSCF=.FALSE. -- my calculations predominantly were waiting for SSD access during calculations with molecules of N=48 and up. So even with a solid state drive on every node, I would not expect our cluster to benefit over DIRSCF=.TRUE. But again, your mileage may vary.
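For reference, the two run-setting groups discussed above can be written out by a small helper like the one below. This is only a sketch: the function name gamess_run_settings is invented, and the memory value simply defaults to the 300 I use. GAMESS expects the "$" of a group name in column 2, which is why each line starts with a single space.

```shell
#!/bin/bash
# Illustrative sketch only; the function name is hypothetical.

# Print the $SYSTEM and $SCF groups; optional argument: MWORDS value.
gamess_run_settings() {
    local mwords="${1:-300}"
    # The escaped \$ keeps the shell from expanding the GAMESS group names.
    cat <<EOF
 \$SYSTEM PARALL=.TRUE. MWORDS=$mwords \$END
 \$SCF DIRSCF=.TRUE. \$END
EOF
}
```

The output is meant to be pasted (or redirected) into the top of an input deck alongside the rest of your usual groups.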
HERE is a tarball containing modified versions of the scripts that get generated by the GAMESS roll. It includes commands that manage scratch directories on the nodes assigned to the GAMESS jobs by the SGE engine. It is NOT intended to work on your cluster, but rather to be a "study guide" for your own modifications to get a new system running. Here is a brief description of each file:
gamess: This script includes most of the changes discussed above.
gmsrun.bat: This script calls the gamess script above and passes it the version of GAMESS to run. It also specifies default values for that version (we can have more than one on the cluster; it's 00 by default), the number of cores, the naming of the log file, etc.
rungamess: This script is what I actually use to run GAMESS jobs. It makes things much easier, since it's not necessary to remember SGE or GAMESS parameters. This script takes only two arguments: the input file and the number of cores. So the user changes to ~/gamess and types (for example):
./rungamess myfile 24
This starts a run using the input file ~/gamess/myfile.inp (it must have that extension) on 24 cores (e.g. 6 nodes, if you have quad-core nodes). The script names the job "myfile" and runs GAMESS version 00 by default. Notice that this script calls qsub and passes the SGE parameters to gmsrun.bat (above). The parameters to the qsub command override the default SGE parameters that are in the gmsrun.bat file, since SGE gives preference to the qsub command parameters.
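The general shape of such a wrapper can be sketched like this. It is not the rungamess script from the tarball. The function name build_qsub_cmd, the parallel environment name "orte", the log file name, and the order of the arguments handed to gmsrun.bat are all assumptions for illustration; the qsub options themselves (-N, -pe, -cwd, -j, -o) are standard SGE. The function only prints the command so you can inspect it before submitting anything.

```shell
#!/bin/bash
# Illustrative sketch only; names and argument order are assumptions.
PE_NAME="${PE_NAME:-orte}"          # SGE parallel environment (site-specific)
GMS_VERSION="${GMS_VERSION:-00}"    # GAMESS version tag, 00 by default

# Usage: build_qsub_cmd JOB NCORES   (expects JOB.inp in the current directory)
build_qsub_cmd() {
    local job="$1" ncores="$2"
    if [ ! -f "$job.inp" ]; then
        echo "input file $job.inp not found" >&2
        return 1
    fi
    # Options given on the qsub command line take precedence over
    # any defaults embedded in the submitted script.
    echo qsub -N "$job" -pe "$PE_NAME" "$ncores" -cwd -j y -o "$job.log" \
        ./gmsrun.bat "$job" "$GMS_VERSION" "$ncores"
}
```

Running build_qsub_cmd myfile 24 prints the qsub line for a 24-core job; once it looks right for your site, the printed line can be run as is.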
Good Luck and Good Computing!
Articles, media, art, and figures are ©2015 by John Silzel, Ph.D. All rights reserved.
For permission to reproduce or use these materials, please contact info (at) silzel.com