Makeflow Tutorial
This tutorial walks through installing CCTools, creating and running a Makeflow, and using Makeflow together with Work Queue to run your workflow on different execution resources. More information can be found at http://ccl.cse.nd.edu/. For details on Makeflow execution, see http://ccl.cse.nd.edu/software/manuals/makeflow.html; for Work Queue, see http://ccl.cse.nd.edu/software/manuals/workqueue.html.
Download and Installation
First log in to workflow.iu.xsede.org using ssh, PuTTY, or a similar tool. Download and install the cctools software in your home directory as follows:
If you use bash then do this to set your path:
If you use tcsh instead, then do this:
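In either shell the goal is the same: put the cctools bin directory at the front of your search path. A minimal bash sketch follows, assuming cctools was installed under `$HOME/cctools` (an assumed prefix, so adjust it to wherever you installed); the tcsh form is given as a comment.

```shell
# Assumption: cctools was installed with prefix $HOME/cctools.
CCTOOLS_HOME="$HOME/cctools"

# Prepend the cctools bin directory so its commands are found first.
export PATH="$CCTOOLS_HOME/bin:$PATH"

# tcsh equivalent (type this in tcsh, not bash):
#   setenv PATH $HOME/cctools/bin:$PATH
```

To make the setting permanent, the export line can be added to your shell startup file.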
Now double check that you can run the various commands, like this:
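One way to check is to ask the shell whether each command can be found. The sketch below tests the three commands this tutorial relies on; the choice of commands is an assumption, and you can extend the list as needed.

```shell
# Report whether each cctools command used in this tutorial is on PATH.
missing=0
for tool in makeflow work_queue_worker work_queue_status; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        missing=$((missing + 1))
    fi
done
echo "$missing command(s) missing"
```

If anything is reported missing, recheck the installation step and your PATH setting before continuing.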
Makeflow Example
Let's begin by using Makeflow to run a handful of simulation codes. First, make and enter a clean directory to work in:
Now, download this program, which performs a highly sophisticated simulation of black holes colliding together:
Try running it once, just to see what it does:
Now, let's use Makeflow to run several simulations.
Create a file called example.makeflow and paste the following
text into it:
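As an illustration of the file format, the sketch below generates a four-rule `example.makeflow` from the shell. The program name `simulation.py`, its single numeric argument, and the `output.N` file names are assumptions made for this example, not necessarily the tutorial's actual workflow. Each rule follows the Make convention: a `targets: sources` line followed by a tab-indented command.

```shell
# Assumptions: the downloaded program is ./simulation.py, it takes one numeric
# argument, and it writes its result to standard output.
: > example.makeflow   # start from an empty file
for i in 1 2 3 4; do
    # Rule: output.N depends on simulation.py; the command line begins with a tab.
    printf 'output.%s: simulation.py\n\t./simulation.py %s > output.%s\n\n' \
        "$i" "$i" "$i" >> example.makeflow
done
cat example.makeflow
```

Listing `simulation.py` as a source tells Makeflow that the program itself must be shipped along with each job when the rule runs on a remote machine.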
To run it on your local machine, one job at a time:
Note that if you run it a second time, nothing will happen, because all of the files have already been built:
Use the -c option to clean everything up before trying it again:
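The run, re-run, and clean cycle described above can be sketched as one small function. It assumes `example.makeflow` is in the current directory and uses `-j 1` to limit local execution to one job at a time.

```shell
# Run the workflow locally, run it again, then clean up.
run_local_demo() {
    makeflow -j 1 example.makeflow   # first run: executes every rule, one job at a time
    makeflow -j 1 example.makeflow   # second run: all targets exist, so no work is done
    makeflow -c example.makeflow     # -c removes the files the workflow generated
}

# Only attempt the run when makeflow is actually installed.
if command -v makeflow >/dev/null 2>&1; then
    run_local_demo
else
    echo "makeflow not found on PATH"
fi
```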
To save time in the following sections, you can paste this into the terminal to fetch and build the software:
Running Makeflow in Stampede
Next, repeat the steps in Download and Installation and Makeflow Example on Stampede. Stampede uses the SLURM batch system, so you can run the jobs through SLURM like this:
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
Running Makeflow in Gordon
Next, repeat the steps in Download and Installation and Makeflow Example on Gordon. Gordon uses the TORQUE batch system, so you can run the jobs through TORQUE like this:
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
Running Makeflow in Comet
Next, repeat the steps in Download and Installation and Makeflow Example on Comet. Comet uses the SLURM batch system, so you can run the jobs through SLURM like this:
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
Running Makeflow in Bridges
Next, repeat the steps in Download and Installation and Makeflow Example on Bridges. Bridges uses the SLURM batch system, so you can run the jobs through SLURM like this:
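The four cluster sections above differ only in the batch type given to Makeflow's `-T` option. The sketch below captures that mapping; the helper names `batch_type_for` and `submit_cmd` are hypothetical, and the commands are printed rather than executed so the block is safe to paste anywhere. Options for the priority queues are not included.

```shell
# Map each cluster named in this tutorial to the batch type passed to makeflow -T.
batch_type_for() {
    case "$1" in
        stampede|comet|bridges) echo slurm ;;
        gordon)                 echo torque ;;
        *)                      echo local ;;   # fallback: run on this machine
    esac
}

# Print the command for a cluster; remove "echo" inside to run it instead.
submit_cmd() {
    echo makeflow -T "$(batch_type_for "$1")" example.makeflow
}

submit_cmd stampede
submit_cmd gordon
```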
Running Makeflow Elsewhere
Makeflow with Work Queue
You will notice that a workflow can run very slowly if you submit each job to a batch system such as TORQUE, because it typically takes 30 seconds or so to start each batch job running. To get around this limitation, we provide the Work Queue system. This allows Makeflow to function as a master process that quickly dispatches work to remote worker processes.
With Makeflow started in Work Queue mode in your first shell, open up another shell and run a single worker process:
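A minimal sketch of the two-shell setup, assuming port 9123 and a worker running on the same host as the master. Both values are illustrative, and the function names are hypothetical wrappers around the real `makeflow` and `work_queue_worker` commands.

```shell
PORT=9123   # assumption: any free port will do; 9123 is only an example

# Shell 1: run Makeflow as the Work Queue master, listening on $PORT.
start_master() {
    makeflow -T wq -p "$PORT" example.makeflow
}

# Shell 2: connect one worker to the master by host name and port.
start_worker() {
    work_queue_worker "${1:-localhost}" "$PORT"
}
```

Call `start_master` in the first shell and `start_worker` in the second, passing the master's host name to `start_worker` if the worker runs on a different machine.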
Go back to your first shell and observe that the makeflow has finished.
Of course, remembering port numbers all the time gets old fast,
so try the same thing again, but using a project name:
Now open up another shell and run your worker with a project name:
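With a project name, the master advertises itself and the worker looks it up by name, so neither side needs a host or port. The sketch below assumes a project name built from your user name; any name unique to you should work, and the function names are hypothetical.

```shell
# Assumption: a project name derived from the login name is unique enough.
PROJECT="${USER:-user}-makeflow"

# Shell 1: start the master under the project name.
start_master_named() {
    makeflow -T wq -N "$PROJECT" example.makeflow
}

# Shell 2: start a worker that finds the master by project name.
start_worker_named() {
    work_queue_worker -N "$PROJECT"
}
```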
Running Workers in SLURM
Of course, we don't really want to run workers on the head node, so let's instead start one worker using SLURM:

XSEDE_2016_1" 1
Creating worker submit scripts in nhazekam-workers...
-----------------------------------------------------------------
Welcome to the Stampede Supercomputer
-----------------------------------------------------------------
--> Verifying valid submit host (login2)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02626/nhazekam)...OK
--> Verifying availability of your work dir (/work/02626/nhazekam)...OK
--> Verifying availability of your scratch dir (/scratch/02626/nhazekam)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...OK
--> Checking available allocation (TG-CHE140058)...OK
Submitted batch job 5536202

Use the squeue command to observe that they are submitted to SLURM:
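Putting the two steps together, a hedged sketch of submitting workers and then checking on them. It assumes the `-N` project-name option of `slurm_submit_workers` and a hypothetical project name; any reservation or queue options needed on your cluster are left out.

```shell
PROJECT="${USER:-user}-makeflow"   # hypothetical project name

submit_slurm_workers() {
    # Submit $1 workers (default 1) that look up the master by project name.
    slurm_submit_workers -N "$PROJECT" "${1:-1}"
    # List your own jobs to confirm the workers are queued or running.
    squeue -u "$USER"
}
```

Calling `submit_slurm_workers` with no argument submits a single worker, matching the text above.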
Now, restart your Makeflow and it will use the workers already running in SLURM:
You can leave the workers running there, if you want to start another Makeflow. They will remain until they have been idle for fifteen minutes, then will stop automatically. If you add the -d all option to Makeflow, it will display debugging information that shows where each task was sent, when it was returned, and so forth:
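As a sketch, the debugging run might look like the following, assuming the workflow is run in Work Queue mode under a hypothetical project name. The function name is also hypothetical.

```shell
PROJECT="${USER:-user}-makeflow"   # hypothetical project name

# Run the workflow through Work Queue with all debugging output enabled.
run_with_debug() {
    makeflow -T wq -N "$PROJECT" -d all example.makeflow
}
```

The debug output is verbose, so it can help to redirect it to a file and search that afterwards.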
Running Workers in TORQUE
Of course, we don't really want to run workers on the head node, so let's instead start one worker using TORQUE:

-p "-W x=FLAGS:ADVRES:1467960506" 1
Creating worker submit scripts in xdtr16-workers...
2632653.gordon-fe2.local

Use the qstat command to observe that they are submitted to TORQUE:
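The TORQUE case follows the same pattern as SLURM, with `torque_submit_workers` and `qstat` in place of the SLURM commands. The sketch assumes the `-N` project-name option and a hypothetical project name, and omits the reservation flags passed with `-p` in the fragment above.

```shell
PROJECT="${USER:-user}-makeflow"   # hypothetical project name

submit_torque_workers() {
    # Submit $1 workers (default 1) that look up the master by project name.
    torque_submit_workers -N "$PROJECT" "${1:-1}"
    # List your own jobs to confirm the workers are queued or running.
    qstat -u "$USER"
}
```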
Now, restart your Makeflow and it will use the workers already running in TORQUE:
You can leave the workers running there, if you want to start another Makeflow. They will remain until they have been idle for fifteen minutes, then will stop automatically. If you add the -d all option to Makeflow, it will display debugging information that shows where each task was sent, when it was returned, and so forth: