Makeflow Tutorial

This tutorial walks through the installation of CCTools, the creation and running of a Makeflow, and the use of Makeflow together with Work Queue to leverage different execution resources. More information can be found at http://ccl.cse.nd.edu/. For specifics on Makeflow execution see http://ccl.cse.nd.edu/software/manuals/makeflow.html, and for Work Queue see http://ccl.cse.nd.edu/software/manuals/workqueue.html.

Download and Installation

First, log into workflow.iu.xsede.org using ssh, PuTTY, or a similar tool. Then download and install the cctools software in your home directory as follows:
$ cd $HOME
$ wget http://ccl.cse.nd.edu/software/files/cctools-5.4.16-source.tar.gz
$ tar xvzf cctools-5.4.16-source.tar.gz
$ cd cctools-5.4.16-source
$ ./configure
$ make
$ make install
$ cd $HOME
If you use bash then do this to set your path:
$ export PATH=${PATH}:${HOME}/cctools/bin
If you use tcsh instead, then do this:
$ setenv PATH ${PATH}:${HOME}/cctools/bin
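Either way, the change lasts only for the current session. To make it persistent across logins, you can append the same line to your shell startup file (a sketch, assuming bash):

```shell
# Persist the cctools PATH addition across logins (bash assumed;
# tcsh users would add the setenv line to ~/.tcshrc instead).
echo 'export PATH=${PATH}:${HOME}/cctools/bin' >> ~/.bashrc
```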
Now double check that you can run the various commands, like this:
$ makeflow -v
$ work_queue_worker -v
$ work_queue_status

Makeflow Example

Let's begin by using Makeflow to run a handful of simulation codes. First, make and enter a clean directory to work in:
$ cd $HOME
$ mkdir tutorial
$ cd tutorial
Now, download this program, which performs a highly sophisticated simulation of black holes colliding together:
$ wget http://ccl.cse.nd.edu/software/tutorials/xsede16/simulation.py
Try running it once, just to see what it does:
$ chmod 755 simulation.py
$ ./simulation.py 5
Now, let's use Makeflow to run several simulations. Create a file called example.makeflow and paste the following text into it:
input.txt:
    LOCAL /bin/echo "Hello Makeflow!" > input.txt

output.1: simulation.py input.txt
    ./simulation.py 1 < input.txt > output.1

output.2: simulation.py input.txt
    ./simulation.py 2 < input.txt > output.2

output.3: simulation.py input.txt
    ./simulation.py 3 < input.txt > output.3

output.4: simulation.py input.txt
    ./simulation.py 4 < input.txt > output.4
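The four simulation rules above follow an obvious pattern, so if you later need many more of them, a short shell loop can generate the makeflow file instead of writing each rule by hand. A sketch that reproduces the same five rules as above:

```shell
# Generate example.makeflow with one rule per simulation.
# N is the number of simulations; 4 reproduces the file above.
N=4
{
  printf 'input.txt:\n'
  printf '    LOCAL /bin/echo "Hello Makeflow!" > input.txt\n'
  for i in $(seq 1 "$N"); do
    printf '\noutput.%d: simulation.py input.txt\n' "$i"
    printf '    ./simulation.py %d < input.txt > output.%d\n' "$i" "$i"
  done
} > example.makeflow
```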
To run it on your local machine, one job at a time:
$ makeflow example.makeflow -j 1
parsing example.makeflow...
checking example.makeflow for consistency...
example.makeflow has 5 rules.
starting workflow....
submitting job: /bin/echo "Hello Makeflow!" > input.txt
submitted job 32447
job 32447 completed
submitting job: ./simulation.py 4 < input.txt > output.4
submitted job 32451
job 32451 completed
submitting job: ./simulation.py 3 < input.txt > output.3
submitted job 32461
job 32461 completed
submitting job: ./simulation.py 2 < input.txt > output.2
submitted job 32467
job 32467 completed
submitting job: ./simulation.py 1 < input.txt > output.1
submitted job 32473
job 32473 completed
nothing left to do.
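The -j flag caps how many jobs run at once. Since the four simulations are independent of each other, raising the limit lets them run concurrently; the input.txt rule still runs first, because every simulation depends on it:

```shell
# Run up to four local jobs at once.
makeflow example.makeflow -j 4
```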
Note that if you run it a second time, nothing will happen, because all of the files are built:
$ makeflow example.makeflow -j 1
parsing example.makeflow...
checking example.makeflow for consistency...
example.makeflow has 5 rules.
recovering from log file example.makeflow.makeflowlog...
starting workflow....
nothing left to do.
Use the -c option to clean everything up before trying it again:
$ makeflow -c example.makeflow
For convenience in the sections that follow, this block can be pasted into the terminal on each new machine to fetch and build the software:
cd $HOME
wget http://ccl.cse.nd.edu/software/files/cctools-5.4.16-source.tar.gz
tar xvzf cctools-5.4.16-source.tar.gz
cd cctools-5.4.16-source
./configure
make
make install
cd $HOME

Running Makeflow on Stampede

First, log into Stampede via login.xsede.org:
$ gsissh stampede
Next, repeat the steps listed in Download and Installation as well as Makeflow Example. Stampede uses the SLURM batch system, so you can run the jobs through SLURM like this:
$ makeflow -T slurm -B "-p normal -t 60" example.makeflow
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
$ makeflow -T slurm -B "-p normal-mic -t 60 --reservation=XSEDE_2016_1" example.makeflow
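The string given to -B is handed to the underlying sbatch command verbatim, so any sbatch option can go there. For example, to also charge a specific allocation (the allocation name below is a placeholder, not one from this tutorial):

```shell
# -B contents are passed straight to sbatch; replace TG-XXXXXX with
# your own allocation name.
makeflow -T slurm -B "-p normal -t 60 -A TG-XXXXXX" example.makeflow
```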

Running Makeflow on Gordon

First, log into Gordon via login.xsede.org:
$ gsissh gordon
Next, repeat the steps listed in Download and Installation as well as Makeflow Example. Gordon uses the TORQUE batch system, so you can run the jobs through TORQUE like this:
$ makeflow -T torque example.makeflow
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
$ makeflow -T torque -B "-W x=FLAGS:ADVRES:1467960506" example.makeflow

Running Makeflow on Comet

First, log into Comet via login.xsede.org:
$ gsissh comet
Next, repeat the steps listed in Download and Installation as well as Makeflow Example. Comet uses the SLURM batch system, so you can run the jobs through SLURM like this:
$ makeflow -T slurm -B "-p compute -t 60" example.makeflow
Normally, this would submit to the normal queue, but for today there is a priority queue set up:
$ makeflow -T slurm -B "-p compute -t 60 --res=XSEDE16Mats" example.makeflow

Running Makeflow on Bridges

First, log into Bridges via login.xsede.org:
$ gsissh bridges
Next, repeat the steps listed in Download and Installation as well as Makeflow Example. Bridges uses the SLURM batch system, so you can run the jobs through SLURM like this:
$ makeflow -T slurm -B "-p RM-shared" example.makeflow

Running Makeflow Elsewhere

We also support other batch systems, such as SGE and HTCondor. If you have access to a Condor pool, then you can direct Makeflow to run your jobs there:
$ makeflow -T condor example.makeflow

Makeflow with Work Queue

You will notice that a workflow can run very slowly if you submit each job through a batch system such as SLURM or TORQUE, because it typically takes 30 seconds or so to start each batch job. To get around this limitation, we provide the Work Queue system. This allows Makeflow to function as a master process that quickly dispatches work to remote worker processes.
$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -p 0
listening for workers on port XXXX.
...
Now open up another shell and run a single worker process:
$ work_queue_worker localhost XXXX
(substitute the port number printed by Makeflow for XXXX)
Go back to your first shell and observe that the makeflow has finished. Of course, remembering port numbers all the time gets old fast, so try the same thing again, but using a project name:
$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -p 0 -N MYPROJECT-$USER
listening for workers on port XXXX...
Now open up another shell and run your worker with a project name:
$ work_queue_worker -N MYPROJECT-$USER
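With project names, the master and its workers find each other through the catalog server, so neither side needs to know the port. You can confirm that your master is advertising itself with work_queue_status:

```shell
# List advertised masters and pick out your project by name.
work_queue_status | grep "MYPROJECT-$USER"
```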

Running Workers in SLURM

Of course, we don't really want to run workers on the head node, so let's instead start one worker using SLURM:
$ slurm_submit_workers -N MYPROJECT-$USER -p "--reservation=XSEDE_2016_1" 1
Creating worker submit scripts in nhazekam-workers...                                 
-----------------------------------------------------------------
              Welcome to the Stampede Supercomputer
-----------------------------------------------------------------

--> Verifying valid submit host (login2)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02626/nhazekam)...OK
--> Verifying availability of your work dir (/work/02626/nhazekam)...OK
--> Verifying availability of your scratch dir (/scratch/02626/nhazekam)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...OK
--> Checking available allocation (TG-CHE140058)...OK
Submitted batch job 5536202
Use the squeue command to observe that they are submitted to SLURM:
$ squeue -u nhazekam
JOBID   PARTITION NAME     USER      ST  TIME NODES NODELIST(REASON)
5536266 normal    wqWorker nhazekam  R   0:22 1     c448-904
Now, restart your Makeflow and it will use the workers already running in SLURM:
$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -N MYPROJECT-$USER
listening for workers on port XXXX....
You can leave the workers running there if you want to start another Makeflow. They will remain until they have been idle for fifteen minutes, then stop automatically.
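If fifteen minutes is too short for your pace of work, a worker's idle timeout can be raised with -t when you start it; a sketch, keeping a worker alive for up to an hour of idleness:

```shell
# -t sets the worker's idle timeout in seconds (default: 900).
work_queue_worker -N MYPROJECT-$USER -t 3600
```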

If you add the -d all option to Makeflow, it will display debugging information that shows where each task was sent, when it was returned, and so forth:

$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -N MYPROJECT-$USER -d all
listening for workers on port XXXX.

Running Workers in TORQUE

Of course, we don't really want to run workers on the head node, so let's instead start one worker using TORQUE:
$ torque_submit_workers -N MYPROJECT-TEST -p "-W x=FLAGS:ADVRES:1467960506" 1
Creating worker submit scripts in xdtr16-workers...
2632653.gordon-fe2.local
Use the qstat command to observe that they are submitted to TORQUE:
$ qstat -u xdtr16
gordon-fe2.local: 
                                                                      Req'd    Req'd     Elap
Job ID                  Username Queue  Jobname    SessID  NDS   TSK  Memory   Time    S   Time
----------------------- -------- ------ ---------- ------ ----- ----- ------ --------- - ---------
2632653.gordon-fe2.loc  xdtr16   normal worker.sh    --     1     4     --    01:00:00 R  00:00:01
Now, restart your Makeflow and it will use the workers already running in TORQUE:
$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -N MYPROJECT-TEST
listening for workers on port XXXX.
...
You can leave the workers running there if you want to start another Makeflow. They will remain until they have been idle for fifteen minutes, then stop automatically.

If you add the -d all option to Makeflow, it will display debugging information that shows where each task was sent, when it was returned, and so forth:
$ makeflow -c example.makeflow
$ makeflow -T wq example.makeflow -N MYPROJECT-TEST -d all
listening for workers on port XXXX.