Makeflow Tutorial

Download and Installation

First, log into Atmosphere Cloud and start an instance using the image CCTools_7.0.19 v1.0.

Once the instance is running, log in to it using ssh, PuTTY, or a similar tool.

This image already has CCTools installed. When you have a shell, set your environment as follows:

export CCTOOLS_HOME=/opt/cctools-7.0.19-x86_64-centos7
export PATH=${CCTOOLS_HOME}/bin:$PATH
export PYTHONPATH=${CCTOOLS_HOME}/lib/python2.7/site-packages:${PYTHONPATH}
export TCP_LOW_PORT=6000
export TCP_HIGH_PORT=6100
You can test your PATH setup by doing:
makeflow -v
makeflow version 7.0.19 FINAL (released 2019-10-18 08:42:56 -0400)
Built by btovar@disc11.crc.nd.edu on 2019-10-18 08:42:56 -0400
System: Linux disc11.crc.nd.edu 3.10.0-1062.1.1.el7.x86_64 #1 SMP Tue Aug 13 18:39:59 ...
Configuration: --strict --with-cvmfs-path /opt/vc3/cctools-deps/cvmfs --with-ext2fs-path ...
which should print version information of CCTools. Similarly, you can test your PYTHONPATH setup with:
python -c 'import work_queue'
We recommend adding these environment variable settings to your ~/.bashrc file so they persist across logins.
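For example, a minimal way to do that (assuming the tutorial image paths shown above; adjust them if yours differ) is to append the same lines to the end of ~/.bashrc:

# Append the tutorial environment setup to ~/.bashrc so new shells pick it up.
# (Paths assume the CCTools_7.0.19 tutorial image.)
cat >> ~/.bashrc <<'EOF'
export CCTOOLS_HOME=/opt/cctools-7.0.19-x86_64-centos7
export PATH=${CCTOOLS_HOME}/bin:$PATH
export PYTHONPATH=${CCTOOLS_HOME}/lib/python2.7/site-packages:${PYTHONPATH}
export TCP_LOW_PORT=6000
export TCP_HIGH_PORT=6100
EOF
source ~/.bashrc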

If you are using another image, you will first need to download and unpack CCTools. For example, to install it in your home directory:

# Do this only if the steps above failed because you are not using
# the image provided for this tutorial.
#
cd ${HOME}

wget http://ccl.cse.nd.edu/software/files/cctools-7.0.19-x86_64-centos7.tar.gz
tar xvzf cctools-7.0.19-x86_64-centos7.tar.gz
mv cctools-7.0.19-x86_64-centos7 cctools

export CCTOOLS_HOME=${HOME}/cctools
export PATH=${CCTOOLS_HOME}/bin:$PATH
export PYTHONPATH=${CCTOOLS_HOME}/lib/python2.7/site-packages:${PYTHONPATH}
export TCP_LOW_PORT=6000
export TCP_HIGH_PORT=6100
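Whichever install you used, the same checks as above (a quick sanity sketch) confirm that both the binaries and the Python module are found from the new location:

# Both commands should succeed and point at your CCTools install.
makeflow -v
python -c 'import work_queue; print(work_queue.__file__)'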

Makeflow Example

Let's begin by using Makeflow to run a handful of simulation codes. First, make and enter a clean directory to work in:
cd $HOME
mkdir tutorial
cd tutorial
Now, download this program, which performs a highly sophisticated mock simulation of black holes colliding together:
wget http://ccl.cse.nd.edu/software/tutorials/acic19/simulation.py
chmod 755 simulation.py
Try running it once, just to see what it does:
./simulation.py 5
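Each rule in the makeflow below runs this same program, reading input.txt on standard input and writing a numbered output file. As a quick sketch of what one rule does, you can reproduce it by hand (and then remove the files so Makeflow builds them itself):

# Run one "rule" by hand, just to see the pattern used below.
echo "Hello Makeflow!" > input.txt
./simulation.py 1 < input.txt > output.1
cat output.1
# Remove the files again so Makeflow creates them itself.
rm input.txt output.1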
Now, let's use Makeflow to run several simulations. Create a file called example.makeflow and paste the following text into it:
input.txt:
	LOCAL echo "Hello Makeflow!" > input.txt

output.1: simulation.py input.txt
	./simulation.py 1 < input.txt > output.1

output.2: simulation.py input.txt
	./simulation.py 2 < input.txt > output.2

output.3: simulation.py input.txt
	./simulation.py 3 < input.txt > output.3

output.4: simulation.py input.txt
	./simulation.py 4 < input.txt > output.4
To run it on your local machine, one job at a time:
makeflow example.makeflow -j 1 
Note that if you run it a second time, nothing will happen, because all of the files are built:
parsing example.makeflow...
local resources: 1 cores, 3789 MB memory, 15842 MB disk
max running local jobs: 1
checking example.makeflow for consistency...
example.makeflow has 5 rules.
recovering from log file example.makeflow.makeflowlog...
starting workflow....
nothing left to do.
Use the -c option to clean everything up before trying it again:
makeflow -c example.makeflow
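Optionally, before moving on, you can re-run the workflow with more than one local job at a time by raising the -j limit (a quick sketch using only the options shown above):

# Allow up to four local jobs to run at once, then clean up again.
makeflow example.makeflow -j 4
makeflow -c example.makeflow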

Running Makeflow with Work Queue

Work Queue allows Makeflow to function as a master process that quickly dispatches work to remote worker processes.
makeflow -c example.makeflow
makeflow -T wq example.makeflow -p 0
listening for workers on port XXXX.
...
Now open up another shell (e.g., with ssh to your instance again), and run a single worker process:
# You may need to reset your PATH:
export CCTOOLS_HOME=/opt/cctools-7.0.19-x86_64-centos7
export PATH=${CCTOOLS_HOME}/bin:$PATH

work_queue_worker localhost XXXX
Go back to your first shell and observe that the makeflow has finished. Of course, remembering port numbers all the time gets old fast, so try the same thing again, but using a project name:
makeflow -c example.makeflow
makeflow -T wq example.makeflow -M MYPROJECT-${USER}
listening for workers on port XXXX
...
Now open up another shell and run your worker with a project name:
work_queue_worker -M MYPROJECT-${USER}
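Workers accept a number of other useful options as well. For example (a sketch; run work_queue_worker --help on your image to see the full set), you can limit a worker to a single core and have it exit after fifteen minutes of idleness:

# A single-core worker that exits after 900 seconds without work.
work_queue_worker -M MYPROJECT-${USER} --cores 1 -t 900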

JX

Now we're going to express the same makeflow as above using JX. Create a file called example.jx and paste the following text into it:
{
"rules":[
         {
            "outputs":["input.txt"],
            "command":"echo \"Hello Makeflow!\" > input.txt",
            "local_job":true,
         },
         {
            "outputs":[format("output.%d",i)],
            "inputs":["simulation.py","input.txt"],
            "command":format("./simulation.py %d < input.txt > output.%d", i, i),
         } for i in range(1,5)
        ],
}
To run it on your local machine, one job at a time:
makeflow --jx example.jx -j 1 
Use the -c option to clean everything up before trying it again:
makeflow -c --jx example.jx
You can use the tool jx2json to see the expanded rules:
jx2json --pretty example.jx
{
  ...
  {
    "outputs": [ "output.1" ],
    "inputs": [ "simulation.py", "input.txt" ],
    "command": "./simulation.py 1 < input.txt > output.1"
  },
  {
    "outputs": [ "output.2" ],
    ...

Now, to take advantage of JX's ability to act as a template, we will modify the existing example.jx to use a variable range. Change this line:
    } for i in range(1,5)
to:
    } for i in range(1,LIMIT)
To define this limit, we can either pass it as a command-line option:
makeflow --jx example.jx --jx-define "LIMIT"=5
Alternatively, we can write it into an arguments file, say args.jx, that configures the makeflow:
{
    "LIMIT":5,
}
... and run it with:
makeflow --jx example.jx --jx-args args.jx

Resource Specifications

Now we want to specify resources using the Make syntax:

Add this to the top of your example.makeflow
# All the following tasks belong to the group 'analysis'
# and use the stated resources
.MAKEFLOW CATEGORY analysis
.MAKEFLOW CORES 1
.MAKEFLOW MEMORY 100
.MAKEFLOW DISK 200
Now run it on your local machine, letting the resources determine the concurrency:
makeflow example.makeflow
If you are using a medium-size image, there are 4 cores available, and since each rule uses one core, we will see four rules running concurrently.

By default, makeflow will try to use all the cores available on the machine. You can control that with the --local-cores option. Use the -c option to clean everything up before trying it again:

makeflow -c example.makeflow
To see how resources can be used to limit the concurrency at larger scales, we will "scale down" this example by changing the number of expected cores.
makeflow --local-cores=2 example.makeflow
We should see two tasks running concurrently.
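To watch progress while a workflow is running, you can point the makeflow_monitor tool (also included with CCTools) at the transaction log from a second shell; a small sketch:

# In another shell: display the current state of the workflow from its log.
makeflow_monitor example.makeflow.makeflowlog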

Similarly, we can declare resources in JX. Create the file example_with_resources.jx with the following contents:

{
"categories":{
                "preprocess":{
                    "resources":{
                                 "cores":4, "memory":200, "disk":500
                                }
                },
                "analysis":{
                    "resources":{
                                 "cores":2, "memory":100, "disk":250
                                }
                }
},
"rules":[
         {
            "category":"preprocess",
            "outputs":["input.txt"],
            "command":"echo \"Hello Makeflow!\" > input.txt",
            "local_job":true,
         },
         {
            "category":"analysis",
            "outputs":[format("output.%d",i)],
            "inputs":["simulation.py","input.txt"],
            "command":format("./simulation.py %d < input.txt > output.%d", i, i),
         } for i in range(1,5),
        ],
}
makeflow --local-cores=4 --jx example_with_resources.jx
As we will see Thursday, resource declarations are communicated to Work Queue, and workers automatically run as many tasks as they can fit.
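For example (a sketch using standard work_queue_worker options), a worker that advertises four cores can run two of the two-core 'analysis' tasks above at once:

# A worker advertising 4 cores, 2000 MB of memory, and 4000 MB of disk;
# Work Queue packs tasks into it according to their declared resources.
work_queue_worker -M MYPROJECT-${USER} --cores 4 --memory 2000 --disk 4000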

Visualization

The first way that we will visualize the Makeflow is by looking at the shape of the DAG:
makeflow_viz example.makeflow -D dot > example_makeflow_viz.dot

# For the next step, you need graphviz installed:
dot example_makeflow_viz.dot -Tpng > example_makeflow_viz.png
It should look something like this:

The next way that we will visualize the Makeflow is by looking at the results of the Makeflow:
# For the next step, you need gnuplot installed:

makeflow_graph_log example.makeflowlog example_graph_log.png
Note: the *.makeflowlog file is created automatically on every run. To give it a specific name, use the -l option.
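For example (a small sketch; the log name example_custom.makeflowlog is just an illustration), to write the log under a name of your own choosing and plot that file instead:

# Choose the log file name with -l, then plot that log.
makeflow -c example.makeflow
makeflow -l example_custom.makeflowlog example.makeflow
makeflow_graph_log example_custom.makeflowlog example_graph_log.png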

It should look something like this, though much smaller:

Optional: Running on PBS

Arizona HPC uses the PBS batch system. To run your jobs through PBS, do this (note that the -B option passes along a set of submission options required by UA HPC):
# This won't work from your virtual machine! Try this at the cluster frontend.
# Modify group_list with your username

makeflow -T pbs -B "-W group_list=ericlyons -q windfall -ljobtype=serial -lselect=1:ncpus=1:mem=1gb -lpack:shared" example.makeflow
If you are working at another site that uses Condor or Torque or SGE, then you would invoke makeflow like this instead:
makeflow -T condor example.makeflow
makeflow -T torque example.makeflow
makeflow -T sge example.makeflow
These won't run from your tutorial virtual machine due to firewall limitations, but the same technique works at other sites.

For Work Queue, we don't really want to run workers on the head node, so let's instead use pbs_submit_workers to set up five workers in PBS for us:

pbs_submit_workers -p"-W group_list=ericlyons -q windfall -ljobtype=serial -lselect=1:ncpus=1:mem=1gb -lpack:shared" IP-OF-MASTER PORT 5
Creating worker submit scripts in dthain-workers...
2065026.i136
2065027.i136
2065028.i136
2065029.i136
2065030.i136

Use the qstat command to observe that they are submitted to PBS:
qstat
2065027.i136               worker.sh        dthain                 0 R batch
2065028.i136               worker.sh        dthain                 0 R batch
2065029.i136               worker.sh        dthain                 0 R batch 
2065030.i136               worker.sh        dthain                 0 R batch
Now, restart your Makeflow and it will use the workers already running in PBS:
makeflow -c example.makeflow
makeflow -T wq example.makeflow -M MYPROJECT
listening for workers on port XXXX.
...
You can leave the workers running there, if you want to start another Makeflow. They will remain until they have been idle for fifteen minutes, then will stop automatically.
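If you want to remove them sooner, you can delete the worker jobs by hand with qdel, using the job IDs printed when they were submitted; a sketch with the IDs from the example above:

# Delete the worker jobs before their idle timeout expires.
qdel 2065026.i136 2065027.i136 2065028.i136 2065029.i136 2065030.i136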

If you add the -d all option to Makeflow, it will display debugging information that shows where each task was sent, when it was returned, and so forth:

makeflow -c example.makeflow
makeflow -T wq example.makeflow -M MYPROJECT -d all
listening for workers on port XXXX.
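The debug output can be quite long, so you may prefer to capture it in a file; a sketch using makeflow's -o debug-file option (the file name debug.log is just an example):

# Send the debug trace to a file instead of the terminal.
makeflow -T wq example.makeflow -M MYPROJECT -d all -o debug.log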

Homework

Now you have everything needed to do the Homework Assignment.