| CCL Home Software Community Operations | Makeflow TutorialThis tutorial will have you install CCTools into your CRC's home directory and will take you through some distributed computation examples using Makeflow. Getting StartedLogin to the CRC's Head NodeFor this tutorial, we assume you have SSH access to 
	Notre Dame CRC login nodes. 
 NOTE: If you are part of Notre Dame and do NOT have a CRC account, you can do the tutorial on cclsubmit00.cse.nd.edu and using Condor. ssh cclsubmit00.cse.nd.eduFurther instructions for running on cclsubmit00.cse.nd.edu and Condor are below. NOTE: If you are NOT part of Notre Dame, you can still do most of the tutorial on a local Linux machine, including everything except the section where workers are submitted to CRC/Condor. Download, Build, and Install CCToolsNavigate to the download page in your browser to review the most recent versions: http://www.nd.edu/~ccl/software/download.shtml Download a copy of CCTools 4.0.2wget http://www.nd.edu/~ccl/software/files/cctools-4.0.2-source.tar.gz
tar xzf cctools-4.0.2-source.tar.gzBuild and Install CCToolscd ~/cctools-4.0.2-source
./configure --prefix ~/cctools && make installSet Environment VariablesYou will need to add your CCTools directory to your $PATH: setenv PATH "${HOME}/cctools/bin:${PATH}"NOTE for non-CRC users: You will need to set the port ranges to use on cclsubmit00.cse.nd.edu. setenv WORK_QUEUE_LOW_PORT 9100
setenv WORK_QUEUE_HIGH_PORT 9500Makeflow ExampleSetupSetup a Sandbox for this TutorialIf you have CRC account, set the sandbox in your home directory. SGE has access to your home directory in the shared filesystem. mkdir ~/cctools-tutorial
cd ~/cctools-tutorial Note for non-CRC users: Create the sandbox in /tmp. This is because Condor does not have permissions to access your home directory. mkdir /tmp/cctools-tutorial-$USER
cd /tmp/cctools-tutorial-$USERCreate a sub-directory for storing Makeflow files. mkdir makeflow-simple
cd makeflow-simpleDownload simulation.py which is our application executable for this exercise. Download this Makeflow script which defines the workflow. wget http://www.nd.edu/~ccl/software/tutorials/ndtut13/simple/simulation.py
wget http://www.nd.edu/~ccl/software/tutorials/ndtut13/simple/MakeflowThe Makeflow script should look like: $ cat Makeflow
input.txt:
	LOCAL /bin/echo Hello World > input.txt
A: simulation.py input.txt
	python simulation.py 1 < input.txt > A
B: simulation.py input.txt
	python simulation.py 2 < input.txt > B
C: simulation.py input.txt
	python simulation.py 3 < input.txt > C
D: simulation.py input.txt
	python simulation.py 4 < input.txt > DRunning with Local (Multiprocess) ExecutionHere we're going to tell Makeflow to dispatch the jobs using regular local processes (no distributed computing!). This is basically the same as regular Unix Make using the -j flag. makeflow -T localIf everything worked out correctly, you should see: $ makeflow -T local
/bin/echo Hello World > input.txt
python simulation.py 4 < input.txt > D
python simulation.py 3 < input.txt > C
python simulation.py 2 < input.txt > B
python simulation.py 1 < input.txt > A
nothing left to do.Running on a batch systemIf you have a CRC account: The following code tells Makeflow to dispatch jobs using the SGE batch submission system (qsub, qdel, qstat, etc.). makeflow -T sgeNote for non-CRC users: Have Makeflow dispatch jobs to the Condor batch submission system (condor_submit, condor_rm, condor_q, etc.). makeflow -T condorYou will get as output: nothing left to do.Well... that's not right. Nothing was run! We need to clean out the generated output files and logs so Makeflow starts from a clean slate again: makeflow -cWe see it deleted the files we generated in the last run: $ makeflow -c
deleted file D
deleted file C
deleted file B
deleted file A
deleted file input.txt
deleted file ./Makeflow.makeflowlogNow let's try again: makeflow -T sgeNote for non-CRC users: Use Condor instead. makeflow -T condorWe get the output we expect: /bin/echo Hello World > input.txt
python simulation.py 4 < input.txt > D
python simulation.py 3 < input.txt > C
python simulation.py 2 < input.txt > B
python simulation.py 1 < input.txt > A
nothing left to do.Notice that the output is no different from using local execution. Makeflow is built to be execution engine agnostic. There is no difference between executing the task locally or remotely. In this case, we can confirm that the job was run on another host by looking at the output produced by the simulation: $ cat D
Running on host d6copt096.crc.nd.edu
Starting 2013 30 Sep 18:01:14
x = 2.0
x = 1.41421356237
x = 1.189207115
x = 1.09050773267
x = 1.04427378243
Finished 2013 30 Sep 18:01:19Here we see that the task ran on node d6copt096.crc.nd.edu. It took 5 seconds to complete. Running Makeflow with Work QueueThe submission and wait times for the Makeflow tasks in the above case will vary because of the latencies in the underlying batch job submission platform (SGE). To avoid long submission and wait times, Makeflow can be run using Work Queue. Work Queue excels at handling low latency and short turn-around time jobs. Here, we will start Makeflow which will setup a Work Queue master on an arbitrary port using -p 0. makeflow -c
makeflow -T wq -p 0You should see output like this: listening for workers on port 1024.
/bin/echo Hello World > input.txt
python simulation.py 4 < input.txt > D
python simulation.py 3 < input.txt > C
python simulation.py 2 < input.txt > B
python simulation.py 1 < input.txt > A
To start a work_queue_worker for this master, open another terminal window and login into corresponding login node: ssh newcell.crc.nd.eduNote for non-CRC users: Login to cclsubmit00.cse.nd.edu ssh cclsubmit00.cse.nd.eduAdd the CCTools directory to our $PATH as before: setenv PATH "${HOME}/cctools/bin:${PATH}"Now, start a work_queue_worker for this Makeflow by giving it the port the master is listening on. Let us also enable the debugging output using -d all. work_queue_worker -t 10 -d all localhost XXXX
...replacing XXXX with the port the Makeflow master is listening on. When the tasks are finished, the worker should quit due to the 10 second timeout. Note that the Makeflow still executed locally since the work_queue_worker was run on the same node (newcell or cclsubmit00, respectively) as the Makeflow master. The exercise below shows how to start several work_queue_worker processes on the SGE cluster. ExerciseRunning Makeflow with Work Queue Workers on SGE and condorThe goal of this exercise is to setup Work Queue workers on SGE and condor compute nodes. Here we submit the worker tasks using the sge_submit_workers and condor_submit_workers scripts. To know more about the options and arguments for these submission scripts, do: sge_submit_workers -hNote for non-CRC users: Use condor_submit_workers to submit workers to Condor. condor_submit_workers -hIn this exercise, we will also use the catalog server and a project name so workers can find the master without being provided with the master's hostname and port. We specify a project name for the Makeflow script using the -N option that takes a string as an argument. The workers are then provided with the project name to connect using the same -N option. makeflow -c
sge_submit_workers -t 300 -M $USER 2
makeflow -T wq -N $USER Note for non-CRC users: Submit workers to Condor. makeflow -c
condor_submit_workers -t 300 -M $USER 2
makeflow -T wq -N $USERThe $USER will subsitute your username. This means, your username will be used as the project name for your Makeflow. While Makeflow is running, you can look at the current Work Queue projects. $ work_queue_status
PROJECT            NAME                    PORT  WAITING  BUSY COMPLETE WORKERS 
[...]
nd-tutorial        newcell.crc.nd.edu      9001        4     0        0       0work_queue_status is a useful tool for checking on your various Work Queue projects. The output includes various data on the tasks and workers the Master is managing. Here we see there are 4 tasks currently but no workers are connected, yet. |