CCL Home Software Community Operations |
Applications DemonstrationDuring this demo, we will take you through some distributed computation examples using Makeflow and Work Queue.
IntroductionTo demonstrate a scalable scientific application written using Makeflow and Work Queue we are going to use a bioinformatics application BWA and replica exchange, a proteomics application. The BWA or the Burrows-Wheeler Alignment algorithm compares and matches a set of test sequences against reference databases of sequences. This algorithm is implemented in Makeflow and can be found here. The replica exchange algorithm is used to sample the energy states of the protein molecule using simulations. The Python script for the replica exchange application is included in the CCTools source that can be downloaded here. GoalsIn this demonstration of the scalable scientific applications implemented usign Makeflow and Work Queue, we intend to show the following:
Demo: BWA - Makeflow-based Bioinformatics ApplicationWe have a BWA Makeflow running using Work Queue under the project name bwa-xsede on a remote resource in the Lonestar cluster. We are now going to submit workers for this BWA application running under the project name bwa-xsede. First, be sure to be logged in to the head node of XSEDE's Lonestar where you will submit workers for the BWA application. ssh USERNAME@login1.ls4.tacc.utexas.edu
Please replace USERNAME with the username in the sheet of paper handed out to you. Use the password given there as well.
NOTE: Please feel free to use your own XSEDE account if you have one. SetupMake sure that your CCTools directory is added to your $PATH: export PATH=~/cctools/bin:${PATH}
Check to see if the setup was correctly configured: $ work_queue_worker -v
work_queue_worker version 4.0.0rc1-TRUNK (released 07/16/2013)
Built by dpandiar@login2.ls4.tacc.utexas.edu on Jul 16 2013 at 16:30:12
System: Linux login2.ls4.tacc.utexas.edu 2.6.18-194.32.1.el5_TACC #2 SMP
Configuration: --prefix /home1/02486/dpandiar/cctools
Submit Multi-slot Work Queue Workers using SGEWe are going to submit multi-slot workers to the Lonestar SGE that utilizes 4 cores on its allocated node. To do this, we will use the sge_submit_workers script installed in the CCTools directory. sge_submit_workers creates the worker script to be executed and submit this script to SGE for the specified number of workers using qsub: sge_submit_workers -p "-q development -l h_rt=01:00:00 -V -pe 12way 12" --cores 0 -M bwa-xsede 15
The above command submits 15 workers for execution on Lonestar's cluster. Each submitted worker when executed will contact the Catalog Server to determine the location of the master with the project name bwa-xsede. They will then attempt a connection to the master at the location resolved by the Catalog Server. On connection, the master will then start dispatching tasks to the connected workers. To monitor the status of your submitted workers jobs, you can watch their status SGE's qstat interface: watch "qstat -u $USER"
You can also monitor the run-time status of the BWA application in terms of the number of tasks completed, number of workers connected, etc., using the work_queue_status tool installed in your CCTools directory. work_queue_status
You will see an output similar to the one below: $ work_queue_status
PROJECT HOST PORT WAITING BUSY COMPLETE WORKERS
4206.blast.biocomp cclweb03.cse.nd.edu 9001 0 1 0 4
bwa-xsede d6copt007.crc.nd.edu 1024 0 1 0 15
one_two_many_lots d6copt129.crc.nd.edu 9000 399 600 3028 600
nr_blast helix.cse.nd.edu 9694 970 30 16 30
forcebalance not0rious.Stanford.EDU 9238 0 54 216 54
numbtest opteron.crc.nd.edu 9123 0 0 1 0
cclosdci workspace.crc.nd.edu 9907 0 0 41 1
You can also monitor the status of just the BWA application as follows: $ work_queue_status | grep bwa-xsede
bwa-xsede d6copt007.crc.nd.edu 1024 0 1 0 15
Demo: Replica Exchange - A Work Queue-based Proteomics ApplicationWe have a replica exchange running using Work Queue with a foreman under the project name replica-foreman on a remote resource in the Lonestar cluster. Submitting workers to this forman is similar to above, except the project name is replica-foreman. Submit Work Queue Workers to a ForemanFirst, remove any prior jobs submitted to SGE so we start with a clean slate. qdel -u $USER
Here, we are going to submit workers to the Lonestar SGE that will connect to a foreman we have set up. Again, we will use the sge_submit_workers script installed in the CCTools directory: qsub: sge_submit_workers -p "-q development -l h_rt=01:00:00 -V -pe 12way 12" --cores 0 -M replica-foreman 15
The above command submits 15 workers for execution on Lonestar's cluster. Each of these worker will contact the foreman identified as replica-foreman. This foreman will then communicate with the project master. If we had workers in other clusters, we can have a foreman for each cluster, thus reducing the network traffic flowing through wide area networks. As before, you can monitor the status of the application by using the work_queue_status tool installed in your CCTools directory. work_queue_status
|