Tutorial: Introduction to scalable programming using Makeflow and Work Queue

Tutorial at the University of Notre Dame, October 24, 2012.

Location and Time

UPDATE: Due to high demand, we are offering a repeat session on November 7 in addition to the session on October 24. Both sessions will cover the same topics and material. The tutorial will be offered at the following times:
3:00pm to 5:00pm on October 24, 2012 at 303 Cushing Hall

12:00pm to 2:00pm on November 7, 2012 at 303 Cushing Hall

Registration

We have limited openings for this tutorial. Please register here to reserve your spot. There is no registration or attendance fee.

Objective

Distributed systems such as clusters (the ND CRC cluster), clouds (Amazon EC2, Google Compute Engine), and grids (Condor) make a large number of computing resources easily and cheaply accessible. As a result, large workflows can run across multiple resources from these systems to gain performance at low cost.

For example, consider the problem of protein folding, a critical and challenging problem in biology. Studying it requires a large, data-intensive workflow involving hundreds of thousands of simulations. By aggregating and harnessing resources from multiple distributed systems, the time required to fully simulate and study the process was reduced from 40 years to a few weeks (described here)!

Would you like to learn how you can similarly harness hundreds or thousands of low-cost distributed machines with minimal effort in your research? Or how to write portable programs that will run on different distributed systems including grids, clouds, and any future systems that may come along?

This tutorial will provide an introduction to writing scalable programs that can harness resources from different distributed systems. We will use Makeflow and Work Queue, developed by the Cooperative Computing Lab, to build such programs. These tools are used at Notre Dame and around the world to attack large problems in fields such as chemistry, data mining, economics, biology, physics, and more.

Target Audience

This tutorial is appropriate for new graduate students, research staff, undergraduate researchers, and participants interested in:

learning to write programs that run on distributed systems and

designing scalable programs that can aggregate and harness resources across multiple distributed systems.

Prerequisites

Access to the CRC cluster at Notre Dame. If you don't have an account, you can get one by following the instructions here.

Familiarity with Unix.

Ability to program in Python, Perl, or C (for the Work Queue portion of the tutorial).

Tutorial Materials

The tutorial will consist of a lecture and hands-on instruction in a computer-equipped classroom. The lecture will give an overview of the Makeflow and Work Queue software. The hands-on instruction will walk through the installation and use of Makeflow and Work Queue to write scalable programs. A set of practice problems will also be provided for participants to try after completing the tutorial.

Makeflow

Lecture slides

Tutorial

Practice problems

Work Queue

Lecture slides

Tutorial

Practice problems

Software Resources

Additional Reading

Christopher Moretti, Andrew Thrasher, Li Yu, Michael Olson, Scott Emrich, and Douglas Thain,
A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids,
IEEE Transactions on Parallel and Distributed Systems, 23(12), December, 2012. DOI: 10.1109/TPDS.2012.80

Badi Abdul-Wahid, Li Yu, Dinesh Rajan, Haoyun Feng, Eric Darve, Douglas Thain, Jesus A. Izaguirre,
Folding Proteins at 500 ns/hour with Work Queue,
8th IEEE International Conference on eScience (eScience 2012), October, 2012. DOI: 10.1109/eScience.2012.6404429

Michael Albrecht, Patrick Donnelly, Peter Bui, and Douglas Thain,
Makeflow: A Portable Abstraction for Data Intensive Computing on Clusters, Clouds, and Grids,
Workshop on Scalable Workflow Enactment Engines and Technologies (SWEET) at ACM SIGMOD, May, 2012. DOI: 10.1145/2443416.2443417

Dinesh Rajan, Anthony Canino, Jesus A. Izaguirre, and Douglas Thain,
Converting a High Performance Application to an Elastic Cloud Application,
The 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), November, 2011.

Peter Bui, Dinesh Rajan, Badi Abdul-Wahid, Jesus Izaguirre, Douglas Thain,
Work Queue + Python: A Framework For Scalable Scientific Ensemble Applications,
Workshop on Python for High Performance and Scientific Computing (PyHPC) at the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), November, 2011.

Andrew Thrasher, Rory Carmichael, Peter Bui, Li Yu, Douglas Thain, and Scott Emrich,
Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch,
Workshop on Workflows in Support of Large Scale Science, pages 1-6, November, 2010. DOI: 10.1109/WORKS.2010.5671858