Tutorial: Introduction to scalable programming using Makeflow and Work Queue

Tutorial at the University of Notre Dame, October 24, 2012.

Location and Time

UPDATE: Due to high demand, we are offering a repeat session on November 7 in addition to the session on October 24. Both sessions cover the same topics and material. The tutorial will be offered at the following times:
  • 3:00pm to 5:00pm on October 24, 2012 at 303 Cushing Hall
  • 12:00pm to 2:00pm on November 7, 2012 at 303 Cushing Hall

Registration

We have limited openings for this tutorial. Please register here to reserve your spot. There is no registration or attendance fee.

Objective

Distributed systems such as clusters (ND CRC cluster), clouds (Amazon EC2, Google Compute Engine), and grids (Condor) are now cheap and easy to access, which puts a large number of computing resources at our disposal. As a result, large workflows can draw on resources from several such systems at once to gain performance at low cost.

For example, consider protein folding, a critical and challenging problem in biology. Studying it is a large, data-intensive workflow involving hundreds of thousands of simulations. Aggregating and harnessing resources from multiple distributed systems reduced the time required to fully simulate and study the process from 40 years to a few weeks (described here)!

Would you like to learn how you can similarly harness hundreds or thousands of low-cost distributed machines with minimal effort in your research? Or how to write portable programs that will run on different distributed systems including grids, clouds, and any future systems that may come along?

This tutorial will provide an introduction to writing scalable programs that can harness resources from different distributed systems. We will use Makeflow and Work Queue, developed by the Cooperative Computing Lab, to build such programs. These tools are used at Notre Dame and around the world to attack large problems in fields such as chemistry, data mining, economics, biology, physics, and more.

Target Audience

This tutorial is appropriate for new graduate students, research staff, undergraduate researchers, and participants interested in:

  • learning to write programs that run on distributed systems and
  • designing scalable programs that can aggregate and harness resources across multiple distributed systems.

Prerequisites

  • Access to the CRC cluster at Notre Dame. If you don't have an account, you can get one by following the instructions here.
  • Familiarity with Unix.
  • Ability to program in Python, Perl, or C (for the Work Queue portion of the tutorial).

Tutorial Materials

The tutorial will consist of a lecture and hands-on instruction in a computer-equipped classroom. The lecture will give an overview of the Makeflow and Work Queue software. The hands-on instruction will walk through the installation and use of Makeflow and Work Queue to write scalable programs. A set of practice problems will also be provided for you to try after completing the tutorial.

Makeflow

  • Lecture slides
  • Tutorial
  • Practice problems

Work Queue

  • Lecture slides
  • Tutorial
  • Practice problems

Software Resources

  • Software Download
  • Makeflow Web Page
  • Work Queue Web Page
  • Getting Help