Work Queue is a framework for building large distributed applications that span thousands of machines drawn from clusters, clouds, and grids. Work Queue applications are written in Python, Perl, or C using a simple API that allows users to define tasks, submit them to the queue, and wait for completion. Tasks are executed by a general worker process that can run on any available machine. Each worker calls home to the manager process, arranges for data transfer, and executes the tasks. Scheduling and resource management features are provided to enable the efficient use of large fleets of multicore servers. The system handles a wide variety of failures, allowing applications to scale dynamically and remain robust.
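
The define, submit, and wait pattern described above can be illustrated with a minimal sketch. This is not the Work Queue API: `MiniQueue`, `Task`, and `run_all` are hypothetical names built only on the Python standard library, and local threads stand in for remote workers. A real application would use the Work Queue library and separate worker processes, which also handle data transfer and failure recovery.

```python
import subprocess
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor
from concurrent.futures import wait as wait_futures


class Task:
    """Hypothetical task: a shell command plus fields filled in on completion."""

    def __init__(self, command):
        self.command = command
        self.output = None
        self.return_status = None


class MiniQueue:
    """Hypothetical in-process stand-in for a manager; threads play the workers."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}  # maps each running future to its task

    def _run(self, task):
        # A real worker would also stage input and output files;
        # this sketch only runs the command and records the result.
        done = subprocess.run(task.command, shell=True,
                              capture_output=True, text=True)
        task.output = done.stdout
        task.return_status = done.returncode
        return task

    def submit(self, task):
        self._pending[self._pool.submit(self._run, task)] = task

    def empty(self):
        return not self._pending

    def wait(self, timeout):
        """Return one finished task, or None if none finished within timeout seconds."""
        done, _ = wait_futures(list(self._pending), timeout=timeout,
                               return_when=FIRST_COMPLETED)
        if not done:
            return None
        future = next(iter(done))
        del self._pending[future]
        return future.result()

    def shutdown(self):
        self._pool.shutdown()


def run_all(commands):
    """Define one task per command, submit them all, then wait for each to finish."""
    queue = MiniQueue()
    for command in commands:
        queue.submit(Task(command))

    results = {}
    while not queue.empty():
        task = queue.wait(5)
        if task is not None:
            results[task.command] = (task.return_status, task.output.strip())
    queue.shutdown()
    return results
```

Calling `run_all(["echo one", "echo two"])` submits both commands before waiting on either, so they may complete in any order. Results are keyed by command rather than by submission order for that reason.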

Who Uses Work Queue?

Work Queue has been used to write applications that scale from a handful of workstations up to tens of thousands of cores running on supercomputers. Examples include the Parsl workflow system, the Coffea analysis framework, the Makeflow workflow engine, SHADHO, Lobster, NanoReactors, ForceBalance, Accelerated Weighted Ensemble, the SAND genome assembler, and the All-Pairs and Wavefront abstractions.

The framework is easy to use and has served as a teaching tool in courses on parallel computing, cloud computing, distributed computing, and cyberinfrastructure at the University of Notre Dame, the University of Arizona, the University of Wisconsin, and many other locations.

Related Publications

  1. Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics
    Ben Tovar, Ben Lyons, Kelci Mohrman, Barry Sly-Delgado, Kevin Lannon, and Douglas Thain
    In IEEE International Parallel and Distributed Processing Symposium, 2022
    doi: 10.1109/IPDPS53621.2022.00041
  2. Not All Tasks Are Created Equal: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows
    Thanh Son Phung, Logan Ward, Kyle Chard, and Douglas Thain
    In WORKS Workshop on Workflows at Supercomputing, 2021
  3. Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications
    Tim Shaffer, Zhuozhao Li, Ben Tovar, Yadu Babuji, TJ Dasso, Zoe Surma, Kyle Chard, Ian Foster, and Douglas Thain
    In IEEE International Parallel and Distributed Processing Symposium, 2021
    doi: 10.1109/IPDPS49936.2021.00088
  4. Harnessing HPC resources for CMS jobs using a Virtual Private Network
    Benjamin Tovar, Brian Bockelman, Michael Hildreth, Kevin Lannon, and Douglas Thain
    In 25th International Conference on Computing in High Energy and Nuclear Physics (CHEP), 2021
    doi: 10.1051/epjconf/202125102032
  5. Autoscaling High Throughput Workloads on Container Orchestrators
    Chao Zheng, Nathaniel Kremer-Herman, Tim Shaffer, and Douglas Thain
    In IEEE Conference on Cluster Computing, 2020
    doi: 10.1109/CLUSTER49012.2020.00024
  6. Dynamic Sizing of Continuously Divisible Jobs for Heterogeneous Resources
    Nick Hazekamp, Ben Tovar, and Douglas Thain
    In IEEE International Conference on e-Science, 2019
    doi: 10.1109/eScience.2019.00026
  7. A Lightweight Model for Right-Sizing Master-Worker Applications
    Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain
    In ACM/IEEE Supercomputing (SC), 2018
    doi: 10.1109/SC.2018.00042
  8. MAKER as a Service: Moving HPC applications to Jetstream Cloud
    Nicholas Hazekamp, Upendra Kumar Devisetty, Nirav Merchant, and Douglas Thain
    In IEEE International Conference on Cloud Engineering, 2018
    doi: 10.1109/IC2E.2018.00029
  9. SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
    Jeffrey Kinnison, Nathaniel Kremer-Herman, Douglas Thain, and Walter Scheirer
    In IEEE Winter Conference on Applications of Computer Vision, 2018
    doi: 10.1109/WACV.2018.00086
  10. A Job Sizing Strategy for High-Throughput Scientific Workflows
    Benjamin Tovar, Rafael Ferreira Silva, Gideon Juve, Ewa Deelman, William Allcock, Douglas Thain, and Miron Livny
    IEEE Transactions on Parallel and Distributed Systems, 2018
    doi: 10.1109/TPDS.2017.2762310
  11. Towards Scalable and Dynamic Social Sensing Using A Distributed Computing Framework
    Daniel (Yue) Zhang, Charles (Chao) Zheng, Dong Wang, Doug Thain, Chao Huang, Xin Mu, and Greg Madey
    In The 37th IEEE International Conference on Distributed Computing Systems (ICDCS 2017), 2017
    doi: 10.1109/ICDCS.2017.196
  12. Designing Self-Tuning Split-Map-Merge Applications for High Cost-Efficiency in the Cloud
    Dinesh Rajan and Douglas Thain
    IEEE Transactions on Cloud Computing, 2017
    doi: 10.1109/TCC.2015.2415780
  13. PRUNE: A Preserving Run Environment for Reproducible Computing
    Peter Ivie and Douglas Thain
    In IEEE Conference on e-Science, 2016
    doi: 10.1109/eScience.2016.7870886
  14. Scaling Up a CMS Tier-3 Site with Campus Resources and a 100Gb/s Network Connection: What Could Go Wrong?
    Matthias Wolf, Anna Woodard, Wenzhao Li, Kenyi Hurtado Anampa, Benjamin Tovar, Paul Brenner, Kevin Lannon, Mike Hildreth, and Douglas Thain
    In International Conference on Computing in High Energy Physics, 2016
    doi: 10.1088/1742-6596/898/8/082041
  15. Scaling Data Intensive Physics Applications to 10k Cores on Non-Dedicated Clusters with Lobster
    Anna Woodard, Matthias Wolf, Charles Mueller, Nil Valls, Ben Tovar, Patrick Donnelly, Peter Ivie, Kenyi Hurtado Anampa, Paul Brenner, Douglas Thain, Kevin Lannon, and Michael Hildreth
    In IEEE Conference on Cluster Computing, 2015
  16. Integrating Containers into Workflows: A Case Study Using Makeflow, Work Queue, and Docker
    Charles (Chao) Zheng and Douglas Thain
    In Workshop on Virtualization Technologies in Distributed Computing (VTDC), 2015
    doi: 10.1145/2755979.2755984
  17. Exploiting Volatile Opportunistic Computing Resources with Lobster
    Anna Woodard, Matthias Wolf, Charles Nicholas Mueller, Ben Tovar, Patrick Donnelly, Kenyi Hurtado Anampa, Paul Brenner, Kevin Lannon, and Michael Hildreth
    In Computing in High Energy Physics, 2015
  18. AWE-WQ: Fast-Forwarding Molecular Dynamics using the Accelerated Weighted Ensemble
    Badi Abdul-Wahid, Haoyun Feng, Dinesh Rajan, Ronan Costaouec, Eric Darve, Douglas Thain, and Jesus A. Izaguirre
    Journal of Chemical Information and Modeling, 2014
    doi: 10.1021/ci500321g
  19. Scaling Up Genome Annotation with MAKER and Work Queue
    Andrew Thrasher, Zachary Musgrave, Brian Kachmark, Douglas Thain, and Scott Emrich
    International Journal of Bioinformatics Research and Applications, 2014
    doi: 10.1504/IJBRA.2014.062994
  20. Accelerating Comparative Genomics Workflows in a Distributed Environment with Optimized Data Partitioning
    Olivia Choudhury, Nicholas L. Hazekamp, Douglas Thain, and Scott Emrich
    In C4BIO Workshop at IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2014
  21. Making Work Queue Cluster-Friendly for Data Intensive Scientific Applications
    Michael Albrecht, Dinesh Rajan, and Douglas Thain
    In IEEE International Conference on Cluster Computing, 2013
    doi: 10.1109/CLUSTER.2013.6702628
  22. Case Studies in Designing Elastic Applications
    Dinesh Rajan, Andrew Thrasher, Badi Abdul-Wahid, Jesus A Izaguirre, Scott Emrich, and Douglas Thain
    In 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2013
    doi: 10.1109/CCGrid.2013.46
  23. A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids
    Christopher Moretti, Andrew Thrasher, Li Yu, Michael Olson, Scott Emrich, and Douglas Thain
    IEEE Transactions on Parallel and Distributed Systems, 2012
    doi: 10.1109/TPDS.2012.80
  24. Folding Proteins at 500 ns/hour with Work Queue
    Badi Abdul-Wahid, Li Yu, Dinesh Rajan, Haoyun Feng, Eric Darve, Douglas Thain, and Jesus A. Izaguirre
    In 8th IEEE International Conference on eScience (eScience 2012), 2012
    doi: 10.1109/eScience.2012.6404429
  25. Shifting the Bioinformatics Computing Paradigm: A Case Study in Parallelizing Genome Annotation Using Maker and Work Queue
    Andrew Thrasher, Zachary Musgrave, Douglas Thain, and Scott Emrich
    In IEEE International Conference on Computational Advances in Bio and Medical Sciences, 2012
  26. Converting a High Performance Application to an Elastic Cloud Application
    Dinesh Rajan, Anthony Canino, Jesus A Izaguirre, and Douglas Thain
    In The 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), 2011
  27. Work Queue + Python: A Framework For Scalable Scientific Ensemble Applications
    Peter Bui, Dinesh Rajan, Badi Abdul-Wahid, Jesus Izaguirre, and Douglas Thain
    In Workshop on Python for High Performance and Scientific Computing (PyHPC) at the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), 2011
  28. Adapting Bioinformatics Applications for Heterogeneous Systems: A Case Study
    Irena Lanc, Peter Bui, Douglas Thain, and Scott Emrich
    In Emerging Computational Methods for the Life Sciences Workshop at ACM HPDC, 2011
    doi: 10.1145/1996023.1996025
  29. Harnessing Parallelism in Multicore Clusters with the All-Pairs, Wavefront, and Makeflow Abstractions
    Li Yu, Christopher Moretti, Andrew Thrasher, Scott Emrich, Kenneth Judd, and Douglas Thain
    Journal of Cluster Computing, 2010
    doi: 10.1007/s10586-010-0134-7
  30. Abstractions for Cloud Computing with Condor
    Douglas Thain and Christopher Moretti
    In Cloud Computing and Software Services: Theory and Techniques, 2010
    isbn: 9781439803158
  31. Weaver: Integrating Distributed Computing Abstractions into Scientific Workflows using Python
    Peter Bui, Li Yu, and Douglas Thain
    In Challenges of Large Applications in Distributed Environments at ACM HPDC 2010, 2010
    doi: 10.1145/1851476.1851570
  32. Scalable Modular Genome Assembly on Campus Grids
    Christopher Moretti, Michael Olson, Scott Emrich, and Douglas Thain
    2009
  33. Harnessing Parallelism in Multicore Clusters with the All-Pairs and Wavefront Abstractions
    Li Yu, Christopher Moretti, Scott Emrich, Kenneth Judd, and Douglas Thain
    In IEEE High Performance Distributed Computing, 2009
    doi: 10.1145/1551609.1551613