Work Queue is a framework for building large distributed applications that span thousands of machines drawn from clusters, clouds, and grids. Work Queue applications are written in Python, Perl, or C using a simple API that allows users to define tasks, submit them to the queue, and wait for completion. Tasks are executed by a general-purpose worker process that can run on any available machine. Each worker calls home to the manager process, arranges for data transfer, and executes the tasks. Scheduling and resource management features enable the efficient use of large fleets of multicore servers. The system also handles a wide variety of failures, allowing for dynamically scalable and robust applications.
The framework is easy to use and has been used to teach courses in parallel computing, cloud computing, distributed computing, and cyberinfrastructure at the University of Notre Dame, the University of Arizona, the University of Wisconsin, and many other locations.
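The define, submit, and wait pattern described above can be illustrated with a small toy model. This is a minimal sketch using only the Python standard library, not the Work Queue API: `ToyManager`, `Task`, and `run_squares` are hypothetical names, the workers are local threads rather than remote processes, and tasks are plain Python functions rather than commands with data transfer.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Task:
    """A unit of work; result and error are filled in when it completes."""
    func: Callable[..., Any]
    args: tuple = ()
    result: Any = None
    error: Optional[str] = None


class ToyManager:
    """Hypothetical in-process stand-in for a manager (not the Work Queue API)."""

    def __init__(self, workers: int = 2):
        self._pending = queue.Queue()  # tasks waiting for a worker
        self._done = queue.Queue()     # tasks that have finished
        self._outstanding = 0          # submitted but not yet returned by wait()
        for _ in range(workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # Each worker repeatedly takes a task, runs it, and reports back.
        while True:
            task = self._pending.get()
            try:
                task.result = task.func(*task.args)
            except Exception as exc:
                # A failing task is recorded; the worker keeps running.
                task.error = repr(exc)
            self._done.put(task)

    def submit(self, task: Task) -> None:
        self._outstanding += 1
        self._pending.put(task)

    def empty(self) -> bool:
        return self._outstanding == 0

    def wait(self, timeout: float) -> Optional[Task]:
        # Return one completed task, or None if none finishes in time.
        try:
            task = self._done.get(timeout=timeout)
        except queue.Empty:
            return None
        self._outstanding -= 1
        return task


def run_squares(n: int) -> list:
    """Define n tasks, submit them, and wait until all have completed."""
    manager = ToyManager(workers=2)
    for i in range(n):
        manager.submit(Task(pow, (i, 2)))
    results = []
    while not manager.empty():
        task = manager.wait(timeout=5)
        if task is not None and task.error is None:
            results.append(task.result)
    return sorted(results)


if __name__ == "__main__":
    print(run_squares(4))
```

The application code in `run_squares` never addresses a particular worker. It only submits tasks and collects whichever ones finish, which is what lets the pool of workers grow or shrink while the application runs.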
Video Introduction to Work Queue
Related Publications
Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics
Ben Tovar, Ben Lyons, Kelci Mohrman, Barry Sly-Delgado, Kevin Lannon, and Douglas Thain
In IEEE International Parallel and Distributed Processing Symposium, 2022
@inproceedings{topeft-ipdps-2022,author={Tovar, Ben and Lyons, Ben and Mohrman, Kelci and Sly-Delgado, Barry and Lannon, Kevin and Thain, Douglas},title={{Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics}},booktitle={{IEEE International Parallel and Distributed Processing Symposium}},year={2022},note={{doi: 10.1109/IPDPS53621.2022.00041}},cclpaperid={979},keywords={workqueue, hep},}
Not All Tasks Are Created Equal: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows
Thanh Son Phung, Logan Ward, Kyle Chard, and Douglas Thain
In WORKS Workshop on Workflows at Supercomputing, 2021
@inproceedings{tasks-works-2021,author={Phung, Thanh Son and Ward, Logan and Chard, Kyle and Thain, Douglas},title={{Not All Tasks Are Created Equal: Adaptive Resource Allocation for Heterogeneous Tasks in Dynamic Workflows}},booktitle={{WORKS Workshop on Workflows at Supercomputing}},year={2021},cclpaperid={978},keywords={workqueue, resource_monitor},}
Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications
Tim Shaffer, Zhuozhao Li, Ben Tovar, Yadu Babuji, TJ Dasso, Zoe Surma, Kyle Chard, Ian Foster, and Douglas Thain
In IEEE International Parallel and Distributed Processing Symposium, 2021
@inproceedings{lfm-ipdps-2021,author={Shaffer, Tim and Li, Zhuozhao and Tovar, Ben and Babuji, Yadu and Dasso, TJ and Surma, Zoe and Chard, Kyle and Foster, Ian and Thain, Douglas},title={{Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications}},booktitle={{IEEE International Parallel and Distributed Processing Symposium}},year={2021},note={{doi: 10.1109/IPDPS49936.2021.00088}},cclpaperid={968},keywords={workqueue, resource_monitor},}
Harnessing HPC resources for CMS jobs using a Virtual Private Network
Benjamin Tovar, Brian Bockelman, Michael Hildreth, Kevin Lannon, and Douglas Thain
In 25th International Conference on Computing in High Energy and Nuclear Physics (CHEP), 2021
@inproceedings{vpn-chep-2021,author={Tovar, Benjamin and Bockelman, Brian and Hildreth, Michael and Lannon, Kevin and Thain, Douglas},title={{Harnessing HPC resources for CMS jobs using a Virtual Private Network}},booktitle={{25th International Conference on Computing in High Energy and Nuclear Physics (CHEP)}},year={2021},note={{doi: 10.1051/epjconf/202125102032}},cclpaperid={973},keywords={workqueue},}
Autoscaling High Throughput Workloads on Container Orchestrators
Chao Zheng, Nathaniel Kremer-Herman, Tim Shaffer, and Douglas Thain
In IEEE Conference on Cluster Computing, 2020
@inproceedings{autoscaling-cluster-2020,author={Zheng, Chao and Kremer-Herman, Nathaniel and Shaffer, Tim and Thain, Douglas},title={{Autoscaling High Throughput Workloads on Container Orchestrators}},booktitle={{IEEE Conference on Cluster Computing}},pages={1-10},year={2020},note={{doi: 10.1109/CLUSTER49012.2020.00024}},cclpaperid={967},keywords={workqueue},}
Dynamic Sizing of Continuously Divisible Jobs for Heterogeneous Resources
Nick Hazekamp, Ben Tovar, and Douglas Thain
In IEEE International Conference on e-Science, 2019
@inproceedings{sizing-escience-2019,author={Hazekamp, Nick and Tovar, Ben and Thain, Douglas},title={{Dynamic Sizing of Continuously Divisible Jobs for Heterogeneous Resources}},booktitle={{IEEE International Conference on e-Science}},year={2019},note={{doi: 10.1109/eScience.2019.00026}},cclpaperid={962},keywords={workqueue},}
A Lightweight Model for Right-Sizing Master-Worker Applications
Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain
In ACM/IEEE Supercomputing (SC), 2018
@inproceedings{capacity-sc18,author={Kremer-Herman, Nathaniel and Tovar, Benjamin and Thain, Douglas},title={{A Lightweight Model for Right-Sizing Master-Worker Applications}},booktitle={{ACM/IEEE Supercomputing (SC)}},year={2018},note={{doi: 10.1109/SC.2018.00042}},cclpaperid={955},keywords={workqueue},}
MAKER as a Service: Moving HPC applications to Jetstream Cloud
Nicholas Hazekamp, Upendra Kumar Devisetty, Nirav Merchant, and Douglas Thain
In IEEE International Conference on Cloud Engineering, 2018
@inproceedings{maker-service-ic2e2018,author={Hazekamp, Nicholas and Devisetty, Upendra Kumar and Merchant, Nirav and Thain, Douglas},title={{MAKER as a Service: Moving HPC applications to Jetstream Cloud}},booktitle={{IEEE International Conference on Cloud Engineering}},pages={6},year={2018},note={{doi: 10.1109/IC2E.2018.00029}},cclpaperid={946},keywords={workqueue},}
SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Jeffrey Kinnison, Nathaniel Kremer-Herman, Douglas Thain, and Walter Scheirer
In IEEE Winter Conference on Applications of Computer Vision, 2018
@inproceedings{shadho-wacv-2018,author={Kinnison, Jeffrey and Kremer-Herman, Nathaniel and Thain, Douglas and Scheirer, Walter},title={{SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization}},booktitle={{IEEE Winter Conference on Applications of Computer Vision}},pages={1-10},year={2018},note={{doi: 10.1109/WACV.2018.00086}},cclpaperid={947},keywords={workqueue},}
A Job Sizing Strategy for High-Throughput Scientific Workflows
Benjamin Tovar, Rafael Ferreira da Silva, Gideon Juve, Ewa Deelman, William Allcock, Douglas Thain, and Miron Livny
IEEE Transactions on Parallel and Distributed Systems, 2018
@article{tovar-tpds-2017,author={Tovar, Benjamin and da Silva, Rafael Ferreira and Juve, Gideon and Deelman, Ewa and Allcock, William and Thain, Douglas and Livny, Miron},title={{A Job Sizing Strategy for High-Throughput Scientific Workflows}},journal={{IEEE Transactions on Parallel and Distributed Systems}},volume={29},number={2},pages={240-253},year={2018},note={{doi: 10.1109/TPDS.2017.2762310}},cclpaperid={941},keywords={workqueue, resource_monitor},}
Towards Scalable and Dynamic Social Sensing Using A Distributed Computing Framework
Daniel (Yue) Zhang, Charles (Chao) Zheng, Dong Wang, Doug Thain, Chao Huang, Xin Mu, and Greg Madey
In The 37th IEEE International Conference on Distributed Computing Systems (ICDCS 2017), 2017
@inproceedings{social-icdcs-2017,author={Zhang, Daniel (Yue) and Zheng, Charles (Chao) and Wang, Dong and Thain, Doug and Huang, Chao and Mu, Xin and Madey, Greg},title={{Towards Scalable and Dynamic Social Sensing Using A Distributed Computing Framework}},booktitle={{The 37th IEEE International Conference on Distributed Computing Systems (ICDCS 2017)}},year={2017},note={{doi: 10.1109/ICDCS.2017.196}},cclpaperid={938},keywords={workqueue},}
Designing Self-Tuning Split-Map-Merge Applications for High Cost-Efficiency in the Cloud
Dinesh Rajan and Douglas Thain
IEEE Transactions on Cloud Computing, 2017
@article{tuning-tcc-2015,author={Rajan, Dinesh and Thain, Douglas},title={{Designing Self-Tuning Split-Map-Merge Applications for High Cost-Efficiency in the Cloud}},journal={{IEEE Transactions on Cloud Computing}},volume={5},number={2},pages={303-316},year={2017},note={{doi: 10.1109/TCC.2015.2415780}},cclpaperid={909},keywords={makeflow, workqueue, hecura},}
PRUNE: A Preserving Run Environment for Reproducible Computing
Peter Ivie and Douglas Thain
In IEEE Conference on e-Science, 2016
@inproceedings{prune-escience-2016,author={Ivie, Peter and Thain, Douglas},title={{PRUNE: A Preserving Run Environment for Reproducible Computing}},booktitle={{IEEE Conference on e-Science}},year={2016},note={{doi: 10.1109/eScience.2016.7870886}},cclpaperid={930},keywords={workqueue, prune, daspos},}
Scaling Up a CMS Tier-3 Site with Campus Resources and a 100Gb/s Network Connection: What Could Go Wrong?
Matthias Wolf, Anna Woodard, Wenzhao Li, Kenyi Hurtado Anampa, Benjamin Tovar, Paul Brenner, Kevin Lannon, Mike Hildreth, and Douglas Thain
In International Conference on Computing in High Energy Physics, 2016
@inproceedings{scaling-chep2016,author={Wolf, Matthias and Woodard, Anna and Li, Wenzhao and Anampa, Kenyi Hurtado and Tovar, Benjamin and Brenner, Paul and Lannon, Kevin and Hildreth, Mike and Thain, Douglas},title={{Scaling Up a CMS Tier-3 Site with Campus Resources and a 100Gb/s Network Connection: What Could Go Wrong?}},booktitle={{International Conference on Computing in High Energy Physics}},year={2016},note={{doi: 10.1088/1742-6596/898/8/082041}},cclpaperid={935},keywords={workqueue},}
Scaling Data Intensive Physics Applications to 10k Cores on Non-Dedicated Clusters with Lobster
Anna Woodard, Matthias Wolf, Charles Mueller, Nil Valls, Ben Tovar, Patrick Donnelly, Peter Ivie, Kenyi Hurtado Anampa, Paul Brenner, Douglas Thain, Kevin Lannon, and Michael Hildreth
In IEEE Conference on Cluster Computing, 2015
@inproceedings{lobster-cluster-2015,author={Woodard, Anna and Wolf, Matthias and Mueller, Charles and Valls, Nil and Tovar, Ben and Donnelly, Patrick and Ivie, Peter and Anampa, Kenyi Hurtado and Brenner, Paul and Thain, Douglas and Lannon, Kevin and Hildreth, Michael},title={{Scaling Data Intensive Physics Applications to 10k Cores on Non-Dedicated Clusters with Lobster}},booktitle={{IEEE Conference on Cluster Computing}},year={2015},cclpaperid={915},keywords={workqueue, hep},}
Integrating Containers into Workflows: A Case Study Using Makeflow, Work Queue, and Docker
Charles (Chao) Zheng and Douglas Thain
In Workshop on Virtualization Technologies in Distributed Computing (VTDC), 2015
@inproceedings{wq-docker-vtdc15,author={Zheng, Charles (Chao) and Thain, Douglas},title={{Integrating Containers into Workflows: A Case Study Using Makeflow, Work Queue, and Docker}},booktitle={{Workshop on Virtualization Technologies in Distributed Computing (VTDC)}},year={2015},note={{doi: 10.1145/2755979.2755984}},cclpaperid={910},keywords={makeflow, workqueue},}
Exploiting Volatile Opportunistic Computing Resources with Lobster
Anna Woodard, Matthias Wolf, Charles Nicholas Mueller, Ben Tovar, Patrick Donnelly, Kenyi Hurtado Anampa, Paul Brenner, Kevin Lannon, and Michael Hildreth
In Computing in High Energy Physics, 2015
@inproceedings{lobster-poster-chep-2015,author={Woodard, Anna and Wolf, Matthias and Mueller, Charles Nicholas and Tovar, Ben and Donnelly, Patrick and Anampa, Kenyi Hurtado and Brenner, Paul and Lannon, Kevin and Hildreth, Michael},title={{Exploiting Volatile Opportunistic Computing Resources with Lobster}},booktitle={{Computing in High Energy Physics}},year={2015},cclpaperid={918},keywords={workqueue},}
AWE-WQ: Fast-Forwarding Molecular Dynamics using the Accelerated Weighted Ensemble
Badi Abdul-Wahid, Haoyun Feng, Dinesh Rajan, Ronan Costaouec, Eric Darve, Douglas Thain, and Jesus A. Izaguirre
Journal of Chemical Information and Modeling, 2014
@article{awe-jcim-2014,author={Abdul-Wahid, Badi and Feng, Haoyun and Rajan, Dinesh and Costaouec, Ronan and Darve, Eric and Thain, Douglas and Izaguirre, Jesus A.},title={{AWE-WQ: Fast-Forwarding Molecular Dynamics using the Accelerated Weighted Ensemble}},journal={{Journal of Chemical Information and Modeling}},volume={54},number={10},pages={3033-3043},year={2014},note={{doi: 10.1021/ci500321g}},cclpaperid={907},keywords={workqueue, awe},}
Scaling Up Genome Annotation with MAKER and Work Queue
Andrew Thrasher, Zachary Musgrave, Brian Kachmark, Douglas Thain, and Scott Emrich
International Journal of Bioinformatics Research and Applications, 2014
@article{maker-wq-ijbra,author={Thrasher, Andrew and Musgrave, Zachary and Kachmark, Brian and Thain, Douglas and Emrich, Scott},title={{Scaling Up Genome Annotation with MAKER and Work Queue}},journal={{International Journal of Bioinformatics Research and Applications}},volume={10},number={4-5},pages={447-460},year={2014},note={{doi: 10.1504/IJBRA.2014.062994}},cclpaperid={904},keywords={workqueue},}
Accelerating Comparative Genomics Workflows in a Distributed Environment with Optimized Data Partitioning
Olivia Choudhury, Nicholas L. Hazekamp, Douglas Thain, and Scott Emrich
In C4BIO Workshop at IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2014
@inproceedings{bio-partition-c4bio-grid14,author={Choudhury, Olivia and Hazekamp, Nicholas L. and Thain, Douglas and Emrich, Scott},title={{Accelerating Comparative Genomics Workflows in a Distributed Environment with Optimized Data Partitioning}},booktitle={{C4BIO Workshop at IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)}},year={2014},cclpaperid={903},keywords={makeflow, workqueue},}
Making Work Queue Cluster-Friendly for Data Intensive Scientific Applications
Michael Albrecht, Dinesh Rajan, and Douglas Thain
In IEEE International Conference on Cluster Computing, 2013
@inproceedings{wqh-cluster13,author={Albrecht, Michael and Rajan, Dinesh and Thain, Douglas},title={{Making Work Queue Cluster-Friendly for Data Intensive Scientific Applications}},booktitle={{IEEE International Conference on Cluster Computing}},year={2013},note={{doi: 10.1109/CLUSTER.2013.6702628}},cclpaperid={898},keywords={workqueue, awe},}
Case Studies in Designing Elastic Applications
Dinesh Rajan, Andrew Thrasher, Badi Abdul-Wahid, Jesus A Izaguirre, Scott Emrich, and Douglas Thain
In 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2013
@inproceedings{casestudies-ccgrid13,author={Rajan, Dinesh and Thrasher, Andrew and Abdul-Wahid, Badi and Izaguirre, Jesus A and Emrich, Scott and Thain, Douglas},title={{Case Studies in Designing Elastic Applications}},booktitle={{13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)}},year={2013},note={{doi: 10.1109/CCGrid.2013.46}},cclpaperid={893},keywords={workqueue},}
A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids
Christopher Moretti, Andrew Thrasher, Li Yu, Michael Olson, Scott Emrich, and Douglas Thain
IEEE Transactions on Parallel and Distributed Systems, 2012
@article{assembly-tpds,author={Moretti, Christopher and Thrasher, Andrew and Yu, Li and Olson, Michael and Emrich, Scott and Thain, Douglas},title={{A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids}},journal={{IEEE Transactions on Parallel and Distributed Systems}},volume={23},number={12},year={2012},note={{doi: 10.1109/TPDS.2012.80}},cclpaperid={100},keywords={workqueue, sand},}
Folding Proteins at 500 ns/hour with Work Queue
Badi Abdul-Wahid, Li Yu, Dinesh Rajan, Haoyun Feng, Eric Darve, Douglas Thain, and Jesus A. Izaguirre
In 8th IEEE International Conference on eScience (eScience 2012), 2012
@inproceedings{folding-escience12,author={Abdul-Wahid, Badi and Yu, Li and Rajan, Dinesh and Feng, Haoyun and Darve, Eric and Thain, Douglas and Izaguirre, Jesus A.},title={{Folding Proteins at 500 ns/hour with Work Queue}},booktitle={{8th IEEE International Conference on eScience (eScience 2012)}},year={2012},note={{doi: 10.1109/eScience.2012.6404429}},cclpaperid={891},keywords={workqueue, awe},}
Shifting the Bioinformatics Computing Paradigm: A Case Study in Parallelizing Genome Annotation Using Maker and Work Queue
Andrew Thrasher, Zachary Musgrave, Douglas Thain, and Scott Emrich
In IEEE International Conference on Computational Advances in Bio and Medical Sciences, 2012
@inproceedings{maker-iccabs12,author={Thrasher, Andrew and Musgrave, Zachary and Thain, Douglas and Emrich, Scott},title={{Shifting the Bioinformatics Computing Paradigm: A Case Study in Parallelizing Genome Annotation Using Maker and Work Queue}},booktitle={{IEEE International Conference on Computational Advances in Bio and Medical Sciences}},year={2012},cclpaperid={102},keywords={workqueue},}
Converting a High Performance Application to an Elastic Cloud Application
Dinesh Rajan, Anthony Canino, Jesus A Izaguirre, and Douglas Thain
In The 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), 2011
@inproceedings{elasticrepex-cloudcom11,author={Rajan, Dinesh and Canino, Anthony and Izaguirre, Jesus A and Thain, Douglas},title={{Converting a High Performance Application to an Elastic Cloud Application}},booktitle={{The 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011)}},year={2011},cclpaperid={93},keywords={workqueue},}
Work Queue + Python: A Framework For Scalable Scientific Ensemble Applications
Peter Bui, Dinesh Rajan, Badi Abdul-Wahid, Jesus Izaguirre, and Douglas Thain
In Workshop on Python for High Performance and Scientific Computing (PyHPC) at the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing), 2011
@inproceedings{wq-python-pyhpc2011,author={Bui, Peter and Rajan, Dinesh and Abdul-Wahid, Badi and Izaguirre, Jesus and Thain, Douglas},title={{Work Queue + Python: A Framework For Scalable Scientific Ensemble Applications}},booktitle={{Workshop on Python for High Performance and Scientific Computing (PyHPC) at the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing)}},year={2011},cclpaperid={95},keywords={workqueue, awe},}
Adapting Bioinformatics Applications for Heterogeneous Systems: A Case Study
Irena Lanc, Peter Bui, Douglas Thain, and Scott Emrich
In Emerging Computational Methods for the Life Sciences Workshop at ACM HPDC, 2011
@inproceedings{adapting-ecmls11,author={Lanc, Irena and Bui, Peter and Thain, Douglas and Emrich, Scott},title={{Adapting Bioinformatics Applications for Heterogeneous Systems: A Case Study}},booktitle={{Emerging Computational Methods for the Life Sciences Workshop at ACM HPDC}},pages={7-13},year={2011},note={{doi: 10.1145/1996023.1996025}},cclpaperid={94},keywords={workqueue},}
Harnessing Parallelism in Multicore Clusters with the All-Pairs, Wavefront, and Makeflow Abstractions
Li Yu, Christopher Moretti, Andrew Thrasher, Scott Emrich, Kenneth Judd, and Douglas Thain
Journal of Cluster Computing, 2010
@article{abstr-jcc,author={Yu, Li and Moretti, Christopher and Thrasher, Andrew and Emrich, Scott and Judd, Kenneth and Thain, Douglas},title={{Harnessing Parallelism in Multicore Clusters with the All-Pairs, Wavefront, and Makeflow Abstractions}},journal={{Journal of Cluster Computing}},volume={13},number={3},pages={243-256},year={2010},note={{doi: 10.1007/s10586-010-0134-7}},cclpaperid={83},keywords={makeflow, workqueue, allpairs, wavefront, hecura},}
Abstractions for Cloud Computing with Condor
Douglas Thain and Christopher Moretti
In Cloud Computing and Software Services: Theory and Techniques, 2010
@incollection{abstr-cloudbook,author={Thain, Douglas and Moretti, Christopher},title={{Abstractions for Cloud Computing with Condor}},editor={Ahson, Syed and Ilyas, Mohammad},booktitle={{Cloud Computing and Software Services: Theory and Techniques}},pages={153-171},publisher={CRC Press},year={2010},note={{isbn: 9781439803158}},cclpaperid={78},keywords={workqueue, wavefront, hecura},}
Weaver: Integrating Distributed Computing Abstractions into Scientific Workflows using Python
Peter Bui, Li Yu, and Douglas Thain
In Challenges of Large Applications in Distributed Environments at ACM HPDC 2010, 2010
@inproceedings{weaver-clade10,author={Bui, Peter and Yu, Li and Thain, Douglas},title={{Weaver: Integrating Distributed Computing Abstractions into Scientific Workflows using Python}},booktitle={{Challenges of Large Applications in Distributed Environments at ACM HPDC 2010}},year={2010},note={{doi: 10.1145/1851476.1851570}},cclpaperid={86},keywords={workqueue, hecura},}
Scalable Modular Genome Assembly on Campus Grids
Christopher Moretti, Michael Olson, Scott Emrich, and Douglas Thain
Technical Report 2009-04, University of Notre Dame, Computer Science and Engineering Department, 2009
@techreport{assembly-tr,author={Moretti, Christopher and Olson, Michael and Emrich, Scott and Thain, Douglas},title={{Scalable Modular Genome Assembly on Campus Grids}},institution={{University of Notre Dame, Computer Science and Engineering Department}},number={2009-04},year={2009},cclpaperid={77},keywords={workqueue, sand},}
Harnessing Parallelism in Multicore Clusters with the All-Pairs and Wavefront Abstractions
Li Yu, Christopher Moretti, Scott Emrich, Kenneth Judd, and Douglas Thain
In IEEE High Performance Distributed Computing, 2009
@inproceedings{abstr-hpdc09,author={Yu, Li and Moretti, Christopher and Emrich, Scott and Judd, Kenneth and Thain, Douglas},title={{Harnessing Parallelism in Multicore Clusters with the All-Pairs and Wavefront Abstractions}},booktitle={{IEEE High Performance Distributed Computing}},pages={1-10},year={2009},note={{doi: 10.1145/1551609.1551613}},cclpaperid={5},keywords={workqueue, allpairs, wavefront, hecura},}