The Makeflow Workflow System
Makeflow is a workflow system for executing large,
complex workflows on clusters, clouds, and grids.
Makeflow is easy to use.
The Makeflow language is similar to traditional Make, so if you
can write a Makefile, then you can write a Makeflow.
A workflow can be just a few commands chained together,
or it can be a complex application consisting of thousands
of tasks. It can have an arbitrary DAG structure and
is not limited to specific patterns.
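To make the Make-like rule format and the DAG it implies more concrete, here is a minimal sketch in Python. The embedded workflow text follows the "targets: sources, then a tab-indented command" layout described above; all file and script names (input.txt, split.sh, analyze.sh, and so on) are hypothetical. The tiny parser and the ordering function are illustrations only; they are not Makeflow's own parser or scheduler.

```python
from graphlib import TopologicalSorter

# A small workflow in Make-style syntax. Every file and script name here is
# hypothetical and exists only for illustration.
WORKFLOW = """\
part1.txt part2.txt: input.txt split.sh
\t./split.sh input.txt part1.txt part2.txt

out1.txt: part1.txt analyze.sh
\t./analyze.sh part1.txt > out1.txt

out2.txt: part2.txt analyze.sh
\t./analyze.sh part2.txt > out2.txt

result.txt: out1.txt out2.txt
\tcat out1.txt out2.txt > result.txt
"""


def parse_rules(text):
    """Parse alternating 'targets: sources' and command lines into tuples.

    This toy parser only handles the simple layout used in WORKFLOW above.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    rules = []
    for header, command in zip(lines[0::2], lines[1::2]):
        targets, sources = header.split(":", 1)
        rules.append((targets.split(), sources.split(), command.strip()))
    return rules


def run_order(rules):
    """Return rule indices in an order that respects file dependencies."""
    # Map each produced file to the index of the rule that creates it.
    producer = {t: i for i, (targets, _, _) in enumerate(rules) for t in targets}
    # A rule depends on whichever rules produce its source files.
    graph = {
        i: {producer[s] for s in sources if s in producer}
        for i, (_, sources, _) in enumerate(rules)
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    rules = parse_rules(WORKFLOW)
    for index in run_order(rules):
        targets, _, command = rules[index]
        print(" ".join(targets), "<-", command)
```

In this sketch the two analyze.sh rules depend only on the split step and not on each other, so a workflow engine is free to run them at the same time. That independence is what lets a workflow with an arbitrary DAG shape spread across many machines.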
Makeflow is production-ready.
Makeflow is used on a daily basis to execute complex scientific applications
in fields such as data mining, high energy physics,
image processing, and bioinformatics. It has run on campus clusters,
the Open Science Grid, NSF XSEDE machines, and NCSA Blue Waters.
Real examples of workflows used in production systems are collected in the
Makeflow Examples Repository.
Makeflow is portable. A workflow is written in a technology-neutral
way and can then be deployed to a variety of different
systems without modification, including local execution on
a single multicore machine as well as batch systems like HTCondor, SGE, PBS, Torque, SLURM, or the bundled Work Queue system.
Makeflow can also easily run your jobs in a container environment
like Docker or Singularity on top of an existing batch system.
The same specification works for all systems, so you can easily move
your application from one system to another without rewriting everything.
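As a rough illustration of moving one workflow between systems, the Python sketch below builds the command line for several back ends while leaving the workflow file untouched. It assumes the makeflow command selects its batch system with a -T option and accepts the type names local, condor, slurm, and wq; treat both as assumptions and confirm them against the Makeflow User's Manual for your installed version. The file name example.makeflow is hypothetical.

```python
import shlex

# Batch type names assumed to be accepted by "makeflow -T"; verify these
# against the manual for your installation.
BATCH_TYPES = ("local", "condor", "slurm", "wq")


def makeflow_command(workflow_file, batch_type="local"):
    """Build the argument list for running one workflow on one batch system."""
    if batch_type not in BATCH_TYPES:
        raise ValueError(f"unknown batch type: {batch_type!r}")
    # Only the batch type changes; the workflow file stays the same.
    return ["makeflow", "-T", batch_type, workflow_file]


if __name__ == "__main__":
    for batch_type in BATCH_TYPES:
        # example.makeflow is a hypothetical workflow file name.
        print(shlex.join(makeflow_command("example.makeflow", batch_type)))
```

The point of the sketch is that the workflow file is a constant and the target system is a single parameter, which mirrors the portability claim above.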
Makeflow is powerful. Makeflow can handle workloads of millions
of jobs running on thousands of machines for months at a time.
Makeflow is highly fault tolerant: it can crash or be killed,
and upon resuming, it reconnects to running jobs and continues
where it left off. A variety of analysis tools are available to
understand the performance of your jobs, measure the progress of a workflow,
and visualize what is going on.
Makeflow User's Manual
Makeflow Tutorial Slides
Makeflow Example Repository
Getting Help with Makeflow
Nicholas Hazekamp, Nathaniel Kremer-Herman, Benjamin Tovar, Haiyan Meng, Olivia Choudhury, Scott Emrich, and Douglas Thain,
Combining Static and Dynamic Storage Management for Data Intensive Scientific Workflows,
IEEE Transactions on Parallel and Distributed Systems, 29(2), pages 338-350, February, 2018. DOI: 10.1109/TPDS.2017.2764897
Nicholas Hazekamp, Olivia Choudhury, Sandra Gesing, Scott Emrich, and Douglas Thain,
Poster: Expanding Tasks of Logical Workflows into Independent Workflows for Improved Scalability,
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 548-549, January, 2014. DOI: 10.1109/CCGrid.2014.84
Peter Bui, Li Yu, Andrew Thrasher, Rory Carmichael, Irena Lanc, Patrick Donnelly, and Douglas Thain,
Scripting distributed scientific workflows using Weaver,
Concurrency and Computation: Practice and Experience, 24(15), November, 2011. DOI: 10.1002/cpe.1871
Andrew Thrasher, Rory Carmichael, Peter Bui, Li Yu, Douglas Thain, and Scott Emrich,
Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch,
Workshop on Workflows in Support of Large Scale Science, pages 1-6, November, 2010. DOI: 10.1109/WORKS.2010.5671858
Li Yu, Christopher Moretti, Andrew Thrasher, Scott Emrich, Kenneth Judd, and Douglas Thain,
Harnessing Parallelism in Multicore Clusters with the All-Pairs, Wavefront, and Makeflow Abstractions,
Journal of Cluster Computing, 13(3), pages 243-256, September, 2010. DOI: 10.1007/s10586-010-0134-7