ExM: system support for extreme-scale, many-task applications

Ian Foster (PI), Ewing Lusk (PI), Ketan Maheshwari, Todd Munson, Michael Wilde (Lead PI), Argonne
Tim Armstrong, Daniel S. Katz (PI), Justin Wozniak, Zhao Zhang, University of Chicago
Sameer Al-Kiswany, Matei Ripeanu (PI), Emalayan Vairavanathan, University of British Columbia

Problem: identify & scale up many-task applications

Exascale computers will enable and demand new problem-solving methods that involve many concurrent, interacting tasks. Methodologies such as rational design, uncertainty quantification, parameter estimation, and inverse modeling all have this "many-task" property, and all will frequently have aggregate computing needs that require exascale computers. For example, proposed next-generation climate model ensemble studies involve 1,000 or more runs, each requiring 10K cores for a week, to characterize model sensitivity to initial-condition and parameter uncertainty. Running many-task applications efficiently, reliably, and easily on extreme-scale computers is challenging: system software designed for today's mainstream single program, multiple data (SPMD) computations is not necessarily a good match to the demands of many-task applications.
Goals

Conduct computer science research to achieve the technical advances required to execute many-task applications efficiently, reliably, and easily on petascale and exascale facilities, and create middleware that enables new problem-solving methods and application classes on these extreme-scale systems.

Impact

The ExM project will produce advances in computer science and usable middleware that enables the efficient and reliable use of exascale computers for new classes of applications. The project will both accelerate access to exascale computers by important existing applications and facilitate the broader use of large-scale parallel computing by new application communities for which it is currently out of reach. The project will also train students and postdocs in the development and use of innovative approaches for extreme-scale computing.

Approach

To address these demands, the ExM project will design, develop, apply, and evaluate two new system software components. The ExM data store will allow concurrent and asynchronous application tasks to communicate efficiently and reliably, both with each other and with persistent storage, by reading and writing data objects maintained in node-local storage, including memory, SSD, and local disk. The ExM parallel evaluator will perform rapid, data-aware, and efficient dispatch of billions of small tasks to exascale computing systems and the fault-tolerant execution of those tasks. These components will be efficiently integrated with current and future extreme-scale system software and made available via parallel scripting languages and APIs.

[Figure: ExM architecture. A many-task application drives an ultra-fast task distribution layer; a graph executor and virtual data store run on each compute node, backed by global persistent storage.]
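This execution model can be sketched in classic Swift, the parallel scripting language the project builds on. In the minimal sketch below, the executables (sim, summarize) and file names are illustrative assumptions, not project code; each app call becomes one task, and tasks communicate only through typed data objects that the ExM data store would hold in node-local storage.

    type file;

    // Each call to simulate() is one small task; ExM's evaluator
    // dispatches such tasks and its data store holds the outputs.
    app (file o) simulate (int i) {
      sim "-n" i stdout=@o;
    }

    // Runs only after every element of results[] has been produced.
    app (file o) summarize (file results[]) {
      summarize @filenames(results) stdout=@o;
    }

    file results[] <simple_mapper; prefix="run", suffix=".out">;

    foreach i in [0:999999] {
      results[i] = simulate(i);   // ~10^6 independent tasks
    }

    file summary <"summary.txt">;
    summary = summarize(results);

Because the elements of results[] are single-assignment data objects, the dependency of summarize() on the million simulate() tasks is expressed implicitly, which is what allows the runtime to schedule tasks in a data-aware way.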
Accomplishments

The Jets prototype confirms the feasibility of the many-parallel-task (MPI) programming model. The Turbine prototype, built on ADLB, showed encouraging scalability and suggests that the exascale goals are feasible. The AME anyscale many-task engine and store measured BG/P scaling and data exchange to the 16K-core level. MosaStore on Blue Gene/P and other clusters is creating a model of the virtual data store. ExM tools are being evaluated on three science applications (earthquake simulation, image processing, protein/RNA interaction). Four publications to date (at www.mcs.anl.gov/exm).

Next milestones (Oct 2011 – Sep 2012)

Extend the ExM task manager (Turbine) and its intermediate representation to run Swift and PySwift programs on BG/P and Cray XE. Model fault recovery. Integrate the MosaStore and AME data stores into Turbine to provide support for scalable collective data management. Explore HDF5 or NetCDF integration. Evaluate the performance and usability of this integration on DOE and INCITE applications: ParVis climate model analysis; SCEC earthquake simulation; SWAT biofuels land-use impact; power grid modeling; protein structure and interaction prediction; subsurface impact modeling.
ExM distributed task management targets high thread count and utilization, low latency, and resiliency in the face of failing components and interconnects. ExM complex-wide data storage, based on MosaStore, is embedded and distributed across nodes and RAM storage to provide a global namespace and fast data exchange.

[Figure: distributed task management. A task graph partitioner splits the application task graph into task subgraphs, each handled by a per-node graph executor and task queue executor.]
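The task graph that the partitioner operates on arises implicitly from dataflow among data objects. A hedged two-stage Swift sketch (transform and merge are illustrative executables, not ExM components):

    type file;

    app (file o) transform (file i)     { transform @i stdout=@o; }
    app (file o) merge (file a, file b) { merge @a @b stdout=@o; }

    file raw[] <filesys_mapper; location="data", suffix=".in">;
    file mid[] <simple_mapper; prefix="mid", suffix=".out">;

    foreach r, i in raw {
      mid[i] = transform(r);   // independent leaf tasks
    }

    // merge() consumes two transform() outputs (just a toy pattern
    // here); the partitioner can place each such subgraph near the
    // data its tasks consume.
    file top <"merged.out">;
    top = merge(mid[0], mid[1]);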
ExM studies DOE, INCITE, and other national-priority exascale candidate applications

[Figure: four application studies.] Swift scripts are used to specify and execute many-task applications: (a) QA, analysis, and visualization of climate model outputs; (b) impact of biofuel production on hydrology in major US watersheds; (c) subsurface flow of chemicals in groundwater; and (d) uncertainty quantification studies of consumer and industrial electricity usage and related energy and economic factors.
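As a concrete illustration of pattern (a), a hedged Swift sketch of per-run QA and plotting over an ensemble; qa_check and plot_field stand in for real analysis executables, and the paths and the variable name "tas" are invented for the example:

    type file;

    app (file o) qa_check (file run) {
      qa_check @run stdout=@o;
    }

    app (file png) plot_field (file run, string var) {
      plot_field @run var @png;
    }

    // One QA report and one plot per ensemble member, all concurrent.
    file runs[] <filesys_mapper; location="ensemble", suffix=".nc">;

    foreach run, i in runs {
      file report <single_file_mapper; file=@strcat("qa/", i, ".txt")>;
      report = qa_check(run);
      file img <single_file_mapper; file=@strcat("plots/", i, ".png")>;
      img = plot_field(run, "tas");
    }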
ExM prototypes show encouraging scalability

[Figure: (a) scaling of AME many-task dispatch and RAM-based data store; (b) Turbine data store access rate; (c) Jets fault-tolerance test; (d) ADLB task dispatch scaling; (e) Jets many-parallel-task application scaling (NAMD).]

For more information, contact Michael Wilde, [email protected].
ExM Project: http://www.mcs.anl.gov/exm
Swift parallel scripting language: http://www.ci.uchicago.edu/swift
ADLB: http://www.cs.mtsu.edu/~rbutler/adlb
MosaStore: http://netsyslab.ece.ubc.ca/wiki/index.php/MosaStore