Inaugural MPI-Beyond Workshop

March 7-8, 2019

Edney Innovation Center, 5th Floor - Enterprise Center Space, Chattanooga, TN

Agenda and Bios (PDF)

Thursday, March 7

2:00-2:15pm Convene; Brief Introduction - Tony Skjellum, University of Tennessee at Chattanooga
2:15-3:00pm Keynote #1: Meeting the Future Needs of HPC with MPI (View Presentation) - Ron Brightwell, Sandia National Laboratories
  MPI will likely continue to be the dominant infrastructure used for parallel programming and network communication in high performance computing for the next several years. However, as the commercial data center and cloud computing communities begin to re-discover the fundamentals of HPC networking, and as the traditional HPC workload expands to include more than just modeling and simulation applications executing on a space-shared batch-scheduled machine, MPI will need to move beyond its current capabilities. This talk will offer a perspective on aspects of future, more heterogeneous systems relevant to MPI and describe capabilities that MPI must provide to meet the future needs of HPC systems and workloads.
3:00-3:30pm BREAK (REFRESHMENTS)  
3:30-4:00pm What is an MPI process? (View Presentation) - Dan Holmes, EPCC, University of Edinburgh

We know that an MPI process is not an OS process, nor is it a thread or a collection of threads.

Can we instead define an MPI process by its behaviour?

“An MPI process is anything that can pass messages to/from other MPI processes in the manner specified by the MPI Standard.” In between, it can presumably do other things, such as calculation, which are not part of the MPI Standard.

That definition gives us a lot more freedom than the outdated links to OS processes and the unnecessary restrictions imposed by rules intended for POSIX processes.

What would that new freedom cost us and what can we do with it? My answer focuses on future heterogeneous systems for extreme scale HPC; it draws on Endpoints, Finepoints, tasks, and agents; it breaks (and then fixes) communicators, windows, and files; and it links up with Sessions and resilience.

4:00-4:30pm A disruptive innovation scenario on MPI (View Presentation) - Atsushi Hori, Riken Center for Computational Science
  The scenario of disruptive innovation on MPI came to my mind while I was thinking about the panel talk at EuroMPI/USA 2017. This talk will be the follow-up of my panel talk. I will talk about the future of MPI based on the disruptive innovation idea. 
4:30-5:00pm Too long; didn't read (View Presentation) - Ken Raffenetti, Argonne National Laboratory

MPI is the dominant programming model for high-performance computing. Its success was born from simple, elegant interfaces for common patterns. However, at 800 pages and counting, modern MPI may be suffering from a lack of focus. In this talk, using my perspective from user support and software engineering, I will outline an approach for a streamlined MPI.

5:00-5:30pm Filling the gaps between data analytics and high performance computing with some help from MPI (View Presentation) - Michela Taufer, University of Tennessee, Knoxville

The MapReduce programming model has been packaged in multiple formats to become the dominant programming model for analyzing large volumes of data. Most implementations of the model were initially designed for Cloud and commodity clusters. When moved to supercomputers these implementations often exhibit their inability to cope with features that are common in Cloud but missing in HPC (e.g., on-node persistent storage) or to leverage features that are common in HPC but missing in Cloud (e.g., fast interconnects).

In this talk we will discuss how the programming model should be redesigned to assure the convergence between data analytics and high performance computing.

Specifically, we will present Mimir, a MapReduce implementation over MPI that bridges the gap between data analytics and supercomputing by enabling the in-memory analysis of significantly larger datasets on HPC platforms without sacrificing the ease-of-use MapReduce provides. We will show how fundamental operations such as word count, join, and clustering can achieve large performance gains while mitigating power usage on supercomputers when executed on top of Mimir.


Conclude for Day #1


Dinner (place to be determined) - group dinner (each attendee pays for their own meal)

Friday, March 8

8:00-8:30am Arrive, Breakfast, reconvene...

Malleability is the key for MPI in a Post-Exascale World (View Presentation) - Martin Schulz, Technische Universität München (TUM)


Current HPC environments and applications are rather rigid and inflexible, and MPI’s inability to efficiently support malleability, i.e., the ability to grow and shrink the computational resources associated with a job at runtime, is a significant part of the problem. While this is unlikely to change for exascale systems, a Post-Exascale world will require a more flexible approach, e.g., to support a greater level of fault tolerance, to adjust to changing levels of available resources, or to match more complex workflows.

Therefore, if we want MPI to remain the dominant programming standard in HPC for decades to come, we will have to change MPI to match this trend and support malleable applications and complex adaptive workflows. MPI Sessions is a step in the right direction, but can only be a first building block. In this talk, I will discuss some of the challenges we are facing in MPI towards more adaptivity as well as initial ideas on how to bridge the existing gaps.


Source to source between domain scientists and MPI (View Presentation) - Hadia Ahmed, Lawrence Berkeley National Lab


High performance computing (HPC) is no longer a sacred technology used by computer experts only. Nowadays, it’s being used in major fields such as medicine, surveillance, and financial services. These domain experts are developing their applications to use HPC. Currently, MPI is the de facto standard for writing HPC applications. It requires deep understanding from developers of how to chunk and distribute their data and how to manage communication across processors. Hence, these users have to become familiar with systems’ internal details. This has given rise to the development of domain-specific languages as a mediator to solve the problem. Does this mean that MPI will be replaced with DSLs? How about adaptations to new hardware such as remote direct memory access (RDMA) and different accelerators (e.g. GPUs and FPGAs) as they become available? Are we going to have to develop a new DSL for each emerging hardware platform? Are source-to-source tools the answer to these issues? This talk will explore the current and future efforts for providing a bridge between the domain experts and generating optimized HPC applications.


Could MPI outgrow its HPC calling? (View Presentation) - George Bosilca, University of Tennessee, Knoxville


MPI was certainly one of the driving forces behind the rise of computational science, allowing the domain parallel applications to out-scale the node-level parallelism and tackle ever growing problems, more representative of realistic scenarios. But in the current era of fast and drastic changes at all levels in the hardware and software stack, MPI is, in many cases, seen as an outlier, a programming paradigm supporting and supported by an economically untenable niche, and incapable of adapting to today's challenges.

In this talk I will cover a few topics where MPI could do a better job at providing larger developer communities with a simpler, more efficient and streamlined experience, supporting novel programming paradigms and providing a more realistic solution for future, non-HPC platforms.



10:30-11:00am  MPI for (void*) (View Presentation) - Jeff Squyres, Cisco

The HPC community is well-familiar with MPI's features: discrete, typed messages, collective operations, tagged multiplexing, etc. These features -- and many others -- make MPI an excellent foundation upon which to build world-class HPC applications.

But these same features are also widely applicable to many other applications outside of HPC.  Especially in comparison to the POSIX sockets API, MPI provides so much infrastructure and so many features -- why aren't non-HPC applications using MPI?

In this talk, I will discuss the challenges of using MPI outside of the HPC ecosystem, as well as what it would take to overcome them (spoiler: it's not that hard).


I Heard a Rumor (about Exascale MPI) (View Presentation) - Tony Skjellum, University of Tennessee at Chattanooga


This talk retraces paths not taken or fulfilled in prior MPI and MPI-like work such as MPI/RT, with options for the future (most of which can break the mold). It seeks to influence the audience overtly and subliminally about research and standardization topics where we can/should/might/ought to go in the future, with suggestions about why features like QoS, predictability, multi-level protocols, event-driven programming models, heterogeneous devices, and aspectable middleware should all figure prominently in Exascale.


On the path to exascale: strengths and limitations of MPI (View Presentation) - Geoffroy Vallee, Oak Ridge National Laboratory


For the past decades, leadership computing has relied heavily on the Message Passing Interface (MPI) for the development of science simulations. This trend was pushed with success to the point where new systems were designed specifically for the optimal execution of MPI-based applications.

However, with the growing interest in data sciences, artificial intelligence, machine learning and deep learning, new models now need to be supported by large-scale high performance computing (HPC) systems, moving away from a pure MPI-based approach. This switch is deeply impacting the way systems are designed (e.g., Summit has 6 accelerators per compute node), used (e.g., some task-based workloads require execution models that are more complex than the traditional job-based model; new programming languages are used) and managed (e.g., growing need for customized execution environments using containers). As a result, many in the HPC community are wondering what will be the role of MPI for the future of leadership computing. Will it be replaced by something else? Will it evolve to better match new needs?

In this presentation, I will first present the current strengths and limitations of MPI, as well as identified gaps when considering the Oak Ridge National Laboratory’s Summit system, based on the current trends within the U.S. Department of Energy, and more specifically the Exascale Computing Project (ECP). I will then present ideas and suggestions based on current trends driven by the upcoming exascale systems, as well as thoughts about the long-term viability of MPI (i.e., beyond exascale).




Concluding Keynote: FPGAs Everywhere: Towards Production HPC with FPGA-Enhanced Clouds and Clusters (View Presentation) - Martin Herbordt, Boston University


In his SIAM-PP keynote, Paul Coteus described the future of HPC as being, in part, Compute Everywhere.  Our take is as follows:  As Moore’s Law ends, performance gains will come from having compute capability wherever data moves.  Also, designs will be not only application driven, but also adaptable to the application. FPGAs are the ideal devices to explore this future (and perhaps also to implement it).  Since the hardware adapts to the application, rather than the reverse, we can obtain very high efficiency at low power.  And since FPGAs are hybrid communication-computation processors that can be interconnected directly chip-to-chip, large-scale communication can proceed with both high bandwidth and low latency. 

In this talk I give an overview of our work with High Performance Computing with FPGA-enhanced clouds and clusters, building around five places where FPGAs can have a transformative impact.

  1. FPGAs as back-end accelerators, but with a twist: they are interconnected directly with a secondary network
  2. FPGAs as tightly coupled coprocessors: offload MPI processing into hardware
  3. FPGAs in the NIC: offload streaming applications and MPI collectives
  4. FPGAs as Bumps-in-the-Wire: large-scale HPC in commercial clouds
  5. FPGAs in the router: True compute-in-the-network

PANEL DISCUSSION - Dan Holmes, moderator





Time to visit top attractions - Ruby Falls/Rock City or Tennessee Aquarium


Dinner (place to be determined) - group dinner (but on your own) for those staying overnight