Code Generation Framework for Stencil Codes


Evolving Efficient and Generalizable Multigrid Methods with Genetic Programming


Higher-Order Discretizations for the Shallow Water Equations

The ExaStencils code generation framework processes input in its own multi-layered domain-specific language (DSL) to emit highly optimized and massively parallel geometric multigrid solvers for (block-)structured grids.


Domain-specific representation and modeling

The ExaStencils code generation framework comes with an in-house source-to-source compiler written in Scala and an multi-layered external DSL called ExaSlang which is geared towards the scope of Multigrid. Each layer is tailored for a certain user community and has a different degree of abstraction. This way, domain experts can formulate problems in a manner they are most familiar with, resulting to a separation of concerns and improved productivity.

With a rising layer number the DSL becomes more concrete, i.e. Layer 1 is the most abstract. In total there exist four layers:

  • ExaSlang 1: Continuous formulation of the problem. LaTeX-like syntax but also allows specifications with Unicode symbols. Proposed for natural scientists and engineers with only little programming expertise.
  • ExaSlang 2: Discretization of the problem using Finite Differences (FD), Finite Volumes (FV) or Finite Elements (FE). Mostly used by domain experts for a certain application field, e.g. CFD.
  • ExaSlang 3: Definition of the solver in a Matlab-ish syntax. Here, Multigrid has its first appearance. Since this requires deep insight to Multigrid and its algorithmic components, this layer is best suited for mathematicians.
  • ExaSlang 4: Holds the whole program specification and follows the paradigm of procedural programming. Makes parallelization-related concepts such as communication available for further fine-tuning. Thus, this layer is mostly used by computer scientists.

The implementation of the problem can be done by providing different algorithmic components on different layers or by working on one layer exclusively and generating code for the more concrete layers from it. It is also possible to write the whole program entirely in Layer 4.

Platform Versatility

ExaSlang input can be mapped towards different hardware platforms:

  • CPUs
  • GPUs
  • FPGAs
  • ARM
  • and hybrid architectures

Parallelization Backends

ExaStencils provides an automatic parallelization and optimized communication schemes using:

  • MPI
  • OpenMP (on CPUs)
  • CUDA
  • and blends of them

Automatic Optimizations

ExaStencils provides code transformations for further performance improvements. Amongst others, this includes:

  • Vectorization
  • Loop unrolling
  • Common subexpression elimination
  • Address precalculation
  • General arithmetic simplifications
  • Polyhedral optimizations

Supercomputing Power

ExaStencils is capable of generating highly optimized and scalable code that utilizes the hardware resources of nowadays HPC clusters efficiently. Promising scaling experiments were performed on high-tier systems in the Top500.
Amongst others, this includes the following well-known clusters:

Application Scope

ExaStencils goes beyond simple applications and can be used for simulations in the scope of:

  • Computational Fluid Dynamics, especially for the Stokes, Navier-Stokes and Shallow Water equations
  • Image Processing
  • Molecular Dynamics
  • and many others

These require specialized features such as:

  • Computational grids: can be (non-)uniform, (non-)axis-aligned, staggered
  • Different discretization techniques and boundary handling
  • Various solvers and their algorithmic components, e.g. (non-)linear multigrid, point-based-, block- and Krylov-subspace-smoothers, etc.



A popular field of research is the simulation of ocean currents, tides and coastal ocean circulation. Therefore, the shallow water equations can be used as a model. One efficient approach to discretize and solve them is the discontinuous Galerkin method because of its high parallelizability, its ability to use high-order approximation spaces, its robustness for problems with shocks but also due to its natural support for h- and p-adaptivity. The GHODDESS project extends the ExaStencils toolchain with a Python front-end that allows users to easily specify their discrete problem in a symbolic algebra description, that is, by using the symbolic algebra package SymPy. GHODDESS then maps the symbolic formulation to an ExaSlang specification and thus makes fully use of ExaStencils code generation capabilities.


Since the construction of efficient Multigrid solvers for the (non-)linear systems arising from the discretization of partial differential equations can be a challenging task, EvoStencils presents an approach to tackle this problem using a genetic program optimization technique. Basically, a solver is represented by a tree of mathematical expressions which is generated based on a tailored grammar. Each solver's quality is evaluated depending on two objectives, namely compute performance and convergence rate. The former is estimated using the roofline model, whereas for the latter local fourier analysis is applied. This multi-objective optimization is done via grammar-guided genetic programming and evolution strategies. With this approach, efficient and scalable Multigrid solvers can be automatically constructed. EvoStencils is also implemented in Python and utilizes the DEAP framework for the implementation of evolutionary algorithms.


ExaStencils is funded by the German Research Foundation (DFG) as part of the Priority Programme 1648 (Software for Exascale Computing). The first funding period from January 2013 to December 2015 has been completed. The project is in its second funding period from January 2016 to December 2018, with an unfunded extension of one year.

Project Particulars

LFA Lab: A library for convergence rate prediction using local Fourier analysis.

Multigrid methods have many parameters that influence the execution time of the method. To identify a good parameter configuration, we need to predict the execution time of the method for various parameter configurations.

It is not sufficient to predict the time of the individual operations that a method performs, but it is also necessary to predict their number. Multigrid methods are iterative, i.e., they compute a sequence of iterations that converge towards the solution sought. The number of iterations needed to reach a certain accuracy depends on the convergence rate of the method.

The parameters of a multigrid method influence its convergence rate. The convergence rate is not available directly, but it can be estimated using local Fourier analysis.

As part of project ExaStencils, we developed LFA Lab, a flexible software library that performs a local Fourier analysis. LFA Lab takes as input the formula for the error propagator of the method and a description of the operations involved in this formula. Using this information, LFA Lab is able to predict the convergence rate of a given multigrid method.

By combining the convergence rate estimate with a performance model, it becomes possible to predict the execution time of a multigrid method for a given parameter configuration. This knowledge can then be used to identify an optimal parameter configuration.

Github Project