Tutorial¶
This is a short tutorial on how to use the EvoStencils package. After cloning the repository, we first have to install the required Python packages.
# Install evostencils package with requirements
pip install -e .
To automatically generate a parallel implementation of each multigrid-based solver discovered through evolutionary program synthesis, EvoStencils makes use of the ExaStencils framework. Note that, at the moment, ExaStencils is the only supported code generation backend. To install ExaStencils together with all of its requirements, you can execute the following instructions.
sudo apt-get install -y g++ openjdk-11-jdk
# Install the ExaStencils framework
wget -nc https://github.com/lssfau/ExaStencils/archive/refs/tags/v1.1.zip
wget -nc https://github.com/sbt/sbt/releases/download/v1.8.0/sbt-1.8.0.zip
unzip -n v1.1.zip && mv -vn ExaStencils-1.1 exastencils
unzip -n sbt-1.8.0.zip
cd exastencils
../sbt/bin/sbt compile
../sbt/bin/sbt assembly
cd ..
Optional: Install LFA Lab for Convergence Estimation¶
Additionally, EvoStencils includes experimental support for predicting the convergence of a geometric multigrid method using the library LFA Lab. Note that while this approach is not recommended due to its limited accuracy, it can be useful for testing purposes, since it drastically speeds up the evaluation. To install LFA Lab, execute the following instructions.
# Optional: Install LFA Lab for convergence estimation
sudo apt-get install -y cmake swig libeigen3-dev liblapack-dev
git clone https://github.com/hrittich/lfa-lab.git lfa-lab
cd lfa-lab
cmake -DWITH_TESTS=FALSE .
make -j4
make install
cd ..
Overview and Problem Definition¶
After installing all requirements, we can import EvoStencils’ functionality for the automated design of multigrid methods. Here, the Optimizer class bundles all functionality for the execution of a grammar-based evolutionary program synthesis that enables the automated design of an efficient multigrid method for the given problem. As a second component, the ProgramGenerator class provides a frontend for the automated generation and evaluation of each generated multigrid-based solver using the ExaStencils framework. To speed up the evaluation of each solver, ExaStencils can generate an OpenMP-parallel C++ implementation. For this purpose, the number of threads can be specified in the file example_problems/lib/parallelization_pureOmp.knowledge.
from evostencils.optimization.program import Optimizer
from evostencils.code_generation.exastencils import ProgramGenerator
import os
import sys
The first step of running EvoStencils is to define the (PDE-based) problem that we aim to solve. EvoStencils is capable of extracting all information about a discretized PDE from its specification in ExaStencils’ domain-specific language ExaSlang. Note that a number of example problems are included in the example_problems folder.
Here we use a simple two-dimensional Poisson problem as an example, which is given by

\[ -\Delta u = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega. \]

In the given case, we choose \(\Omega = (0, 1)^2\) on which we discretize the equation using finite differences with \(h = 1/2^l\), where \(l = 8\). As a result, we obtain the usual five-point stencil given by

\[ \frac{1}{h^2} \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix}. \]
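To make the discretization concrete, the following is a small illustrative sketch (using NumPy, not part of EvoStencils’ API) that assembles this stencil for \(l = 8\); EvoStencils extracts the operator automatically from the ExaSlang specification, so this snippet is purely for illustration.

import numpy as np

l = 8
h = 1.0 / 2**l  # step size of the finest grid
# Five-point finite-difference stencil of the negative Laplacian, scaled by 1/h^2
stencil = np.array([[ 0.0, -1.0,  0.0],
                    [-1.0,  4.0, -1.0],
                    [ 0.0, -1.0,  0.0]]) / h**2
print(stencil)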
cwd = f'{os.getcwd()}/../'
# Path to the ExaStencils compiler
compiler_path = f'{cwd}/exastencils/Compiler/Compiler.jar'
# Path to base folder
base_path = f'{cwd}/example_problems'
# Relative path to platform file (from base folder)
platform_path = f'lib/linux.platform'
# Example problem (2D finite-difference Poisson, starting from level 2)
# Relative path to settings file (from base folder)
settings_path = f'Poisson/2D_FD_Poisson_fromL2.settings'
knowledge_path = f'Poisson/2D_FD_Poisson_fromL2.knowledge'
cycle_name = "gen_mgCycle" # Default name
# Additional global parameter values within the PDE system
pde_parameter_values = None
# The maximum number of iterations considered acceptable for a solver
solver_iteration_limit = 500
# Set up MPI
comm = None
nprocs = 1
mpi_rank = 0
In case a PDE includes additional parameters that depend on the discretization, these can be dynamically adjusted during the optimization. For instance, the following code can be used to set the correct values for the wavenumber \(k\) within the provided example of the Helmholtz equation.
# Special treatment of parameters for the Helmholtz example
if "Helmholtz" in knowledge_path or "Helmholtz" in settings_path:
values = [80.0 * 2.0**i for i in range(100)]
pde_parameter_values = {'k': values}
solver_iteration_limit = 10000
If you only have access to limited compute resources, the model-based prediction can be employed for testing purposes. However, note that this approach leads to inaccurate evaluations and only works when the multigrid method is used directly as an iterative solver.
# Only recommended for testing purposes
# Use model-based estimation instead of code generation for the evaluation
model_based_estimation = False
Optional: Enable MPI¶
Finding competitive multigrid methods requires the evaluation of a large number of solvers, which is usually not feasible on a single compute node. As a remedy, EvoStencils offers support for an MPI-based parallelization using the mpi4py library, which provides a convenient interface to MPI within Python. To enable an MPI-based parallelization within EvoStencils, you can execute the following cell. The creation of the right number of MPI processes and their mapping to a particular computer (or cluster) architecture needs to be managed by the actual MPI implementation, such as OpenMPI (for instance, by launching the script with mpiexec -n <number of processes>). As general advice, execute and bind each process to a dedicated multi-core node, such that the evaluation of each solver can be sped up with an OpenMP-based parallelization, which ExaStencils can perform automatically.
from mpi4py import MPI
comm = MPI.COMM_WORLD
nprocs = comm.Get_size()
mpi_rank = comm.Get_rank()
Code Generation Setup¶
Next, we create a ProgramGenerator object, which automatically extracts all information about the given problem from its respective ExaSlang specification. This object then provides an interface for the evaluation of each solver that is independent of the backend used for code generation (in our case ExaStencils).
program_generator = ProgramGenerator(compiler_path, base_path, settings_path, knowledge_path, platform_path, mpi_rank=mpi_rank,
cycle_name=cycle_name, model_based_estimation=model_based_estimation,
solver_iteration_limit=solver_iteration_limit)
# Obtain extracted information from program generator
dimension = program_generator.dimension # Dimensionality of the problem
finest_grid = program_generator.finest_grid # Representation of the finest grid
coarsening_factors = program_generator.coarsening_factor
min_level = program_generator.min_level # Minimum discretization level
max_level = program_generator.max_level # Maximum discretization level
equations = program_generator.equations # System of PDEs in SymPy
operators = program_generator.operators # Discretized differential operators
fields = program_generator.fields # Variables that occur within system of PDEs
problem_name = program_generator.problem_name
convergence_evaluator = None
performance_evaluator = None
Optional: Enable Model-Based Estimation¶
If you want to use a model-based prediction for evaluating the quality of each multigrid method, the ConvergenceEvaluator and PerformanceEvaluator classes can be used. The former provides an interface to LFA Lab’s functionality that automates the process of predicting a method’s speed of convergence. The latter implements a simple roofline model for estimating its execution time on a given CPU architecture. For this purpose, the user needs to specify the peak floating-point performance and memory bandwidth of the target machine.
if model_based_estimation:
    # Create convergence and performance evaluator objects
    # Convergence evaluation requires a working LFA Lab installation!
    # Only needed when a model-based estimation should be used within the optimization
    # (Not recommended due to its limitations, but useful for testing)
    from evostencils.model_based_prediction.convergence import ConvergenceEvaluator
    from evostencils.model_based_prediction.performance import PerformanceEvaluator
    convergence_evaluator = ConvergenceEvaluator(dimension, coarsening_factors, finest_grid)
    # Peak floating-point performance of the machine (e.g. 16 FLOPs/cycle * 6 cores * 2.6 GHz)
    peak_flops = 16 * 6 * 2.6 * 1e9
    # Peak memory bandwidth of the machine (in bytes per second)
    peak_bandwidth = 45.8 * 1e9
    # Number of bytes per word
    bytes_per_word = 8  # Double = 64 bit = 8 bytes
    performance_evaluator = PerformanceEvaluator(peak_flops, peak_bandwidth, bytes_per_word)
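The internals of the PerformanceEvaluator are not shown here, but the underlying roofline idea can be sketched in a few lines. The function below and its operation counts are hypothetical and only illustrate the model: the runtime of an operation is bounded from below by both the compute roof and the memory roof of the machine.

# Illustrative sketch of a roofline estimate (not EvoStencils' API)
def roofline_runtime(flops, words_transferred, peak_flops, peak_bandwidth, bytes_per_word=8):
    compute_time = flops / peak_flops
    memory_time = words_transferred * bytes_per_word / peak_bandwidth
    # The slower of the two roofs limits the achievable runtime
    return max(compute_time, memory_time)

# Hypothetical example: one five-point stencil application on a 257 x 257 grid,
# assuming 9 floating-point operations and 6 word transfers per grid point (no cache reuse)
n = 257 * 257
print(roofline_runtime(9 * n, 6 * n, peak_flops=16 * 6 * 2.6 * 1e9, peak_bandwidth=45.8 * 1e9))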
Optimization Setup¶
After defining the problem that we aim to solve and setting up the code generation and evaluation, we can finally introduce EvoStencils’ program synthesis toolchain. Here, the Optimizer class provides a unified user interface for the execution of each step towards the automated design of an efficient multigrid method for the target problem.
To deal with unexpected system crashes and cases where an optimization exceeds a certain time limit, EvoStencils implements a simple checkpointing mechanism that makes use of Python’s pickle module to store the current state of the optimization in a file. For this purpose, we need to create a directory for storing the respective checkpoint files of each MPI process.
if mpi_rank == 0 and not os.path.exists(f'{cwd}/{problem_name}'):
    # Create directory for checkpoints and output data
    os.makedirs(f'{cwd}/{problem_name}')
# Path to directory for storing checkpoints
checkpoint_directory_path = f'{cwd}/{problem_name}/checkpoints_{mpi_rank}'
Next, we create an Optimizer object based on the variables defined above, which contain the information extracted from the respective ExaSlang representation. Since the Optimizer class makes use of the interface provided by the ProgramGenerator (and potentially the ConvergenceEvaluator and PerformanceEvaluator) classes to evaluate each generated solver, the respective objects must be provided as arguments.
optimizer = Optimizer(dimension, finest_grid, coarsening_factors, min_level, max_level, equations, operators, fields,
mpi_comm=comm, mpi_rank=mpi_rank, number_of_mpi_processes=nprocs, program_generator=program_generator,
convergence_evaluator=convergence_evaluator, performance_evaluator=performance_evaluator,
checkpoint_directory_path=checkpoint_directory_path)
Note that each multigrid method is formulated on a specific hierarchy of discretizations. In the given case, each grid in this hierarchy is defined by its step size \(h\), which is given by \(h = 1/2^l\) with \(l \in [l_{min}, l_{max}]\). EvoStencils automatically generates a grammar that makes it possible to treat the automated design of multigrid methods as a program synthesis task. Since the number of productions of this grammar increases with the number of coarsening steps, i.e. \(l_{max} - l_{min}\), the resulting search space grows accordingly, and the user is advised to limit the number of levels considered to at most 4 or 5. As a remedy, EvoStencils allows splitting the optimization into multiple runs. In this case, it first aims to find an efficient multigrid method on the coarsest levels, which is then employed as a coarse-grid solver for an existing multigrid implementation provided within the respective DSL code. This process is then continued on the next-higher levels until the finest grid has been reached. However, note that the outcome of this approach might be inferior to a result obtained with an optimization on the full range of levels.
# Option to split the optimization into multiple runs,
# where each run is only performed on a subrange of the discretization hierarchy starting at the top (finest grid)
levels_per_run = max_level - min_level
if model_based_estimation:
    # Model-based estimation is only feasible for up to 2 levels per run
    levels_per_run = 2
assert levels_per_run <= 5, "Can not optimize more than 5 levels"
After creating an initial Optimizer object, we can proceed with the execution of the actual program synthesis. To solve the resulting optimization problem, EvoStencils utilizes evolutionary computation in the form of grammar-guided genetic programming (G3P). This method evolves a population of individuals represented as derivation trees, where each tree corresponds to a unique multigrid method, based on the principle of natural evolution. To control the behavior of this algorithm, we define a number of variables, which are later passed to the respective method of the Optimizer class.
For this purpose, we first need to define a measure of the fitness of each individual, based on which we can select individuals for reproduction. In general, EvoStencils provides functionality for both single- and multi-objective optimization. In the latter case, each individual is evaluated according to two objectives: its execution time per iteration \(t\) and its convergence factor \(\rho\). More information can be found at https://rdcu.be/c3xew. To perform a multi-objective evolutionary optimization, NSGA-II represents a reasonable default choice.
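As a rough intuition for why both objectives matter (a standard multigrid estimate, not specific to EvoStencils): reducing the initial residual by a factor \(\varepsilon\) requires about \(\log \varepsilon / \log \rho\) iterations, so the total solving time scales as

\[ T \approx t \cdot \frac{\log \varepsilon}{\log \rho}, \]

which is why neither a fast but slowly-converging method nor a rapidly-converging but expensive one is optimal on its own.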
Next, we need to set the number of generations, the total population size \(\mu\) and the number of newly-created individuals (offspring) per process \(\lambda\). For instance, running EvoStencils with four MPI processes, \(\mu = 16\) and \(\lambda = 4\) means that in each generation a total of 16 new individuals is created based on an existing population of 16 individuals. Note that the values provided in this tutorial are suitable for testing, but are unlikely to lead to a good result. In addition, a larger (or smaller) initial population can be chosen by setting the variable population_initialization_factor accordingly. As an alternative to running an actual evolutionary algorithm, EvoStencils also provides a simple random search, which randomly generates \(\lambda\) individuals per process in each of the specified number of generations.
To control the probability of creating new individuals via crossover and mutation, the values of the parameters crossover_probability and mutation_probability can be adjusted. In addition, we can set the number of samples used for the evaluation of each solver (in the case of a code generation-based evaluation).
# Choose optimization method
optimization_method = optimizer.NSGAII
generations = 10 # Number of generations
mu_ = 4 # Population size
lambda_ = 4 # Number of offspring
# Option to use random search instead of crossover and mutation to create new individuals
use_random_search = False
population_initialization_factor = 4 # Multiply mu_ by this factor to set the initial population size
crossover_probability = 0.9
mutation_probability = 1.0 - crossover_probability
node_replacement_probability = 0.1 # Probability to perform mutation by altering a single node in the tree
evaluation_samples = 3 # Number of evaluation samples
Before we can run the evolutionary program synthesis based on the parameters specified, we need to discuss two additional options. First of all, EvoStencils allows the problem size to be increased dynamically during the optimization. For this purpose, the parameter generalization_interval can be set, which specifies the number of generations after which both \(l_{min}\) and \(l_{max}\) are incremented. Note that this triggers a reevaluation of all individuals in the current population. If the user does not want to apply this feature, the generalization interval can simply be set to a value greater than the total number of generations.
As a second feature, EvoStencils allows the evolutionary program synthesis to be continued from a saved checkpoint. Note that this requires the total number of generations to be higher than the generation in which the checkpoint was created. Furthermore, the symbols that occur within the grammar-based representation of each solver contained in the checkpointed population must match those of the new program synthesis run. Since at this point no additional checks are performed, the user is advised to use this feature with caution and to be prepared to deal with unexpected errors.
# Number of generations after which a generalization is performed
# This is achieved by incrementing min_level and max_level within the optimization,
# such that a larger (and potentially more difficult) instance of the same problem is considered in subsequent generations
generalization_interval = 50
# Option to continue from the checkpoint of a previous optimization
# Warning: So far no check is performed whether the checkpoint is compatible with the current optimization setting
continue_from_checkpoint = False
Running the Evolutionary Program Synthesis¶
After setting all necessary parameters, we can execute the evolutionary program synthesis. For this purpose, the Optimizer class provides the method evolutionary_optimization, which can be executed by providing the parameters defined above as its arguments.
After the execution of this method is finished, its outcome is returned in the form of five Python objects. Here, a grammatical representation of the multigrid method that leads to the fastest solving time for the largest problem considered during the optimization is stored in the first return value. Accordingly, the second value contains the corresponding DSL code of this method.
The remaining three return values store information gathered during the optimization in the form of the final population, a statistics data structure and a hall-of-fame of the best multigrid methods according to the optimization objective(s). Note that in case the evolutionary program synthesis is performed on the full range of levels, each of these objects is a list containing only a single entry. Otherwise, the list contains an object for each respective sub-run.
# Return values of the optimization
# program: Grammar string representing the multigrid method on the topmost levels
# dsl_code: ExaSlang program string representing the multigrid solver functions
# pops: Populations at the end of each optimization run on the respective subrange of the discretization hierarchy
# stats: Statistics structure (data structure provided by the DEAP framework)
# hofs: Hall-of-fames at the end of each optimization run on the respective subrange of the discretization hierarchy
program, dsl_code, pops, stats, hofs = optimizer.evolutionary_optimization(optimization_method=optimization_method,
use_random_search=use_random_search,
mu_=mu_, lambda_=lambda_,
population_initialization_factor=population_initialization_factor,
generations=generations,
generalization_interval=generalization_interval,
crossover_probability=crossover_probability,
mutation_probability=mutation_probability,
node_replacement_probability=node_replacement_probability,
levels_per_run=levels_per_run,
evaluation_samples=evaluation_samples,
model_based_estimation=model_based_estimation,
pde_parameter_values=pde_parameter_values,
continue_from_checkpoint=continue_from_checkpoint)
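For a quick first look at the outcome, the hall of fame of the final sub-run can be inspected directly. The following sketch assumes DEAP’s HallOfFame interface (EvoStencils builds on the DEAP framework, as noted in the comments above), where each individual carries a tuple of objective values:

# Minimal sketch (assumes DEAP's HallOfFame interface): print the objective
# values of the best individuals found in the final optimization run
for individual in hofs[-1]:
    # For a multi-objective run, this tuple contains the execution time per iteration and the convergence factor
    print(individual.fitness.values)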
Storing and Examining the Results¶
After the evolutionary program synthesis has finished, we can examine the results, for instance by printing the DSL code of the best method discovered during the optimization. Furthermore, we can store both the final population and the hall-of-fame in a file using the dump_data_structure method of the Optimizer class.
# Print the outcome of the optimization and store the data and statistics
if mpi_rank == 0:
    print(f'\nExaSlang Code:\n{dsl_code}\n', flush=True)
    if not os.path.exists(f'./{problem_name}'):
        os.makedirs(f'./{problem_name}')
    j = 0
    log_dir_name = f'./{problem_name}/data_{j}'
    while os.path.exists(log_dir_name):
        j += 1
        log_dir_name = f'./{problem_name}/data_{j}'
    os.makedirs(log_dir_name)
    for i, log in enumerate(stats):
        optimizer.dump_data_structure(log, f"{log_dir_name}/log_{i}.p")
    for i, pop in enumerate(pops):
        optimizer.dump_data_structure(pop, f"{log_dir_name}/pop_{i}.p")
    for i, hof in enumerate(hofs):
        hof_dir = f'{log_dir_name}/hof_{i}'
        os.makedirs(hof_dir)
        for j, ind in enumerate(hof):
            with open(f'{hof_dir}/individual_{j}.txt', 'w') as grammar_file:
                grammar_file.write(str(ind) + '\n')
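Since EvoStencils’ checkpointing is based on Python’s pickle module, we assume here that the .p files written by dump_data_structure are plain pickle dumps as well; under that assumption, a stored population can be reloaded for later analysis as follows (reusing the log_dir_name defined above):

import pickle

# Reload a previously stored population for post-processing
# (assumes the .p files are plain pickle dumps)
with open(f'{log_dir_name}/pop_0.p', 'rb') as file:
    final_population = pickle.load(file)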
Conclusion¶
We hope that this tutorial provides a reasonable introduction to EvoStencils’ capabilities. Note that the functionality offered by the optimize.py script (located in the scripts subfolder) is similar to that of this notebook.