DLB Library (Dynamic Load Balancing)

Library devoted to speeding up hybrid parallel applications

Institution:

Research Group:

BSC Group: Computer Sciences

Researcher/s:

Marta Garcia Gasulla, Víctor López Herrero

Website:

https://pm.bsc.es/dlb

Description:

The DLB library improves the load balance of the outer level of parallelism (e.g. MPI) by redistributing computational resources at the inner level of parallelism (e.g. OpenMP). This readjustment of resources is done dynamically at runtime.

This dynamism allows DLB to react to different sources of imbalance: algorithm, data, hardware architecture, variability and resource availability, among others.

How does DLB work?
DLB uses the malleability of the inner level of parallelism to change the number of threads of the different processes running on the same node. Several load balancing algorithms are implemented within DLB. They all rely on this main idea, but they target different types of applications or situations.
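
The effect of moving threads between processes on one node can be illustrated with a small toy model. This is only a sketch of the general idea, not DLB's API or any of its actual algorithms: the function names are hypothetical, work is assumed to be perfectly divisible, lending is assumed to be instantaneous, and freed CPUs are shared evenly (possibly as fractions) among the processes that are still running.

```python
from fractions import Fraction


def makespan_static(work, threads):
    """Finish time when every process keeps its initial thread count."""
    return max(Fraction(w, t) for w, t in zip(work, threads))


def makespan_lending(work, threads):
    """Finish time when a finished process lends its CPUs to the others.

    Hypothetical toy model, not a DLB algorithm: work units per process
    and initial thread counts are integers, and one thread completes one
    work unit per time unit.
    """
    remaining = [Fraction(w) for w in work]
    cpus = [Fraction(t) for t in threads]
    elapsed = Fraction(0)
    active = [i for i, r in enumerate(remaining) if r > 0]
    while active:
        # Advance time until the next active process runs out of work.
        step = min(remaining[i] / cpus[i] for i in active)
        for i in active:
            remaining[i] -= step * cpus[i]
        elapsed += step
        finished = [i for i in active if remaining[i] == 0]
        active = [i for i in active if remaining[i] > 0]
        # CPUs of finished processes are handed to those still running.
        freed = sum(cpus[i] for i in finished)
        for i in finished:
            cpus[i] = Fraction(0)
        for i in active:
            cpus[i] += freed / len(active)
    return elapsed


if __name__ == "__main__":
    work = [12, 4]      # imbalanced work on two processes of one node
    threads = [2, 2]    # each process starts with two threads
    print("static: ", makespan_static(work, threads))
    print("lending:", makespan_lending(work, threads))
```

In this model the process with less work finishes early and its two CPUs are handed to the other process, so the node finishes sooner than with a fixed thread count per process.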

Who can use DLB?
Any application written in C, C++ or Fortran that uses one of the supported parallel programming models. The currently supported parallel programming models are the following:

MPI+OpenMP
MPI+OmpSs
OmpSs (Multiple Applications)
We are open to adding support for more programming models at both the inner and outer levels of parallelism.

Technical Requirements:
Shared Memory between Processes: DLB needs a shared memory node and more than one process running on the same node.
Preload Mechanism: The system must provide a preload mechanism to intercept MPI calls. (Not necessary when using the Nanos++ runtime if MPI calls do not need to be intercepted.)
Parallel Regions in OpenMP: When using the OpenMP model, DLB needs different parallel regions to open and close in order to change the number of threads (the OpenMP standard only allows the number of threads to be changed outside a parallel region).
Non-busy waiting mode for MPI calls: To use a CPU that is waiting in communication, DLB needs the MPI calls not to busy-wait. MPI implementations usually offer a way of obtaining this behaviour, but it is not enabled by default. DLB also offers a mode in which the CPU where the MPI call is being executed is not used, but the performance obtained is penalized.
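
The parallel-region requirement above can be pictured with a small toy model. It is not OpenMP or DLB code, and the class and method names are hypothetical. It only illustrates, under that assumption, why an application made of a single long parallel region never picks up a new thread count, while one that opens and closes regions repeatedly does.

```python
class MalleableProcess:
    """Toy process whose thread count changes only between parallel regions."""

    def __init__(self, threads):
        self.threads = threads       # thread count in effect
        self._requested = threads    # latest request, possibly not applied yet
        self.in_region = False

    def request_threads(self, n):
        """Record a new thread count; apply it now only if no region is open."""
        self._requested = n
        if not self.in_region:
            self.threads = n

    def open_region(self):
        """Start a parallel region with the latest requested thread count."""
        self.threads = self._requested
        self.in_region = True
        return self.threads

    def close_region(self):
        """End the region; a pending request takes effect from here on."""
        self.in_region = False
        self.threads = self._requested


if __name__ == "__main__":
    proc = MalleableProcess(threads=4)
    proc.open_region()
    proc.request_threads(6)          # arrives while the region is running
    print("inside region:", proc.threads)
    proc.close_region()
    print("next region:  ", proc.open_region())
```

The request made inside the open region is only recorded. The thread count in effect changes once that region closes and the next one opens.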

Problem:

N/A

Solution:

N/A

Application areas:

N/A

Novelty:

N/A

Protection:

LGPL License (Version 3.0)

Target market:

N/A

Keywords:

Programming Models

TRL: N/A

CRL: N/A

BRL: N/A

IPRL: N/A

TmRL: N/A

FRL: N/A

More information

If you want to know more about this project, do not hesitate to contact us.