The increasing availability of machines relying on non-GPU architectures...
Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogen...
Parallel programming remains a daunting challenge, from the struggle to...
Nonlocal quasistatic fracture evolution for interacting cracks is develo...
Meeting both scalability and performance portability requirements is a c...
Octo-Tiger, a large-scale 3D AMR code for the merger of stars, uses a co...
Asynchronous Many-Task (AMT) runtime systems take advantage of multi-cor...
On the way to Exascale, programmers face the increasing challenge of hav...
Local-nonlocal coupling approaches provide a means to combine the comput...
Partition of unity methods (PUM) are of domain decomposition type and pr...
Octo-Tiger is a code for modeling three-dimensional self-gravitating ast...
In this work, we consider the challenges of developing a distributed sol...
Analyzing performance within asynchronous many-task-based runtime system...
Arm technology is becoming increasingly important in HPC. Recently, Fuga...
Although recent scaling up approaches to train deep neural networks have...
In this paper, we propose two approaches to apply boundary conditions fo...
OpenMP has been the de facto standard for single node parallelism for mo...
We study the simulation of stellar mergers, which requires complex simul...
Asynchronous Many-task (AMT) runtime systems have gained increasing acce...
Experience shows that on today's high performance systems the utilizatio...
Despite advancements in the areas of parallel and distributed computing,...
Peridynamics is a non-local generalization of continuum mechanics tailor...
Experimental data availability is a cornerstone for reproducibility in e...