INRIA Recruitment

Kokkos Support for Complex Data Discretization and Unstructured Meshes (M/F) - INRIA

  • Palaiseau - 91
  • Permanent contract (CDI)
  • INRIA
Published on 22 June 2026

Job description


About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in tackling the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to respond to the challenges of the digital transformation of science, society and the economy.
Kokkos support for complex data discretization and unstructured meshes
The job description below is in English
Required degree: PhD or equivalent

Position: Contract research engineer

Desired level of experience: 3 to 5 years

About the research centre or functional department

The Inria Saclay-Île-de-France Research Centre was established in 2008. It has developed as part of the Saclay site in partnership with Paris-Saclay University and with the Institut Polytechnique de Paris.

The centre hosts project teams, 27 of which operate jointly with Paris-Saclay University and the Institut Polytechnique de Paris. Its activities involve over 600 people, both scientists and research and innovation support staff, representing 44 different nationalities.

Context and advantages of the position

PEPR NumPEx & KOKTAILS

The transition to Exascale computing architectures requires a renewal of programming paradigms
to efficiently leverage accelerators (GPUs, TPUs, and others). This transition presents a
significant challenge for existing application codes, as a complete rewrite is often a massive undertaking.
The KOKTAILS project aims to address these challenges by proposing an advanced programming environment
that facilitates the porting of codes to heterogeneous architectures while ensuring performance portability.
More specifically, the project aims to develop a sovereign software stack tailored for GPU-based Exascale supercomputing.
It addresses the critical challenges of software portability and performance
optimization across diverse hardware architectures, ensuring seamless adaptation of scientific
applications to future computing infrastructures. By integrating and enhancing existing open-source
frameworks, KOKTAILS will provide a robust middleware layer enabling French and European
applications to fully exploit Exascale resources while reducing dependence on foreign software ecosystems.
This project aligns fully with the PEPR NumPEx strategy, closely interacting with Exa-SofT's work
on software tool evolution and Exa-DI to ensure integration into application demonstrators.
By ensuring the sustainability of software developments and facilitating their adoption by a wide range of
applications, KOKTAILS will directly contribute to France's digital sovereignty and to scientific and
technological excellence in HPC.

Kokkos

Internationally, the United States has significantly invested in Exascale software development
through initiatives like the Exascale Computing Project (ECP), which has focused on co-design
efforts between hardware, software, and applications. Kokkos, an open-source C++ parallel
programming model, has emerged as a leading solution for portable performance across
heterogeneous architectures and is widely adopted in supercomputing centers worldwide.
Europe has made progress in HPC software development through programs like EuroHPC and
PEPR NumPEx, and needs to ensure that a production-ready software stack is available for the Exascale
architectures that will be deployed in member states. Although the Kokkos ecosystem is mature, it
lacks several key features needed to fully address the needs of the European computing communities.
Porting legacy codes with complex data structures remains a significant challenge, and although
Kokkos is well-suited for GPUs, its use relies on advanced meta-programming, making its adoption
challenging for some scientists.

Assigned mission

Unstructured and high-dimensional meshes pose challenges for GPU optimization due to
irregular memory access, load imbalance, and inefficient parallelism. Techniques like Reverse
Cuthill-McKee (RCM) reordering, optimal loop ordering, and hierarchical memory use aim to improve performance.
Adaptive mesh partitioning based on connectivity strength also helps reduce load imbalance
in domain decomposition. However, these strategies depend heavily on mesh topology,
numerical methods, and hardware, so no one-size-fits-all solution exists. Profiling and adaptive tuning
are essential to find optimal configurations. Libraries like GMlib, OP2, and TNL offer support
for unstructured meshes on GPUs but lack tools for selecting the best optimization strategies.
Future work should focus on auto-tuning frameworks integrated with portability layers like Kokkos to
provide scalable, efficient solutions for Exascale computing.
The KOKTAILS project will address these limitations by:
- Extending Kokkos with enhanced support for European architectures,
ensuring its applicability in the French and European HPC landscape,
- Improving data structures in the Kokkos ecosystem to support specific meshes
required in key French and European applications,
- Improving automatic code translation and transformation tools to facilitate the migration of
legacy scientific codes to modern GPU-optimized frameworks such as the Kokkos ecosystem,
- Addressing challenges in Python-Kokkos interoperability, enabling domain-specific
scientists to leverage Kokkos through a Python interface and also enabling seamless
integration of Python codes and AI models into C++ HPC codes for efficient execution on heterogeneous architectures.
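
As an illustration of the Reverse Cuthill-McKee reordering mentioned in the mission description, the following is a minimal serial sketch in standard C++. It is not part of Kokkos or of the KOKTAILS project: the `Graph` alias and the function names are hypothetical, and the start vertex is simply the unvisited vertex of lowest degree, whereas production implementations usually search for a pseudo-peripheral vertex.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical representation: adjacency lists of an undirected mesh graph.
using Graph = std::vector<std::vector<std::size_t>>;

// Returns a permutation `order`, where order[k] is the old index of the
// vertex placed at new position k.
std::vector<std::size_t> reverse_cuthill_mckee(const Graph& g) {
    const std::size_t n = g.size();
    std::vector<bool> visited(n, false);
    std::vector<std::size_t> order;
    order.reserve(n);

    // Candidate start vertices sorted by increasing degree (simple heuristic).
    std::vector<std::size_t> by_degree(n);
    for (std::size_t i = 0; i < n; ++i) by_degree[i] = i;
    auto lower_degree = [&g](std::size_t a, std::size_t b) {
        return g[a].size() < g[b].size();
    };
    std::stable_sort(by_degree.begin(), by_degree.end(), lower_degree);

    // One breadth-first traversal per connected component.
    for (std::size_t start : by_degree) {
        if (visited[start]) continue;
        std::queue<std::size_t> frontier;
        frontier.push(start);
        visited[start] = true;
        while (!frontier.empty()) {
            const std::size_t v = frontier.front();
            frontier.pop();
            order.push_back(v);
            // Enqueue unvisited neighbours by increasing degree.
            std::vector<std::size_t> next;
            for (std::size_t w : g[v]) {
                if (!visited[w]) {
                    visited[w] = true;
                    next.push_back(w);
                }
            }
            std::stable_sort(next.begin(), next.end(), lower_degree);
            for (std::size_t w : next) frontier.push(w);
        }
    }
    // Reversing the Cuthill-McKee order gives the RCM order.
    std::reverse(order.begin(), order.end());
    return order;
}

// Bandwidth of the graph under a numbering: max |new(v) - new(w)| over edges.
std::size_t bandwidth(const Graph& g, const std::vector<std::size_t>& order) {
    std::vector<std::size_t> new_index(order.size());
    for (std::size_t k = 0; k < order.size(); ++k) new_index[order[k]] = k;
    std::size_t bw = 0;
    for (std::size_t v = 0; v < g.size(); ++v) {
        for (std::size_t w : g[v]) {
            const std::size_t a = new_index[v];
            const std::size_t b = new_index[w];
            bw = std::max(bw, a > b ? a - b : b - a);
        }
    }
    return bw;
}
```

For a five-vertex path whose vertices are numbered 0-3-1-4-2 along the path, the original numbering has bandwidth 3, while the numbering returned by `reverse_cuthill_mckee` has bandwidth 1, so neighbouring vertices end up adjacent in memory.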

Main activities

Efficient mesh management is crucial for many scientific applications. We propose to develop optimized Kokkos
data structures for high-dimensional or unstructured meshes. These data structures aim to reduce
computational costs by leveraging optimized memory management for modern GPU-based architectures.
The innovation lies in designing mesh data structures that are both portable and adaptable
to the specific constraints of Exascale architectures, ensuring scalability and optimal efficiency.
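
To give a rough idea of what a GPU-friendly unstructured mesh data structure can look like, here is a minimal sketch of a flat, CSR-style cell-to-vertex connectivity in standard C++. It is only an assumption about one possible design, not the structure the project will deliver: `CsrConnectivity` and `build_csr` are hypothetical names, and a Kokkos version would presumably store the two arrays in device-accessible views rather than `std::vector`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat connectivity for a mesh with mixed cell types.
// Two contiguous arrays replace a vector of per-cell vectors, which avoids
// pointer chasing and keeps memory accesses regular.
struct CsrConnectivity {
    std::vector<std::size_t> offsets;   // size = number of cells + 1
    std::vector<std::size_t> vertices;  // vertex ids of all cells, concatenated

    std::size_t num_cells() const {
        return offsets.empty() ? 0 : offsets.size() - 1;
    }
    // Number of vertices of cell c.
    std::size_t cell_size(std::size_t c) const {
        return offsets[c + 1] - offsets[c];
    }
    // k-th vertex of cell c.
    std::size_t vertex(std::size_t c, std::size_t k) const {
        return vertices[offsets[c] + k];
    }
};

// Builds the flat representation from a nested list of cells.
CsrConnectivity build_csr(const std::vector<std::vector<std::size_t>>& cells) {
    CsrConnectivity csr;
    csr.offsets.reserve(cells.size() + 1);
    csr.offsets.push_back(0);
    for (const auto& cell : cells) {
        csr.vertices.insert(csr.vertices.end(), cell.begin(), cell.end());
        csr.offsets.push_back(csr.vertices.size());
    }
    return csr;
}
```

With one triangle `{0, 1, 2}` and one quadrilateral `{1, 3, 4, 2}`, `offsets` becomes `{0, 3, 7}` and `vertices` holds the seven ids back to back, so cells of different sizes share the same storage.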

- Some scientific applications (plasma physics, quantum simulations, turbulence modeling)
require 6D/7D data structures. Extend Kokkos views to support such
high-dimensional data while preserving performance portability. Key efforts include native support for
6D/7D views with optimized memory layout and indexing for GPUs, improved memory access for
efficiency across architectures, and validation through benchmarks and demonstrators. These
enhancements will benefit Exascale-targeted scientific codes.
- Develop a flexible API for optimizing unstructured mesh algorithms on GPUs. It will
support both static and dynamic strategies, including mesh reordering (RCM, Morton, Hilbert) for
better cache locality, loop restructuring for optimized data access, hierarchical parallelism using
shared memory and registers, load balancing via connectivity-aware mesh partitioning, and race
condition management through partition coloring and atomics. The API will allow switching
between strategies based on code-specific patterns to maximize GPU efficiency.
- Create a Kokkos-based library for unstructured mesh processing.
It will offer predefined mesh structures (e.g., edge shells, ball of points), parallel
execution schemes for vectorized operations and efficient memory use, and multi-architecture
support (AMD, Intel, NVIDIA) via Kokkos backends (CUDA, HIP, SYCL, OpenMP). The library,
building on work from Exa-DI (PEPR NumPEx), will provide a scalable, portable solution for scientific
code adaptation to GPU-based Exascale systems.
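
The memory layout and indexing question raised for 6D/7D data can be sketched as follows. This is a plain C++ illustration under stated assumptions, not Kokkos code: `Extents`, `strides_left`, `strides_right` and `linear_index` are hypothetical names, and the two stride conventions merely follow the spirit of Kokkos's `LayoutLeft` (first index fastest) and `LayoutRight` (last index fastest).

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kRank = 6;  // rank chosen for illustration
using Extents = std::array<std::size_t, kRank>;

// Strides for a layout where the first index varies fastest (column-major).
Extents strides_left(const Extents& n) {
    Extents s{};
    std::size_t acc = 1;
    for (std::size_t d = 0; d < kRank; ++d) {
        s[d] = acc;
        acc *= n[d];
    }
    return s;
}

// Strides for a layout where the last index varies fastest (row-major).
Extents strides_right(const Extents& n) {
    Extents s{};
    std::size_t acc = 1;
    for (std::size_t d = kRank; d-- > 0;) {
        s[d] = acc;
        acc *= n[d];
    }
    return s;
}

// Offset of a 6D index in the flat array for the given strides.
std::size_t linear_index(const Extents& idx, const Extents& strides) {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < kRank; ++d) offset += idx[d] * strides[d];
    return offset;
}
```

For extents (2, 3, 4, 5, 6, 7), the left layout gives strides (1, 2, 6, 24, 120, 720) and the right layout gives (2520, 840, 210, 42, 7, 1). Which index is contiguous in memory determines whether neighbouring GPU threads access neighbouring addresses, which is why the layout choice matters for performance portability.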

Skills

Strong scientific programming skills, particularly in modern C++.
Experience in parallel computing, including one or more of the following models:
MPI, OpenMP, CUDA, HIP, SYCL, or accelerated scientific libraries like Kokkos and RAJA.
Knowledge of modern HPC architectures, including:
GPU systems (NVIDIA, AMD), many-core architectures and complex memory hierarchies. Performance optimization and portability.
Understanding of numerical methods for PDEs (Finite Volumes, Finite Elements, implicit solvers)
and their efficient implementation on parallel architectures.

Benefits

- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage

Remuneration

Remuneration: according to professional experience
