Research Engineer in Operating Systems (M/F) - INRIA

  • Nice - 06
  • Permanent contract (CDI)
  • INRIA
Published on 30 October 2025

Job description

About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on many talents in more than forty different professions. 900 research and innovation support staff help scientific and entrepreneurial projects with worldwide impact to emerge and grow. Inria works with many companies and has supported the creation of more than 200 start-ups. The institute thus strives to meet the challenges of the digital transformation of science, society and the economy.

Research Engineer in Operating Systems
The job description below is in English
Renewable contract: Yes

Required degree level: Master's degree (Bac+5) or equivalent

Position: Contract research engineer

Context and advantages of the position

Modern multi-core servers rely on Non-Uniform Memory Access (NUMA) architectures, where performance is highly dependent on data locality. Operating systems have evolved mechanisms like AutoNUMA to migrate memory pages closer to the threads that access them. However, significant inefficiencies persist, particularly regarding the handling of transparent huge pages (THP).

A critical and well-known problem is false huge page sharing. With THP enabled by default, a single 2 MB huge page is often accessed by threads running on different NUMA nodes. These concurrent accesses to different parts of the same page cause the entire page to migrate repeatedly between nodes, creating heavy contention and degrading performance.
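As a rough illustration of the condition described above, the following C sketch checks whether two accesses fall into different 4 KB regions of the same 2 MB huge page while originating from different NUMA nodes. The helper names (`huge_page_index`, `base_page_index`, `is_false_huge_page_sharing`) and the use of plain integer addresses are assumptions made for this example only; they are not part of the project or of any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BASE_PAGE_SIZE (4ULL * 1024)         /* regular 4 KB page */
#define HUGE_PAGE_SIZE (2ULL * 1024 * 1024)  /* 2 MB transparent huge page */

/* A huge page must cover a whole number of regular pages (512 here). */
static_assert(HUGE_PAGE_SIZE % BASE_PAGE_SIZE == 0, "huge page must be a multiple of base page");

/* Index of the 2 MB huge page containing addr. */
static inline uint64_t huge_page_index(uint64_t addr)
{
    return addr / HUGE_PAGE_SIZE;
}

/* Index of the 4 KB page containing addr. */
static inline uint64_t base_page_index(uint64_t addr)
{
    return addr / BASE_PAGE_SIZE;
}

/*
 * Hypothetical predicate: two accesses from different NUMA nodes hit the
 * same huge page but different 4 KB regions inside it. The data is not
 * actually shared, yet the whole 2 MB page is a single migration unit.
 */
static inline bool is_false_huge_page_sharing(uint64_t addr_a, int node_a,
                                              uint64_t addr_b, int node_b)
{
    return node_a != node_b
        && huge_page_index(addr_a) == huge_page_index(addr_b)
        && base_page_index(addr_a) != base_page_index(addr_b);
}
```

If the two addresses fell into the same 4 KB page, the sharing would be genuine and splitting the huge page would not separate the two accesses.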

Current OS solutions like AutoNUMA are ill-equipped to handle this specific scenario. AutoNUMA's monitoring mechanism, which relies on randomly protecting pages and faulting, is too coarse-grained to accurately capture these complex access patterns. Furthermore, its default remedy is to move an entire page, whereas the correct solution for false huge page sharing is to split the huge page into smaller, regular-sized pages that can be managed independently.

Assignment

We are seeking a research engineer to design and implement a novel, fine-grained memory management system to solve this problem. The core idea is to leverage Intel Memory Protection Keys (MPK), a hardware feature that allows for more precise, low-overhead access monitoring.
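To give a feel for how MPK expresses permissions, the sketch below computes a value for the per-thread PKRU register, which holds two bits for each of the 16 protection keys: an access-disable bit and a write-disable bit. The mapping of one key per NUMA node (key 0 left as the default key, node n using key n + 1) and the function names `node_to_pkey` and `pkru_for_node` are assumptions made for illustration; the sketch only computes the value and does not write the register.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_COUNT 16u                            /* keys available in PKRU */
#define PKRU_AD(key) (1u << (2u * (key)))         /* access-disable bit */
#define PKRU_WD(key) (1u << (2u * (key) + 1u))    /* write-disable bit */

/* Hypothetical mapping: key 0 stays the default key, node n uses key n + 1. */
static inline unsigned node_to_pkey(unsigned node)
{
    assert(node + 1u < PKEY_COUNT);
    return node + 1u;
}

/*
 * PKRU value for a thread running on `node`: access is allowed to the
 * default key and to the thread's own node key, and disabled for the
 * keys of every other node, so that a remote access raises a fault.
 */
static inline uint32_t pkru_for_node(unsigned node, unsigned nr_nodes)
{
    uint32_t pkru = 0;

    for (unsigned n = 0; n < nr_nodes; n++) {
        if (n != node)
            pkru |= PKRU_AD(node_to_pkey(n));
    }
    return pkru;
}
```

With 16 keys and key 0 reserved, this particular mapping supports at most 15 NUMA nodes, which is one of the constraints a real design would have to address.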

Your mission will be to build a prototype that detects and resolves false huge page sharing. You will begin by designing and implementing a monitoring system that uses Intel MPK to enforce memory locality, associating each thread's protection-key rights register (PKRU) with its current NUMA node ID. This will require kernel integration, likely through Linux kernel patches or eBPF, to manage and update these keys during context switches. You will then create the necessary access control logic: when a page protected with one node's key is accessed by a thread holding a different node's key, a fault occurs, allowing your system to log the event. Based on these access patterns, you will develop a decision engine that, when a huge page migrates frequently, splits it into 4 KB pages. Finally, you will be responsible for a thorough evaluation, profiling and benchmarking the new system against the standard Linux AutoNUMA implementation on various workloads to quantify performance gains and overhead.
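The decision logic outlined above could take many forms. The following is a minimal sketch of one possibility, assuming per-huge-page bookkeeping of the current home node and a migration counter. The structure `huge_page_stats`, the function `on_cross_node_fault` and the threshold `MIGRATION_LIMIT` are hypothetical names and values chosen for this example, not part of the project specification.

```c
#include <assert.h>
#include <stdint.h>

#define MIGRATION_LIMIT 3u  /* hypothetical threshold before splitting */

enum page_action { ACTION_NONE, ACTION_MIGRATE, ACTION_SPLIT };

/* Hypothetical per-huge-page bookkeeping. */
struct huge_page_stats {
    unsigned home_node;   /* node whose key currently protects the page */
    uint32_t migrations;  /* migrations performed so far */
};

/*
 * Called when a thread on `faulting_node` triggers a protection-key fault
 * on the page. A local access needs no action; a remote access migrates
 * the page until it has bounced MIGRATION_LIMIT times, after which the
 * page is marked for splitting into 4 KB pages.
 */
static inline enum page_action on_cross_node_fault(struct huge_page_stats *s,
                                                   unsigned faulting_node)
{
    assert(s != 0);

    if (faulting_node == s->home_node)
        return ACTION_NONE;
    if (s->migrations >= MIGRATION_LIMIT)
        return ACTION_SPLIT;

    s->migrations++;
    s->home_node = faulting_node;
    return ACTION_MIGRATE;
}
```

A real engine would probably also age the counter over time, so that a page that bounced a few times long ago is not split needlessly; that is left out here for brevity.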

Main activities

We are looking for a candidate with a strong background in systems programming and operating systems. An M.S. or Ph.D. in Computer Science, or equivalent engineering experience, is required. The ideal candidate will possess excellent C programming skills and a deep understanding of OS internals, particularly memory management and process scheduling. Demonstrable experience with Linux kernel development, such as creating modules or applying patches, and/or experience with eBPF is essential. This role also requires strong knowledge of computer architecture, including NUMA systems and memory/cache hierarchies, as well as autonomy, strong problem-solving skills, and a proactive, research-oriented mindset.

Familiarity with system performance analysis and profiling tools, such as perf, would be a significant advantage. Prior experience with hardware performance features like Intel MPK or TSX, or with high-performance computing (HPC) applications and benchmarks, is also desired but not mandatory.

Benefits

- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
