Location
Paris, France
Salary
Undisclosed
Category
AI Engineer
Date posted
May 1, 2026
Job type
Full-time
Experience level
Senior 8+

Job Description

What You’ll Do

  • Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics

  • Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient resource utilization

  • Implement efficient low-level code (CUDA, Triton, custom kernels) and integrate it seamlessly into high-level frameworks

  • Optimize workloads for both throughput (batching, scheduling, quantization) and latency (caching, memory management, graph compilation)

  • Develop monitoring and debugging tools to guarantee reliability, determinism, and rapid diagnosis of regressions across both stacks

What You’ll Bring

  • Deep experience in distributed systems, ML infrastructure, or high-performance serving (8+ years)

  • Production-grade expertise in Python, with a strong background in systems languages (C++/Rust/Go)

  • Low-level performance mastery: CUDA, Triton, kernel optimization, quantization, memory and compute scheduling

  • Proven track record scaling inference workloads in both throughput-oriented cluster environments and latency-critical on-device deployments

  • System-level mindset with a history of tuning hardware–software interactions for maximum efficiency, throughput, and responsiveness

Apply now
Genesis AI is hiring a Member of Technical Staff, Inference (Paris, London). Apply through The AI Chopping Block and make the next move in your career!
Company size
11-50
employees
Founded in
Headquarters
Country
Industry
Research

Similar AI jobs

Here are other jobs you might want to apply for.

South Africa

Freelance Agent Evaluation Engineer

Part-time
AI Engineer
United States

Software Engineer, Data Infrastructure

Full-time
AI Engineer

Research Intern RL & Post-Training Systems, Turbo (Fall 2026)

Full-time
AI Engineer
United States

Forward Deployed AI Engineer - Enterprise Lead

Full-time
AI Engineer
United States

Forward Deployed AI Engineer - Lead

Full-time
AI Engineer
United States

AI Engineer, Evaluation

Full-time
AI Engineer