Location
New York, United States
Salary
$150,000 – $350,000 (Yearly)
Date posted
April 21, 2026
Job type
Full-time
Experience level
Mid Level 5+

Job Description

About Us:

Modal provides the infrastructure foundation for AI teams. With instant GPU access, sub-second container startups, and native storage, Modal makes it simple to train models, run batch jobs, and serve low-latency inference. We have thousands of customers who rely on us for production AI workloads, including Lovable, Scale AI, Substack, and Suno.

We're a fast-growing team based out of NYC, SF, and Stockholm. We've hit 9-figure ARR and recently raised a Series B at a $1.1B valuation. Our investors include Lux Capital, Redpoint Ventures, Amplify Partners, and Elad Gil.

Working at Modal means joining one of the fastest-growing AI infrastructure organizations at an early stage, with many opportunities to grow within the company. Our team includes creators of popular open-source projects (e.g. Seaborn, Luigi), academic researchers, international olympiad medalists, and engineering and product leaders with decades of experience.

The Role

We are looking for strong engineers with experience in making ML systems performant at scale. If you are interested in contributing to open-source projects and Modal’s container runtime to push language and diffusion models towards higher throughput and lower latency, we’d love to hear from you!

Requirements

  • 5+ years of experience writing high-quality, high-performance code.

  • Experience working with torch, high-level ML frameworks, and inference engines (vLLM or TensorRT).

  • Familiarity with NVIDIA GPU architecture and CUDA.

  • Experience with ML performance engineering (tell us a story about boosting GPU performance: debugging SM occupancy issues, rewriting an algorithm to be compute-bound, eliminating host overhead, etc.).

  • Nice-to-have: familiarity with low-level operating system foundations (Linux kernel, file systems, containers, etc.).

Modal is hiring a Member of Technical Staff - ML Performance. Apply through The AI Chopping Block and make the next move in your career!
Company size
51-100 employees
Founded in
Headquarters
San Francisco, CA, United States
Country
United States
Industry
Computer Software
