Kubernetes AI Jobs

Discover the latest remote and onsite Kubernetes AI roles across top active AI companies. Updated hourly.

Check out 301 new Kubernetes AI role opportunities posted on AI Chopping Block.

Full Stack Software Engineer, Codex

New
Top rated
OpenAI
Full-time

Build end-to-end product experiences that span frontend applications, backend services, agent workflows, cloud infrastructure, and developer tooling. Design AI-powered workflows that generalize across a wide variety of software engineering teams, languages, codebases, and development practices. Discover and implement novel ways to apply AI to eliminate friction throughout the software development lifecycle. Partner closely with product, design, and research to understand developer needs and rapidly translate insights into shipped product improvements. Work directly with users—including developers at OpenAI, open-source contributors, startups, and large enterprises—to understand pain points and validate solutions. Improve the reliability, observability, scalability, and performance of the systems and workflows you build.

$255,000 – $405,000 per year (USD)

San Francisco, United States
Maybe global
Remote
TypeScript
Python
Docker
Kubernetes
CI/CD

AI Field Engineer - Enterprise

New
Top rated
Fireworks AI
Full-time

AI Field Engineers at Fireworks embed with customers and technology partners to turn complex AI problems into production systems quickly. Responsibilities include building POCs, MVPs, and production integrations; shipping code; running benchmarks; debugging production issues; and architecting deployments. They lead discovery conversations, align stakeholders, and translate customer pain points into product improvements. Engineers spend most of their time on-site with customers, building relationships and trust in person. They work specifically on technical delivery and deployment by building end-to-end POCs and MVPs inside customer codebases, architecting inference foundations, running load tests, tuning deployments, and deploying new model families on inference frameworks. They guide customers on model selection and fine-tuning strategies, build and run fine-tuning pipelines, and design evaluation frameworks. They engage in structured discovery conversations, own technical relationships from engagement to deployment, and spend time on-site embedded with customer teams. Finally, they identify recurring customer pain points, propose product improvements, codify deployment patterns, and feed customer signals back into the product roadmap.

$200,000 – $260,000 per year (USD)

New York or San Mateo, United States
Maybe global
Hybrid
Python
Kubernetes
AWS
Azure
GCP

Member of Technical Staff

New
Top rated
Fireworks AI
Full-time

AI Field Engineers at Fireworks embed with customers and technology partners to turn complex AI problems into production systems. They build POCs, MVPs, and production integrations, ship code, run benchmarks, debug production issues, and architect deployments. They also lead discovery conversations, align stakeholders, and translate customer pain points into product improvements. The role involves spending time on-site with customers to build relationships and trust. Responsibilities include building end-to-end POCs and MVPs with customer engineering teams, architecting inference foundations and sizing deployments for GenAI core products, running load tests to establish performance baselines, tuning deployments, deploying and validating new model families, guiding customers on model selection and fine-tuning strategies, building fine-tuning pipelines, designing evaluation frameworks, leading discovery conversations, owning technical relationships from first engagement to production deployment, and feeding customer signals back into the product roadmap. They also codify repeatable deployment patterns and contribute to internal tooling, documentation, and platform improvements.

$200,000 – $260,000 per year (USD)

New York, United States
Maybe global
Hybrid
Python
Kubernetes
AWS
Azure
GCP

AI Field Engineer - Microsoft Foundry

New
Top rated
Fireworks AI
Full-time

AI Field Engineers at Fireworks embed with customers and technology partners to turn complex AI problems into production systems quickly. They build POCs, MVPs, and production integrations, and participate in executive-level discussions about architecture, strategy, and business outcomes. Responsibilities include shipping code, running benchmarks, debugging production issues, architecting deployments, leading discovery conversations, aligning stakeholders, and translating customer pain points into product improvements. They work on technical delivery and deployment by building end-to-end POCs and MVPs inside customer codebases and infrastructure, architecting inference foundations, sizing deployments for scale, running load tests, and tuning deployments to meet latency, throughput, and cost targets. They deploy and validate new model families on inference frameworks, determining optimal configurations and serving patterns. They guide customers in model selection, fine-tuning strategy, and evaluation methodology, build and run fine-tuning pipelines, and design evaluation frameworks for production metrics. They also manage customer engagement by leading discovery conversations, owning the technical relationship, embedding with customer engineering teams on-site, and building trust in person. Lastly, they provide product feedback by identifying recurring pain points, proposing product improvements, codifying deployment patterns, contributing to internal tooling and documentation, and feeding customer signals back into the product roadmap with specificity and urgency.

$200,000 – $260,000 per year (USD)

San Mateo, United States
Maybe global
Onsite
Python
Kubernetes
AWS
Azure
GCP

Director, Revenue Strategy & Analytics

New
Top rated
Fireworks AI
Full-time

As an AI Field Engineer, responsibilities include embedding with customers and technology partners to convert complex AI problems into production systems quickly. The role involves hands-on development by building proofs of concept (POCs), minimum viable products (MVPs), and production integrations. Duties comprise shipping code, running benchmarks, debugging production issues, and architecting deployments. Leading discovery conversations, aligning stakeholders, and translating customer pain points into product improvements are part of the role. Specifically, the engineer builds end-to-end POCs and MVPs inside customer codebases and infrastructure, architects inference foundations for GenAI core products, sizes scalable deployments, runs load tests to establish performance baselines, tunes deployments, and deploys models on inference frameworks while optimizing configurations. The role also includes guiding customers on model selection and fine-tuning strategies, building fine-tuning pipelines, designing evaluation frameworks, and leading engagements to embed deeply with customer teams. Field Engineers spend time on-site to build trust, identify recurring customer pain points, translate these into product proposals, codify deployment patterns to contribute back to internal tooling and platform improvements, and feed customer feedback into the product roadmap with specificity and urgency.

$200,000 – $260,000 per year (USD)

San Mateo, United States
Maybe global
Hybrid
Python
Kubernetes
AWS
Azure
GCP

Paid Growth Marketer

New
Top rated
Fireworks AI
Full-time

AI Field Engineers at Fireworks embed with ambitious customers and technology partners to turn complex AI problems into production systems quickly. They build proofs of concept (POCs), MVPs, and production integrations by shipping code, running benchmarks, debugging production issues, and architecting deployments. They lead discovery conversations, align stakeholders, and translate customer pain points into product improvements, compressing the feedback loop from field to roadmap. The role involves being on-site with customers to build strong relationships and trust. Responsibilities include building end-to-end POCs and MVPs alongside customer engineering teams within their codebases and infrastructure; architecting inference foundations for GenAI core products and sizing deployments for scalability; running load tests and tuning deployments for latency, throughput, and cost targets; deploying and validating new model families on inference frameworks, optimizing shapes, quantization, and serving patterns; guiding customers on model selection, fine-tuning strategies, and evaluation methodologies; building and running fine-tuning pipelines while balancing model families, compute cost, and quality targets; designing evaluation frameworks that measure production-quality metrics; leading structured discovery conversations to understand customer pain points and proposing solutions; owning the technical relationship from first engagement through deployment; spending time on-site embedding with customers; identifying recurring customer pain points and translating them into product proposals; codifying repeatable deployment patterns and contributing to internal tooling and documentation; and feeding back customer signals into the product roadmap with specificity and urgency.

$200,000 – $260,000 per year (USD)

San Mateo, United States
Maybe global
Hybrid
Python
Kubernetes
AWS
Azure
GCP

Member of Technical Staff (Machine Learning Engineer)

New
Top rated
Reka
Full-time

Translate cutting-edge research into production-ready machine learning systems. Design, build, and deploy end-to-end ML models and pipelines. Develop and optimize models for image and video processing. Own the full ML lifecycle including experimentation, training/fine-tuning, evaluation, and deployment. Rapidly prototype using open-source models and adapt them for product needs. Conduct experiments, analyze results, and iterate to improve performance. Collaborate with researchers and cross-functional teams (product, engineering, design) to deliver ML solutions at scale. Stay current with advancements in machine learning and apply them to continuously improve products.

Undisclosed

Maybe global
Remote
Python
Java
C++
PyTorch
TensorFlow

Senior AI Agent Engineer (Intelligence Service)

New
Top rated
42dot
Full-time

The Senior AI Agent Engineer on the Intelligence Service team is responsible for designing and refining the RAG-based agent flow of an interactive knowledge agent, covering the process from query understanding to planning, tool routing, retrieval, and response generation. They optimize multi-turn conversation understanding and retrieval linkage; implement response quality control logic, including grounding, answer verification, guardrails, and fallback mechanisms to defend against hallucination; and establish evaluation harnesses, regression testing, and A/B testing systems for answer quality in terms of faithfulness and relevancy. They also build backend infrastructure necessary for production operations such as API contracts, caching, configuration/prompt registry, and admin APIs. Furthermore, they analyze and improve response quality, latency, and failure cases through operational logs and quality metrics. The role includes leading design reviews and technical decision-making within the team, connecting complex problems to reusable system improvements as a senior technical pillar of the team.

Undisclosed

Pangyo, South Korea
Maybe global
Remote
Python
RAG
LangChain
OpenAI API
MLOps

Manager, Deployment Engineering

New
Top rated
Armada
Full-time

The responsibilities include translating business requirements into requirements for AI/ML models, preparing data to train and evaluate AI/ML/DL models, building AI/ML/DL models using state-of-the-art algorithms especially transformers, testing and evaluating the AI/ML/DL models, publishing the models, datasets, and evaluations, deploying models in production by containerizing them, working with customers and internal employees to refine model quality, establishing continuous learning pipelines for models with online or transfer learning, and building and deploying containerized applications on cloud or on-premise environments.

$154,560 – $193,200 per year (USD)

Bellevue
Maybe global
Remote
Python
Java
C++
PyTorch
TensorFlow

TLM, Integrity

New
Top rated
OpenAI
Full-time

Architect and build next-generation system protections through hands-on design, model training, and deployment strategies. Lead and manage a small, senior team of Engineers, providing clear direction and autonomy. Collaborate with Research, Safety, Product, and Policy teams to use existing tools and advance new solutions. Utilize state-of-the-art models to detect and prevent problematic content. Establish evaluation frameworks and metrics to measure progress and identify improvement areas. Support team growth and maintain high performance through mentorship and career guidance.

$347,000 – $490,000 per year (USD)

San Francisco, United States
Maybe global
Onsite
Python
Model Evaluation
MLOps
Docker
Kubernetes

Want to see more AI Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Need help with something? Here are our most frequently asked questions.

What are Kubernetes AI jobs?

Kubernetes AI jobs involve orchestrating containerized machine learning applications at scale. Professionals in these roles manage container deployment for AI workloads, distribute computational tasks across nodes for model training, allocate GPU resources efficiently, and automate ML pipelines. They typically work with frameworks like TensorFlow and PyTorch while ensuring high availability for production AI systems through automated scaling and self-healing capabilities.

What roles commonly require Kubernetes skills?

Roles requiring Kubernetes skills include Machine Learning Engineers who deploy models to production, MLOps Engineers working with platforms like Kubeflow, Data Engineers managing processing pipelines, Platform Engineers supporting agentic AI applications, DevOps/SRE professionals handling containerized deployments, and Cloud Architects designing scalable environments. These positions typically involve maintaining infrastructure that supports the complete machine learning lifecycle.

What skills are typically required alongside Kubernetes?

Alongside Kubernetes, employers typically look for container fundamentals (especially Docker), distributed systems knowledge, CI/CD pipeline experience, and cloud platform familiarity. Programming skills are essential for deployment scripts, while experience with ML frameworks like TensorFlow or PyTorch is valuable for AI-specific implementations. Understanding storage solutions, Kubernetes operators, and automated infrastructure management rounds out the typical skill requirements.

What experience level do Kubernetes AI jobs usually require?

Kubernetes AI jobs typically require mid to senior-level experience. Employers look for professionals who understand containerization concepts, have worked with distributed systems, and can manage complex ML workflows. Prior exposure to cloud environments where Kubernetes runs is important. Candidates should demonstrate practical experience with CI/CD pipelines and familiarity with at least one major ML framework.

What is the salary range for Kubernetes AI jobs?

Kubernetes AI jobs command competitive salaries due to the specialized intersection of container orchestration and machine learning skills. Compensation varies based on experience level, location, and specific industry. Roles requiring both strong AI expertise and Kubernetes infrastructure management typically offer premium compensation compared to general software engineering positions, reflecting the high market value of these combined skill sets.

Are Kubernetes AI jobs in demand?

Kubernetes AI jobs are in high demand as organizations increasingly adopt containerized applications for machine learning workloads. The growth is driven by enterprises scaling their AI operations, edge computing applications, and the need for platform-agnostic infrastructure. Companies seek professionals who can manage the complexity of distributed ML systems, particularly for high-availability production environments and automated ML pipelines.

What is the difference between Kubernetes and Docker in AI roles?

Docker creates containerized applications while Kubernetes orchestrates those containers at scale. In AI roles, Docker is used to package ML applications with their dependencies, while Kubernetes manages deployment across clusters, automates scaling during training, and handles resource allocation for GPUs. Docker provides consistency between environments, while Kubernetes adds critical production capabilities like load balancing, self-healing, and distributed computing for AI workloads.