Docker AI Jobs

Discover the latest remote and onsite Docker AI roles across top active AI companies. Updated hourly.

Check out 252 new Docker AI roles posted on AI Chopping Block

Senior Platform/DevOps Engineer (Kubernetes-Linux)

New
Top rated
Armada
Full-time
Posted

Translate business requirements into requirements for AI/ML models. Prepare data to train and evaluate AI/ML/DL models. Build AI/ML/DL models by applying state-of-the-art algorithms, especially transformers. Leverage existing algorithms from academic or industrial research when applicable. Test, evaluate, and benchmark AI/ML/DL models, and publish the models, data sets, and evaluations. Deploy models in production by containerizing them. Work with customers and internal employees to refine model quality. Establish continuous learning pipelines for models using online or transfer learning. Build and deploy containerized applications on cloud or on-premise environments.
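To make the "deploy models in production by containerizing them" part concrete, here is a minimal, hedged Python sketch of a model-serving endpoint of the kind typically packaged into a Docker image; FastAPI, the /predict route, and the model.joblib artifact are illustrative assumptions, not Armada's actual stack.

```python
# Minimal sketch of a containerized model-serving endpoint, assuming a
# scikit-learn style model saved as model.joblib; all names are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("model.joblib")  # hypothetical artifact baked into the image

class Features(BaseModel):
    values: list[float]

@app.post("/predict")
def predict(features: Features):
    # Run inference on a single feature vector and return the prediction.
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()}
```

In practice a service like this would be started with uvicorn inside the container and built from a Dockerfile based on a Python base image.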

$154,560 – $193,200 per year (USD)

Bellevue, United States
Maybe global
Onsite
Python
Java
C++
Docker
Kubernetes

Software Engineer - Tools & Automation

New
Top rated
Zoox
Full-time
Posted

As a Software Engineer and member of the Platform Stability team, you will help build, fine-tune, and maintain a novel AI-powered tool for diagnosing technical issues and identifying root causes. You will collaborate cross-functionally to gather requirements, develop AI/ML and analytical models, and drive data-driven insights as part of a high-performing team. Responsibilities include designing and implementing agentic AI systems with structured interfaces, reasoning loops, and robust error handling; building and maintaining data pipelines, scheduled workflows, and benchmarking infrastructure; developing evaluation and scoring systems to measure and improve model output quality; integrating the platform with internal and external services such as ticketing, messaging, storage, and observability; collaborating with cross-functional teams to translate business requirements into technical AI solutions; and architecting and maintaining production-grade AI solutions with a focus on scalability, reliability, and performance.
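As a rough illustration of "agentic AI systems with structured interfaces, reasoning loops, and robust error handling", the sketch below shows one possible loop; the call_model callable and the lookup_ticket tool are placeholders for illustration, not Zoox internals.

```python
# Minimal sketch of an agent reasoning loop with a structured tool interface
# and basic error handling; `call_model` stands in for any LLM client.
import json
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_ticket": lambda ticket_id: f"Ticket {ticket_id}: service degraded",
}

def run_agent(task: str, call_model: Callable[[str], str], max_steps: int = 5) -> str:
    context = f"Task: {task}"
    for _ in range(max_steps):
        # The model is asked to reply with JSON: either a tool call or a final answer.
        reply = call_model(context)
        try:
            action = json.loads(reply)
        except json.JSONDecodeError:
            context += "\nError: reply was not valid JSON, please retry."
            continue
        if action.get("type") == "final":
            return action.get("answer", "")
        tool = TOOLS.get(action.get("tool", ""))
        if tool is None:
            context += f"\nError: unknown tool {action.get('tool')!r}."
            continue
        context += "\nObservation: " + tool(action.get("input", ""))
    return "Gave up after max_steps."
```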

$184,000 – $231,000 per year (USD)

Foster City, United States
Maybe global
Onsite
Python
Prompt Engineering
Model Evaluation
Data Pipelines
MLOps

Tech Lead Manager, Data Infrastructure

New
Top rated
Cartesia
Full-time
Posted

The Tech Lead Manager, Data Infrastructure at Cartesia is responsible for defining the overall multi-modal data strategy across pre-training and post-training, including human, synthetic, and web-scale data sources. They lead, manage, and mentor a team of data engineers and specialists. They design and oversee the construction of robust, scalable data pipelines for text, audio, and video, establish and enforce rigorous standards for data quality across the organization, deeply understand how data affects model capability and proactively identify and source novel datasets, and manage relationships and budgets with external data vendors and partners.
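The sketch below illustrates, in broad strokes, what "rigorous standards for data quality" can look like in code for one text pipeline stage; the record fields, accepted sources, and thresholds are assumptions for illustration only.

```python
# Illustrative sketch of one pipeline stage enforcing simple data-quality rules
# on text records before they reach training; not Cartesia's actual standards.
from dataclasses import dataclass

@dataclass
class TextRecord:
    source: str      # e.g. "web", "human", "synthetic"
    text: str
    language: str

def passes_quality(record: TextRecord, min_chars: int = 200) -> bool:
    # Reject records that are too short, empty, or from an unknown source.
    if record.source not in {"web", "human", "synthetic"}:
        return False
    if record.language != "en":
        return False
    return len(record.text.strip()) >= min_chars

def filter_batch(records: list[TextRecord]) -> list[TextRecord]:
    kept = [r for r in records if passes_quality(r)]
    print(f"kept {len(kept)}/{len(records)} records")
    return kept
```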

$250,000 – $375,000 per year (USD)

San Francisco, United States
Maybe global
Onsite
Python
Data Pipelines
MLOps
AWS
GCP

Mid/Senior/Staff Software Engineer, Agents

New
Top rated
Harvey
Full-time
Posted

As a Software Engineer, Agents, you will build systems that make AI agents indispensable to legal professionals by designing environments and actions for agentic professional work, making model selection decisions, managing context windows, creating optimal tools, and developing evaluation harnesses for faster iteration loops to unlock new capabilities. You will partner with customers and product managers to understand legal workflows, design practical evaluations to capture what excellence means, and ship agents that effectively complete tasks. Additionally, you will optimize agent performance through prompt engineering, model selection, tool design, skill writing, context window management, and evaluation harness development. You will work with the model infrastructure team to design and implement infrastructure for low-latency agent execution, including caching strategies, parallel tool calls, or subagent patterns. Improving observability and instrumentation to profile agent behavior, identify bottlenecks, and drive optimization decisions is also part of the role. Staying current on new developments in agentic systems and applying those insights to product development is expected.
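One way to picture the "evaluation harnesses for faster iteration loops" mentioned above is a small scoring loop like the hedged sketch below; the agent callable and the exact-match scorer are stand-ins, and real harnesses typically use richer, task-specific rubrics.

```python
# Rough sketch of an evaluation harness that scores agent outputs against
# reference answers; the agent function and scoring rule are placeholders.
from typing import Callable

def exact_match(output: str, reference: str) -> float:
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0

def evaluate(agent: Callable[[str], str],
             cases: list[tuple[str, str]],
             score: Callable[[str, str], float] = exact_match) -> float:
    # Run every task through the agent and average the per-case scores.
    scores = [score(agent(task), reference) for task, reference in cases]
    return sum(scores) / len(scores) if scores else 0.0

# Usage: evaluate(my_agent, [("task prompt", "reference answer"), ...])
```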

$165,000 – $312,000 per year (USD)

New York, United States
Maybe global
Onsite
Python
Prompt Engineering
Model Evaluation
OpenAI API
Transformers

Forward Deployed AI Engineer

New
Top rated
Latent Labs
Full-time
Posted

Drive the end-to-end technical deployment of Latent Labs models into customer environments, ensuring seamless integration with existing scientific and IT infrastructure. Design and build production-grade API integrations, data pipelines and model-serving infrastructure tailored to each customer’s requirements. Work on-site or embedded with pharma and biotech partners to scope technical requirements, troubleshoot issues and deliver solutions. Ensure deployments meet enterprise standards for security, performance and reliability. Serve as the technical point of contact for assigned customers, building trusted relationships with their scientific and engineering teams, including spending time working on-site at international partner locations as needed. Gather and synthesise customer feedback, translating it into actionable insights for product, research and platform teams. Collaborate with internal teams to shape the product roadmap based on real-world deployment learnings. Create technical documentation, integration guides and best-practice resources for customers. Stay on top of the latest developments in ML infrastructure, model serving and cloud-native tooling. Gain a strong working understanding of protein and cell biology as it relates to the product. Participate in knowledge sharing, including organizing and presenting at internal reading groups.
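As a hedged sketch of the "production-grade API integrations" this role involves, the snippet below shows a thin client with timeouts and retries; the endpoint URL, payload shape, and backoff policy are invented for illustration and are not Latent Labs' actual API.

```python
# Hedged sketch of a thin client for a hypothetical model-serving endpoint,
# with timeouts and simple exponential-backoff retries.
import time
import requests

def predict_structure(sequence: str,
                      url: str = "https://models.example.com/v1/predict",
                      retries: int = 3) -> dict:
    for attempt in range(retries):
        try:
            response = requests.post(url, json={"sequence": sequence}, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt)  # back off before retrying
    return {}
```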

Undisclosed

San Francisco, United States
Maybe global
Hybrid
Python
AWS
Docker
Kubernetes
CI/CD

Staff Engineer, G&C (R4763)

New
Top rated
Shield AI
Full-time
Posted

As a Guidance and Controls engineer, you will be responsible for creating and maintaining all control and autonomy algorithms within the XBAT code base. This includes algorithm development, unit tests, component tests, flight software qualification, and flight test support. You will also be responsible for helping update and validate the truth models as required.
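For flavor, here is a toy Python sketch of a control algorithm paired with the kind of unit test the posting mentions; real G&C code would live in the flight software stack, and the gains, setpoint, and test are arbitrary illustrations.

```python
# Toy PID-style controller with a simple unit test; purely illustrative.
class PID:
    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        # Standard PID update over one timestep of length dt.
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def test_pid_pushes_toward_setpoint():
    controller = PID(kp=1.0, ki=0.1, kd=0.05)
    # With the measurement below the setpoint, the command should be positive.
    assert controller.step(setpoint=10.0, measurement=0.0, dt=0.1) > 0.0
```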

$180,000 – $280,000 per year (USD)

Dallas, United States
Maybe global
Onsite
Python
C++
CI/CD
MLOps
Docker

Director, Data Center Operations

New
Top rated
Together AI
Full-time
Posted

Responsibilities include advancing inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference; implementing and maintaining changes in high-performance inference engines, including kernel backends and speculative decoding; and profiling and optimizing performance across GPU, networking, and memory layers to improve latency, throughput, and cost. The role unifies inference with RL/post-training by designing and operating RL and post-training pipelines, making RL and post-training workloads more efficient with inference-aware training loops, and using these pipelines to train, evaluate, and iterate on frontier models. It also calls for co-designing algorithms and infrastructure so that objectives, rollout collection, and evaluation are tightly coupled to efficient inference; identifying bottlenecks across the training engine, inference engine, data pipeline, and user-facing layers; and running ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding these insights back into model, RL, and system design. The engineer owns critical systems at production scale by profiling, debugging, and optimizing inference and post-training services under real production workloads, driving roadmap items that require engine modification, and establishing metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Finally, the role provides technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training and by mentoring other engineers and researchers on full-stack ML systems work and performance engineering.
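Speculative decoding, one of the engine techniques named above, can be sketched at a very high level as below; the draft and target model callables are placeholders, real implementations verify all draft positions in one batched forward pass, and production systems use probabilistic acceptance rather than exact matching.

```python
# Simplified sketch of greedy speculative decoding: a cheap draft model
# proposes k tokens and the target model keeps the longest agreeing prefix.
from typing import Callable

def speculative_step(prefix: list[int],
                     draft_next: Callable[[list[int]], int],
                     target_next: Callable[[list[int]], int],
                     k: int = 4) -> list[int]:
    # Draft k tokens autoregressively with the cheap model.
    draft = []
    context = list(prefix)
    for _ in range(k):
        token = draft_next(context)
        draft.append(token)
        context.append(token)
    # Verify with the target model; stop at the first disagreement.
    accepted = []
    context = list(prefix)
    for token in draft:
        expected = target_next(context)
        if expected != token:
            accepted.append(expected)  # take the target's token and stop
            break
        accepted.append(token)
        context.append(token)
    return prefix + accepted
```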

$200,000 – $280,000 per year (USD)

San Francisco
Maybe global
Onsite
Python
PyTorch
TensorFlow
MLOps
Model Evaluation

Regional Sales Lead, Singapore

New
Top rated
Tenstorrent
Full-time
Posted

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, placement and routing (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and machine learning-based techniques, working closely with internal teams such as verification, extraction, timing, Design for Test (DFT), and electronic design automation (EDA) vendors.
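One possible reading of "AI/ML-driven solutions in production flows" is a model that predicts quality of results from flow parameters and ranks candidate settings, as in the toy sketch below; the features, data, and WNS target are entirely made up.

```python
# Toy sketch of ML-assisted flow tuning: predict a quality-of-results metric
# (here, worst negative slack) from flow parameters, then rank candidates.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row: [target_clock_ns, utilization, placement_effort] -> observed WNS (ps)
X = np.array([[1.0, 0.65, 1], [0.9, 0.70, 2], [0.8, 0.75, 3], [1.1, 0.60, 1]])
y = np.array([-15.0, -40.0, -80.0, 5.0])

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

candidates = np.array([[0.95, 0.68, 2], [1.05, 0.62, 1]])
predicted_wns = model.predict(candidates)
best = candidates[int(np.argmax(predicted_wns))]
print("predicted WNS:", predicted_wns, "-> try settings:", best)
```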

$100,000 – $500,000 per year (USD)

Santa Clara or Austin or Fort Collins, United States
Maybe global
Remote
Python
PyTorch
TensorFlow
MLOps
Docker

Forward Deployed Engineer - Sydney

New
Top rated
OpenAI
Full-time
Posted

Forward Deployed Engineers lead complex end-to-end deployments of frontier models in production alongside strategic customers, owning discovery, technical scoping, system design, build, and production rollout while partnering with customer engineering and domain teams. They own technical delivery across multiple deployments from prototype to stable production, build full-stack systems to deliver customer value, embed closely with customer teams to understand needs and guide adoption, scope work, sequence delivery, and remove blockers early. They make trade-offs between scope, speed, and quality, contribute directly in the code when needed, codify working patterns into reusable tools and playbooks, share field feedback to help Research and Product improve models, and keep teams moving through clarity and follow-through.

Undisclosed

Sydney, Australia
Maybe global
Hybrid
Python
JavaScript
Prompt Engineering
Model Evaluation
MLOps

Staff Analytics Engineer — Data Warehouse

New
Top rated
Together AI
Full-time
Posted

Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines where most cost is inference, jointly optimizing algorithms and systems. Make RL and post-training workloads more efficient with inference-aware training loops, async RL rollouts, and speculative decoding. Use these pipelines to train, evaluate, and iterate on frontier models. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation with efficient inference, identifying bottlenecks across the training engine, inference engine, data pipeline, and user-facing layers. Run ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights into model, RL, and system design. Profile, debug, and optimize inference and post-training services under real production workloads. Drive roadmap items requiring engine modification such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Provide technical leadership to set direction for cross-team efforts in inference, RL, and post-training and mentor engineers and researchers on full-stack ML systems work and performance engineering.
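The "metrics, benchmarks, and experimentation frameworks" part of this role can be pictured with a small latency/throughput harness like the hedged sketch below; the send_request callable stands in for whatever client the service actually exposes.

```python
# Minimal sketch of a latency/throughput benchmark for an inference endpoint.
import statistics
import time
from typing import Callable

def benchmark(send_request: Callable[[], None], n: int = 100) -> dict:
    latencies = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_ms": statistics.median(latencies) * 1000,
        "p95_ms": statistics.quantiles(latencies, n=20)[18] * 1000,  # 95th percentile
        "throughput_rps": n / elapsed,
    }
```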

$200,000 – $280,000 per year (USD)

San Francisco
Maybe global
Onsite
Python
PyTorch
TensorFlow
MLOps
Model Evaluation

Want to see more AI Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Need help with something? Here are our most frequently asked questions.


[{"question":"What are Docker AI jobs?","answer":"Docker AI jobs involve developing, deploying, and maintaining AI applications using containerization technology. These positions focus on creating reproducible AI workflows, packaging machine learning models with dependencies, and ensuring consistent execution across environments. Professionals in these roles typically work on MLOps pipelines, containerized AI applications, and implement solutions that seamlessly transition from development to production."},{"question":"What roles commonly require Docker skills?","answer":"Machine Learning Engineers, Data Scientists, AI Developers, and DevOps Engineers working on AI systems commonly require containerization skills. These professionals use containers to package models, ensure reproducibility, and streamline deployment pipelines. Full-stack developers building AI-powered applications and MLOps specialists implementing continuous integration workflows also frequently need proficiency with containerized environments and deployment strategies."},{"question":"What skills are typically required alongside Docker?","answer":"Alongside containerization expertise, employers typically seek proficiency in AI frameworks like TensorFlow, PyTorch, and Hugging Face. Familiarity with Docker Compose for multi-container applications, version control systems, and CI/CD pipelines is essential. Additional valuable skills include YAML configuration, cloud deployment knowledge, GPU acceleration techniques, and experience with MLOps practices that facilitate model development, testing, and production deployment."},{"question":"What experience level do Docker AI jobs usually require?","answer":"AI positions requiring containerization skills typically seek mid-level professionals with 2-4 years of practical experience. Entry-level roles may accept candidates with demonstrated proficiency in basic container commands, Dockerfile creation, and image management. Senior positions often demand extensive experience integrating containers into production ML pipelines, optimizing container resources, and implementing advanced deployment strategies across cloud and edge environments."},{"question":"What is the salary range for Docker AI jobs?","answer":"Compensation for AI professionals with containerization expertise varies based on location, experience level, industry, and additional technical skills. Junior roles typically start at competitive market rates, while senior positions command premium salaries. The most lucrative opportunities combine deep learning expertise, container orchestration experience, and cloud platform knowledge. Specialized industries like finance or healthcare often offer higher compensation for these in-demand skill combinations."},{"question":"Are Docker AI jobs in demand?","answer":"Containerization skills remain highly sought after in AI development, with strong demand driven by organizations implementing MLOps practices and scalable AI deployment strategies. Recent partnerships like Anaconda-Docker and trends in serverless AI containers have intensified hiring needs. 
The emergence of specialized tools like Docker Model Runner, Docker Offload, and Docker AI Catalog reflects the growing importance of containerized workflows in modern AI development and deployment practices."},{"question":"What is the difference between Docker and Kubernetes in AI roles?","answer":"In AI roles, containerization focuses on packaging individual applications with dependencies for consistent execution, while Kubernetes orchestrates multiple containers at scale. ML engineers might use Docker to create reproducible model environments but implement Kubernetes to manage production deployments across clusters. While containerization handles the model packaging, Kubernetes addresses the scalability, load balancing, and automated recovery needed for production AI systems serving multiple users simultaneously."}]
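To make the containerization side of these FAQs concrete, here is a small, hedged Python sketch using the Docker SDK for Python (docker-py); the image name and port are illustrative, and it assumes a Dockerfile already exists in the working directory and a local Docker daemon is running.

```python
# Hedged sketch: build and run a model image locally with docker-py
# (pip install docker); requires a Dockerfile in the current directory.
import docker

client = docker.from_env()

# Build the image from the local Dockerfile and tag it.
image, build_logs = client.images.build(path=".", tag="my-model:latest")

# Run the container in the background, mapping the serving port to the host.
container = client.containers.run(
    "my-model:latest",
    detach=True,
    ports={"8000/tcp": 8000},
)
print("running container:", container.short_id)
```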