AI Platform Engineer Jobs

Discover the latest remote and onsite AI Platform Engineer roles across top active AI companies. Updated hourly.

Check out 14 new AI Platform Engineer opportunities posted on The Homebase

Field Engineering Manager, Public Sector

New
Top rated
Scale AI
Full-time

As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, support end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and resilient cloud infrastructure for international government partners. Responsibilities include owning the production outcome with full accountability for long-term performance and reliability of AI use cases across international government agencies, ensuring full-stack integrity by overseeing all platform components from APIs to UI for a production-ready environment, building automated systems to monitor model performance and data drift across dispersed environments, managing the technical lifecycle within diverse regulatory frameworks, leading incident response in mission-critical environments with rapid resolution and prevention guardrails, translating technical performance metrics into clear insights for senior government officials, and partnering with engineering and ML teams to influence the technical architecture and decisions for future AI use cases.

Undisclosed

San Francisco, St. Louis, New York, or Washington, United States
Maybe global
Onsite

Aerodynamics Methodology and Software Engineer

New
Top rated
Harmattan AI
Full-time

Refactor research scripts and specialist tools into modular, high-performance, and maintainable Python/C++ libraries, implementing robust unit-testing and documentation standards, and ensuring the team follows code development structure. Architect agentic workflows and custom MCP servers to connect LLMs with internal CFD solvers and databases, codifying engineering knowledge into structured files to enable AI-driven code refactoring, automated simulation setup, and intelligent data analysis. Develop APIs and automated workflows to integrate tools like OpenVSP, XFoil, and OpenFOAM into seamless optimization loops. Manage and optimize Linux-based HPC clusters and/or Cloud computing infrastructure. Design the data architecture for storing and retrieving aerodynamic results to provide vehicle performance data as a single source of truth for GNC and flight physics teams.

Undisclosed

Lausanne, Switzerland
Maybe global
Onsite

Software/AI Engineer (New Grad)

New
Top rated
FurtherAI
Full-time

Develop, test, and deploy production-level code across backend and AI systems. Collaborate with AI researchers to integrate and optimize large language models for insurance workflows. Build data processing and evaluation pipelines for unstructured document inputs such as PDFs, emails, and images. Contribute to core infrastructure including APIs and orchestration logic powering the AI Workspace for Insurance. Work cross-functionally with product and customer teams to identify and solve real business problems using AI. Participate in design reviews, code reviews, and rapid iteration cycles.

$125,000 – $165,000 per year (USD)

San Francisco, United States
Maybe global
Onsite

Software Engineer, AI Platform

New
Top rated
Harvey
Full-time

Design and build abstractions and platform-level systems that improve all of Harvey’s agentic products; own infrastructure for model integration, routing, and evaluation that helps Harvey choose and deploy the right foundation model for any given context; build evaluation frameworks and tooling that let every team across Harvey iterate on AI quality effectively; partner closely with product engineering teams, PMs, and design to launch cutting-edge AI products; evaluate, prototype, and integrate the latest advancements in AI and agentic systems as they emerge.

CA$154,000 – CA$264,000 per year (CAD)

Toronto, Canada
Maybe global
Remote

Forward Deployed Engineer, Agentic Platform (Public Sector)

New
Top rated
Cohere
Full-time

Build and ship features for North, Cohere's AI workspace platform; develop autonomous agents that interact with sensitive enterprise data; experiment rapidly and with high quality to engage customers and deliver solutions that exceed expectations; work across the entire product lifecycle from conceptualization to production; lead end-to-end deployment of North in private cloud and on-premises environments, including planning, configuration, testing, and rollout.

Undisclosed

Ottawa, Canada
Maybe global
Remote

Software Engineer, AI Platform

New
Top rated
Harvey
Full-time

Design and build abstractions and platform-level systems that improve all of Harvey’s agentic products. Own infrastructure for model integration, routing, and evaluation that helps Harvey choose and deploy the right foundation model for any given context. Build evaluation frameworks and tooling that let every team across Harvey iterate on AI quality effectively. Partner closely with product engineering teams, PMs, and design to launch cutting-edge AI products. Evaluate, prototype, and integrate the latest advancements in AI and agentic systems as they emerge.

$220,000 – $300,000 per year (USD)

New York, United States
Maybe global
Onsite

Senior Brand Events Manager

New
Top rated
Grammarly
Full-time

Own the observability and lifecycle management of AI features across the organization. Build tools and infrastructure to enable teams to develop, monitor, and optimize LLM-powered features. Design and implement closed-loop evaluation pipelines that automatically validate prompt changes. Develop comprehensive metrics and dashboards to track LLM usage: cost per feature, token patterns, and latency. Create systems that tie user feedback to specific prompts and LLM calls. Establish best practices and processes for the full lifecycle of prompts: development, testing, deployment, and monitoring. Collaborate with engineering teams across the organization to ensure they have the tools and visibility needed to build high-quality AI features.

$103,000 – $174,000 per year (USD)

United States
Maybe global
Onsite

Principal AI Ops Architect, IPS

New
Top rated
Scale AI
Full-time

As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, support end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and resilient cloud infrastructure for international government partners. You will take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies. You will oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment. Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring reliability. Manage the technical lifecycle within diverse regulatory frameworks. Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building guardrails to prevent recurrence. Translate deep technical performance metrics into clear insights for senior international government officials, and partner with Engineering and ML teams to ensure field lessons influence the technical architecture and future use cases.

Undisclosed

Doha, Qatar or London, United Kingdom
Maybe global
Onsite

Senior Product Designer, Mobile

New
Top rated
Grammarly
Full-time

Own the observability and lifecycle management of AI features across the organization. Build tools and infrastructure to enable teams to develop, monitor, and optimize LLM-powered features. Design and implement closed-loop evaluation pipelines that automatically validate prompt changes. Develop comprehensive metrics and dashboards to track LLM usage, including cost per feature, token patterns, and latency. Create systems that tie user feedback to specific prompts and LLM calls. Establish best practices and processes for the full lifecycle of prompts, including development, testing, deployment, and monitoring. Collaborate with engineering teams across the organization to ensure they have the tools and visibility needed to build high-quality AI features.

$103,000 – $128,000 per year (USD)

United States, Canada, Mexico, Brazil, Argentina
Maybe global
Remote

Lazo - Head of Engineering

New
Top rated
Silver.dev
Full-time

The Head of Engineering at Lazo is responsible for owning the technology strategy and roadmap aligned with business and product OKRs, defining the reference architecture for agentic systems, establishing security and compliance baselines including SOC2-readiness, and presenting trade-offs, risks, and progress in leadership reviews. They are also tasked with shipping backend services in Python/TypeScript, driving high-impact PRs and code reviews, orchestrating agents and toolchains, integrating external APIs and databases, and building robust pipelines. The role includes end-to-end DevOps responsibilities such as AWS/GCP management, containerization, IaC, CI/CD, observability, and on-call design, as well as reducing technical debt, improving latency and throughput, and managing infrastructure costs. The individual defines SLOs and error budgets, reduces MTTR and change-fail rates, implements data access policies and secure data flows for AI features, drives post-mortems and preventive engineering practices, hires and mentors engineers, sets performance scorecards with integrated operating systems, fosters a culture of thoughtful trade-offs and fast feedback, partners with Product and AI teams to turn customer problems into scalable solutions, collaborates with Ops, Growth, and Customer teams for reliability and launch readiness, and manages vendors and evaluates build-vs-buy decisions.

$72,000 – $96,000 per year (USD)

Argentina
Maybe global
Remote

Want to see more AI Platform Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI Platform Engineer jobs?

[{"question":"What does an AI Platform Engineer do?","answer":"AI Platform Engineers develop and maintain the infrastructure that supports machine learning workloads. They collaborate with data scientists and software engineers to deploy, manage, and optimize AI models. Their responsibilities include implementing automation pipelines, ensuring 99.9% uptime of AI services, and establishing monitoring systems. They also design platform architectures for model training and deployment at scale, while maintaining security and governance standards."},{"question":"What skills are required for an AI Platform Engineer?","answer":"Key skills for AI Platform Engineers include proficiency with cloud platforms (AWS, GCP, Azure), containerization tools like Docker and Kubernetes, and CI/CD pipelines. Experience with MLOps tools such as MLflow, SageMaker, and Azure ML is essential. Strong programming abilities in Python and knowledge of ML frameworks like TensorFlow are valuable. Infrastructure automation skills and an understanding of distributed systems are also critical for success in this specialized engineering role."},{"question":"What qualifications are needed for an AI Platform Engineer role?","answer":"Most AI Platform Engineer positions require a bachelor's degree in Computer Science, Engineering, or a related technical field. Employers typically look for at least 3 years of experience in platform engineering, DevOps, or AI/ML infrastructure roles. Cloud computing certifications from AWS, GCP, or Azure are highly valued. Practical experience with containerization, MLOps practices, and data pipelines is essential to demonstrate proficiency in building robust AI infrastructure."},{"question":"What is the salary range for an AI Platform Engineer job?","answer":"Compensation typically varies based on location, experience level, company size, and industry. Given the specialized technical knowledge required for AI infrastructure and the critical nature of maintaining high-availability platforms for machine learning workloads, these positions often command competitive salaries in the technology sector."},{"question":"How long does it take to get hired as an AI Platform Engineer?","answer":"Hiring timelines for AI Platform Engineer positions vary. The hiring process typically involves technical assessments of cloud platform knowledge, containerization skills, and MLOps experience. With employers requiring 3+ years of related experience, candidates usually need to demonstrate proficiency in multiple technical domains. Those with relevant backgrounds in platform engineering, DevOps, or ML infrastructure may transition more quickly into these specialized AI jobs."},{"question":"Are AI Platform Engineer jobs in demand?","answer":"Yes, AI Platform Engineer roles are in high demand as organizations prioritize AI readiness and enterprise-scale adoption through 2026. These specialists are crucial for building the infrastructure necessary to deploy and scale AI capabilities. The specialized knowledge of both platform engineering and AI/ML workloads makes qualified candidates particularly valuable as companies seek to maintain 99.9% uptime for critical AI services while scaling their machine learning operations."}]