Backend Engineer - Inference Services
The Backend Engineer leads the design and implementation of Deepgram's products, developing secure, robust, and scalable services for speech processing, distributed compute orchestration, and optimized scheduling. Responsibilities include improving Deepgram's core inference services in networking, speech processing, audio transcoding, and latency and memory optimization; developing processes for measuring, building, and optimizing services to maximize system performance; debugging complex system issues involving networking, scheduling, and high-performance computing; rapidly customizing backend services to support customer needs; and partnering with Product to design and implement new services, features, and products end to end.
Engineering Manager, Go - Assist & Chat
Own the observability and lifecycle management of AI features across the organization. Build tools and infrastructure to enable teams to develop, monitor, and optimize LLM-powered features. Design and implement closed-loop evaluation pipelines that automatically validate prompt changes. Develop comprehensive metrics and dashboards to track LLM usage including cost per feature, token patterns, and latency. Create systems that tie user feedback to specific prompts and LLM calls. Establish best practices and processes for the full lifecycle of prompts, including development, testing, deployment, and monitoring. Collaborate with engineering teams across the organization to ensure they have the tools and visibility needed to build high-quality AI features.
Software Engineer - Human Alignment, Consumer Devices
The Software Engineer on the Human Alignment Team is responsible for building the infrastructure, data systems, and evaluation foundations for next-generation multimodal models. This includes developing pipelines that transform real-world signals into training and evaluation data, creating tooling to support human feedback, and building evaluation platforms to measure model behavior precisely. The role involves collaborating with researchers to convert behavioral questions into rigorous evaluations, datasets, rubrics, and scorecards. Responsibilities also include designing and implementing human-data pipelines, grader systems, and experiment infrastructure, creating evaluation frameworks for subjective, contextual, and long-horizon behaviors, and developing reproducible pipelines for processing multimodal signals. Additionally, the engineer must help define meaningful progress metrics and build systems to measure them confidently, work across multiple teams to ensure optimization is both technically sound and human-centered, prototype and iterate on measurement frameworks, and shape the infrastructure and methodology used in future AI product personalization, adaptation, and evaluation.
Machine Learning Engineer
As a Fullstack Engineer (Backend & Frontend) at Inflection AI, you will be responsible for owning the platforms, systems, and user-facing experiences that power conversational AI at scale. On the backend, this includes designing and implementing scalable backend systems and APIs for production LLM experiences, architecting and operating high-availability infrastructure for real-time inference and conversational pipelines, building distributed systems and asynchronous workflows, ensuring performance, reliability, and security through load testing, monitoring, and automation, and participating in on-call rotations to maintain service reliability. On the frontend, responsibilities include developing performant, accessible, and responsive web applications, building reusable UI components and design systems with frameworks such as React, TypeScript, Node.js, and Tailwind, integrating frontend with complex backend APIs and real-time data, partnering with product and design teams to prototype and iterate new AI features, and optimizing frontend performance and user experience at scale. Additionally, you will develop internal platforms to enhance engineering productivity such as CI/CD pipelines and observability frameworks, and collaborate with applied research to productionize experimental AI systems into robust features.
Backend Engineer
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines to jointly optimize algorithms and systems where most cost is inference. Make RL and post-training workloads more efficient with inference-aware training loops such as async RL rollouts and speculative decoding. Use these pipelines to train, evaluate, and iterate on frontier models on top of the inference stack. Co-design algorithms and infrastructure so objectives, rollout collection, and evaluation are tightly coupled to efficient inference, and identify bottlenecks across training engine, inference engine, data pipeline, and user-facing layers. Run ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, and feed insights back into model, RL, and system design. Profile, debug, and optimize inference and post-training services under production workloads. Drive roadmap items requiring engine modifications including kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Provide technical leadership by setting technical direction for cross-team efforts and mentoring engineers and researchers on full-stack ML systems work and performance engineering.
Learning Systems Engineer
The Learning Systems Engineer at OpenAI is responsible for building the infrastructure behind AI-native learning experiences, which includes creating core systems for AI education such as dynamic experiences, progress tracking, and assessments. They develop capabilities that allow learning experiences to dynamically adapt based on learners' knowledge, goals, and behaviors. The role involves building data pipelines and analytics systems to provide insights into learner outcomes, engagement patterns, and skill development. Additionally, the engineer builds systems that enable non-engineers to design, configure, and experiment with learning experiences without needing direct engineering support. The work contributes to launching new AI learning experiences, refining infrastructure for adaptive learning and assessments, validating analytics pipelines for deeper insights, and empowering education teams to use AI tools effectively at scale.
Software Engineer - Storage & Observability (Early Career)
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines, make workloads more efficient with inference-aware training loops, and use these pipelines to train, evaluate, and iterate on frontier models. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation to efficient inference, identifying bottlenecks across the training engine, inference engine, data pipeline, and user-facing layers. Run ablations and scale-up experiments to understand trade-offs and provide feedback into model, RL, and system design. Profile, debug, and optimize inference and post-training services under production workloads. Drive roadmap items requiring engine modification such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Provide technical leadership by setting technical direction for cross-team inference, RL, and post-training efforts and mentoring other engineers and researchers on full-stack ML systems and performance engineering.
AI Software Engineer (Back End)
Build and maintain back end services that handle model inference and user requests; design systems to manage requests, sessions, and streaming responses; implement reliability mechanisms such as rate limiting, retries, and graceful failure; build authentication and access controls for public usage; design systems for logging, telemetry, and evaluation signals; improve latency, throughput, and reliability of model serving; integrate new model checkpoints into the production system; and work closely with training and infrastructure engineers to deploy and operate the model. The role involves working inside production systems, including logs, traces, performance profiles, and deployment pipelines, to ensure the system stays up, stays fast, and behaves predictably under load.
Software Engineering Manager, Autonomous
As the Engineering Manager on the Autonomous team, you will lead and scale a high-caliber team of engineers focused on AI agent development and backend systems; oversee the technical roadmap for the team by translating architectural complexity into product strategies; mentor a diverse group of engineers, supporting their professional growth; partner with Product and Design to ensure agent-building tools are intuitive while supporting technical capabilities; champion a culture that prioritizes rapid shipping and high standards for technical stability and user experience; and clear technical and operational roadblocks to enable the team to operate with agency and clarity.
Software Engineering Manager, Autonomous
As the Engineering Manager on the Autonomous team, you will lead and scale a high-caliber team of engineers dedicated to AI agent development and backend systems. You will oversee the technical roadmap for the team, translating architectural complexity into clear product strategies. Your role involves mentoring a diverse group of engineers, supporting their professional growth, and partnering closely with Product and Design to ensure the tools remain intuitive while supporting deep technical capabilities. You will champion a culture of shipping rapidly with a high bar for technical stability and user experience. Additionally, you will clear technical and operational roadblocks to ensure the team operates with high agency and clarity.
