Engineering Manager, Managed AI
As an Engineering Manager on the Managed AI team at Crusoe, you will lead and scale a team of engineers building next-generation platform infrastructure for Large Language Models (LLMs). Responsibilities include guiding the team through the design and implementation of highly scalable, fault-tolerant infrastructure; defining and executing the AI roadmap; cultivating a high-performance engineering culture; overseeing the architecture and development of core AI services such as fault-tolerant task queues and model management systems; ensuring delivery of scalable systems capable of handling millions of API requests per second; delivering an AI platform that supports varied AI workloads, from training to agentic execution infrastructure; working cross-functionally with product, infrastructure, and GTM stakeholders; representing engineering in strategic discussions; and promoting knowledge sharing, mentorship, and the continuous evolution of engineering processes. This role requires in-office presence in San Francisco or Sunnyvale, CA.
Staff Software Engineer, Managed AI - AI Platform
Lead the design and implementation of core AI services, including resilient, fault-tolerant queues for efficient task distribution; model catalogs for managing and versioning AI models; and scheduling mechanisms optimized for cost and performance. Architect and scale infrastructure to handle millions of API requests per second, and implement robust monitoring and alerting to ensure system health and 24/7 availability. Collaborate closely with product management, business strategy, and other engineering teams to define the AI platform roadmap and influence the platform's long-term vision and architectural decisions. Contribute to open-source AI frameworks, participate actively in the AI community, and prototype and rapidly iterate on emerging technologies and new features.
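For a flavor of the "resilient fault-tolerant queue" responsibility, here is a minimal Python sketch of a retry-and-dead-letter loop. The `Task` shape, `handle` function, and `TransientError` type are illustrative stand-ins, not part of Crusoe's actual stack:

```python
import queue
import time
from dataclasses import dataclass, field

class TransientError(Exception):
    """Stand-in for a retryable failure (e.g., a timed-out worker)."""

@dataclass(order=True)
class Task:
    priority: int
    payload: str = field(compare=False)
    attempts: int = field(default=0, compare=False)

def handle(payload: str) -> None:
    """Hypothetical handler; a real system would dispatch to workers."""
    print(f"processing {payload}")

def drain(tasks: "queue.PriorityQueue[Task]", dead_letter: list[Task],
          max_attempts: int = 3) -> None:
    """Pull tasks in priority order, retry transient failures with
    exponential backoff, and dead-letter tasks that keep failing."""
    while not tasks.empty():
        task = tasks.get()
        try:
            handle(task.payload)
        except TransientError:
            task.attempts += 1
            if task.attempts < max_attempts:
                time.sleep(2 ** task.attempts)  # back off before requeueing
                tasks.put(task)
            else:
                dead_letter.append(task)  # surfaced via monitoring/alerting
```

The dead-letter list is what makes the pattern fault-tolerant in practice: failures past the retry budget are preserved for inspection rather than silently dropped.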
Staff Engineer, API Core Platform
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems and Together's inference stack, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across the GPU, networking, and memory layers to improve latency, throughput, and cost.

Unify inference with RL and post-training by designing and operating RL and post-training pipelines (e.g., RLHF, RLAIF, GRPO, DPO-style methods, reward modeling), optimizing algorithms and systems where over 90% of the cost is inference. Make RL and post-training workloads more efficient with inference-aware training loops, including async RL rollouts and speculative decoding techniques, to reduce rollout collection and evaluation costs. Use these pipelines to train, evaluate, and iterate on frontier models on top of the inference stack. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation with efficient inference, and quickly identify bottlenecks across training engines, inference engines, data pipelines, and user-facing layers. Run ablation and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design.

Own critical systems at production scale by profiling, debugging, and optimizing inference and post-training services under real production workloads. Drive roadmap items requiring real engine modifications, such as changes to kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously.

Provide technical leadership at the Staff level by setting technical direction for cross-team efforts intersecting inference, RL, and post-training, and mentor engineers and researchers on full-stack ML systems work and performance engineering.
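To ground the GRPO mention above, here is a minimal sketch of group-relative advantage estimation, the core idea behind GRPO-style post-training; the reward values are illustrative:

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages in the GRPO style: each rollout's reward
    is normalized against the other rollouts sampled for the same prompt,
    removing the need for a learned value/critic model."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Four rollouts for one prompt: the best completion receives the largest
# positive advantage, the worst the most negative.
print(grpo_advantages(np.array([0.1, 0.9, 0.4, 0.2])))
```

Because every rollout in the group must be sampled before advantages can be computed, rollout generation dominates the cost of such pipelines, which is why the listing emphasizes inference-aware training loops and async rollouts.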
AI Deployment Engineer | Codex
Serve as the primary technical subject matter expert on OpenAI Codex for a portfolio of customers, embedding deeply with them to enable their engineering teams and build coding workflows. Partner directly with customers to design and implement AI-enhanced development workflows from rapid prototyping through scalable production rollout. Build high-quality demos, reference implementations, and workflow automations using Codex itself as part of the development process. Lead large-format workshops, technical deep dives, and hands-on enablement sessions to help engineering organizations adopt AI coding tools effectively and safely. Contribute technical content including examples, guides, patterns, and best practices to the OpenAI Cookbook to help the broader developer community. Gather high-fidelity product insights from real customer deployments and translate them into product proposals and model feedback for internal teams. Influence customer strategy and decision-making by framing how AI coding tools fit into their SDLC, technical roadmap, and organizational workflows. Serve as a trusted advisor on solution architecture, operational readiness, model configuration, security considerations, and best-practice adoption.
Prospera AI - AI Backend Engineer
Own and evolve the LLM orchestration pipeline by designing and optimizing the multi-agent orchestration system, implementing parallelization and streaming to reduce response latency, and building robust prompt management with versioning and A/B testing capabilities. Design retrieval-augmented generation (RAG) systems for accurate, contextual responses by working with vector databases, embeddings, and relevance scoring while optimizing for speed and accuracy at scale. Develop production APIs that connect AI capabilities to the frontend, including designing for future integrations with CRMs and advisor tools, implementing authentication, rate limiting, and documentation. Establish code review practices and testing standards, document architecture decisions for future team members, and contribute to technical patents and IP development.
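As a minimal illustration of the retrieval step in such a RAG system, here is a cosine-similarity top-k lookup over precomputed embeddings; a production deployment would use a vector database rather than in-memory NumPy arrays, and the random vectors are stand-ins:

```python
import numpy as np

def top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k documents most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(d @ q)[::-1][:k]  # highest-scoring documents first

docs = np.random.rand(100, 384)   # stand-in for stored document embeddings
query = np.random.rand(384)       # stand-in for an embedded user query
print(top_k(query, docs))
```

Relevance scoring and speed/accuracy trade-offs in the listing amount to tuning what replaces this brute-force ranking, e.g., approximate nearest-neighbor indexes plus rerankers.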
Automotive and Robotics SoC Architect
As an Automotive and Robotics SoC Architect at Tenstorrent, you will define scalable, top-down system architectures that unify CPU and AI technologies for next-generation automotive applications. You will shape the architectural direction of the automotive and robotics portfolio to ensure products meet high standards of performance, safety, reliability, and security. This role requires strong technical leadership, systems thinking, and cross-functional collaboration, working in a highly technical, fast-moving environment with engineering teams and external partners.
Partner AI Deployment Engineer
The Partner AI Deployment Engineer (P-ADE) leads technical delivery with OpenAI partners across EMEA, supporting customer deployments across various industries and use cases. This includes acting as the primary technical delivery partner, translating solution designs into deployable, production-ready architectures on the OpenAI platform, and accelerating customer time to value through prototyping, integration support, architectural guidance, and troubleshooting during delivery phases. The role requires close collaboration with partner delivery teams, Solutions Engineers, Forward Deployed Engineers, and other ADEs to ensure the right technical expertise is applied from design through production rollout. P-ADEs help partners operationalize solutions that address scalability, reliability, security, and safety in enterprise production environments; contribute to reusable deployment patterns and delivery guidance; ensure solutions meet OpenAI's standards before and after go-live; and capture and synthesize feedback from deployments to improve playbooks and platform capabilities.
Senior Applications Engineer
Lead the end-to-end design and delivery of scalable, secure, and intelligent enterprise systems supporting HackerOne's transformation into an AI-first organization. Develop, configure, and maintain Salesforce Sales Cloud, Workday, and other Enterprise, GTM, and Security applications to enhance business efficiency and scalability. Collaborate with business and engineering teams to identify opportunities for automation, integration, and system modernization. Oversee the architecture and execution of platform-level capabilities using AI and modern tools to reduce manual work, improve decision-making, and increase system resilience. Provide technical leadership to internal engineers and external partners, ensuring design quality, operational excellence, and maintainability. Contribute to incident and on-call response strategies, focusing on building resilient systems. Mentor engineers and promote a culture of innovation and continuous improvement. Champion change management to ensure systems are adopted, understood, and evolved.
Product Engineer
Translate research into product by working with researchers on the customer side on post-training, evaluations, safety, and alignment to build the necessary primitives, data, and tooling. Partner closely with core customers and frontier research labs to tackle technical challenges related to model improvement, performance, and deployment. Shape and propose model improvement work by translating customer and research objectives into technically rigorous proposals and execution plans. Lead the end-to-end lifecycle of projects including discovery, writing PRDs and technical specs, prioritizing trade-offs, running experiments, shipping solutions, and scaling successful pilots. Lead high-stakes engagements with senior stakeholders, define success metrics, identify risks, and drive programs to measurable outcomes. Collaborate across teams including research, platform, operations, security, and finance to deliver production-grade results. Design and implement robust evaluation frameworks, ensure data quality and feedback loops, and share learnings to elevate technical execution across accounts.
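As a minimal sketch of what a "robust evaluation framework" can reduce to, here is a task-agnostic harness; the stub model and exact-match scorer are illustrative assumptions, not any particular account's evals:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    reference: str

def run_eval(model: Callable[[str], str],
             cases: list[EvalCase],
             score: Callable[[str, str], float]) -> float:
    """Average a per-case score over an eval set; the model and scorer
    are injected so the harness stays task-agnostic."""
    total = sum(score(model(c.prompt), c.reference) for c in cases)
    return total / len(cases)

# Example: exact-match scoring against a stub model.
exact = lambda out, ref: float(out.strip() == ref.strip())
stub = lambda prompt: "4"
print(run_eval(stub, [EvalCase("2+2?", "4")], exact))  # 1.0
```

Keeping the scorer pluggable is what lets the same harness back both automated regression checks and model-graded or human-reviewed evaluations.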
AI Solutions Manager (San Francisco)
The AI Solutions Manager will build new LLM and agent instrumentation libraries for emerging LLM providers and agent frameworks, and will maintain and enhance existing instrumentation across the Python and TypeScript ecosystems, including integrations for OpenAI, Anthropic, LlamaIndex, CrewAI, and others. The role involves driving improvements to the semantic conventions and OpenTelemetry standards that define AI observability. The manager will collaborate with the global developer community through GitHub, Slack, and conferences, as well as with Arize product managers and solution architects, taking complex problems from ideation to completion with full ownership and accountability.
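As a minimal sketch of the kind of instrumentation this role maintains, here is an LLM call wrapped in an OpenTelemetry span. The `client.complete` API and response fields are hypothetical stand-ins, and the `gen_ai.*` attribute names follow the incubating GenAI semantic conventions:

```python
from opentelemetry import trace

tracer = trace.get_tracer("demo.llm.instrumentation")

def traced_completion(client, model: str, prompt: str) -> str:
    """Wrap an LLM call in a span carrying GenAI semantic-convention
    attributes so any OpenTelemetry backend can observe the request."""
    with tracer.start_as_current_span("chat_completion") as span:
        span.set_attribute("gen_ai.request.model", model)
        response = client.complete(model=model, prompt=prompt)  # hypothetical client API
        span.set_attribute("gen_ai.usage.output_tokens", response.output_tokens)
        return response.text
```

Standardizing those attribute names across providers is precisely what makes traces from OpenAI, Anthropic, or an agent framework comparable in a single observability backend.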
