Staff Field Application Engineer, Customer Success
Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, place and route (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and ML-based techniques, collaborating closely with verification, extraction, timing, design for test (DFT), and EDA vendors.
Member of Engineering (Reinforcement Learning Infrastructure)
Keep up with the latest research, and be familiar with the state of the art in LLMs, RL, and code generation. Develop methods for tuning training and inference end-to-end for high throughput. Design data control systems in an RL pipeline that govern what the model sees and when. Debug cases where infrastructure decisions are silently degrading learning dynamics. Build observability tooling that surfaces when a system-level issue is the root cause of a training regression. Help build robust, flexible, and scalable RL pipelines. Optimize performance across the stack: networking, memory, compute scheduling, and I/O. Write high-quality, pragmatic code. Work closely with the team: plan next steps, discuss ideas, and stay in regular contact.
Member of Engineering (Reinforcement Learning)
Research and experiment on ways to improve reasoning and code generation for LLMs. Own the full experiment life cycle from idea to experimentation and integration. Keep up with the latest research, and be familiar with the state of the art in LLMs, RL, and code generation. Translate research ideas into clean, reusable codebases that other researchers can build on. Design, analyze, and iterate on data generation and training of LLMs. Implement and iterate on RL training pipelines that scale reliably across domains. Diagnose training instabilities and failures, debug RL runs and propose mitigation methods. Write high-quality, reproducible and maintainable code.
Research Infrastructure Engineer, Training Systems
Build and maintain infrastructure for large-scale model training and experimentation. Design APIs and interfaces to simplify complex training workflows and prevent misuse. Improve reliability, debuggability, and performance of training and data pipelines. Debug issues across technologies including Python, PyTorch, distributed systems, GPUs, networking, and storage. Write tests, benchmarks, and diagnostics to detect significant regressions.
Software Engineer, Model Serving Infrastructure
The role involves contributing to the development of next-generation, high-performance machine learning serving systems. Responsibilities include building infrastructure that powers AI applications, working on problems at the intersection of distributed systems, machine learning, and high-performance computing, and solving fundamental computer science problems impacting AI deployment. Specific projects include implementing asynchronous inference for non-blocking client requests, designing intelligent request routing systems to balance load across thousands of model replicas with strict latency SLAs, building traffic management systems for zero-downtime model updates handling terabytes of inference requests, improving state management for scale from thousands to tens of thousands of replicas, architecting frameworks for multi-model orchestration in complex ML pipelines ensuring end-to-end latency guarantees, and developing observability and debugging tools for distributed ML applications at scale. The work involves writing performance-critical code in Python (with Cython optimizations) and potentially C++, working with distributed systems at scale using Ray Core's actor system, gRPC, and custom networking protocols, extending cloud-native infrastructure such as Kubernetes and service meshes, gaining system-level knowledge of ML/AI frameworks like TensorFlow, PyTorch, JAX, and transformers, and ensuring production reliability with tools like OpenTelemetry, Prometheus, distributed tracing, and chaos engineering to maintain 99.99% uptime. The role also involves leveraging AI coding agents to enhance team productivity while maintaining high code quality standards.
Engineering Manager, Cooperative Systems
Lead and grow a small team building applied AI systems for internal operations. Design and build AI-powered automation systems in close proximity to customers. Stay hands-on in architecture and implementation across the full stack. Develop evolving systems spanning developer tools, automation platforms, knowledge graphs, and data systems. Deploy systems directly to internal users and close customers to iterate rapidly based on real-world feedback. Engage frequently with scaled workforces to understand needs and validate solutions. Create systems for visibility and learning in hybrid workforces. Partner with product, research, and ops teams daily.
AI/ML Engineer
Develop, train, and optimize machine learning models for various mobile app features. Research and implement state-of-the-art AI techniques to improve user engagement and app performance. Collaborate with cross-functional teams to integrate AI-driven solutions into applications. Design and maintain scalable ML pipelines, ensuring efficient model deployment and monitoring. Analyze large datasets to derive insights and drive data-driven decision-making. Stay updated with the latest AI trends and best practices, incorporating them into development processes. Optimize AI models for mobile environments to ensure high performance and low latency.
Software Engineer, Early Career
As a Software Engineer at Mirage, you will work across product engineering, backend/platform engineering, and applied AI teams. Responsibilities include designing and building systems, APIs, and infrastructure that power products; solving challenges involving distributed systems, scaling, and performance; integrating and operating large AI models in production; building core platform components such as storage, billing, observability, and security; shipping end-to-end product experiences for creative workflows; building polished, performant user interfaces (web or native mobile); pushing the boundaries of video, graphics, and AI-powered creation tools; instrumenting, A/B testing, and iterating quickly with real user data; building and shipping AI-powered product experiences end-to-end; working with state-of-the-art models across video, audio, image, and text; designing systems for context, reasoning, and intelligent behavior; and building evals, datasets, and tooling for improving model quality.
Staff Software Engineer, AI Platform
Design and build abstractions and platform-level systems that improve all of Harvey's agentic products. Own infrastructure for model integration, routing, and evaluation that helps Harvey choose and deploy the right foundation model for any given context. Build evaluation frameworks and tooling that let every team across Harvey iterate on AI quality effectively. Partner closely with product engineering teams, PMs, and design to launch cutting-edge AI products. Evaluate, prototype, and integrate the latest advancements in AI and agentic systems as they emerge.
Machine Learning Research, RF Foundation Models Specialist
Formulate new machine learning problems in RF sensing and spectrum understanding. Design experiments and evaluation approaches reflecting real operating conditions such as domain shift, changing interference, and varying sensors and platforms. Build models for structured, noisy, and partially observed signal environments. Improve robustness across propagation, interference, and low-visibility waveform conditions. Optimize models for throughput, latency, and deployment constraints. Move promising research into a release path for real systems through proofs-of-concept, realistic validation, and conversion into maintainable, deployable code. Use field performance to inform the development of the next generation of models and tooling. Work across the lifecycle of research and deployment including data and evaluation design, experimentation, model development, release readiness, and iteration based on real-world outcomes. Collaborate closely with embedded, hardware, and mission teammates, influencing how machine learning capability is built as the company scales.
