Customer Support Engineer (GPU Cluster)
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate reinforcement learning (RL) and post-training pipelines, such as RLHF, RLAIF, GRPO, DPO-style methods, and reward modeling, where most of the cost is inference, optimizing algorithms and systems jointly. Make RL and post-training workloads more efficient with inference-aware training loops, and use these pipelines to train, evaluate, and iterate on frontier models on the inference stack. Co-design algorithms and infrastructure to couple objectives, rollout collection, and evaluation tightly with efficient inference, and identify bottlenecks across training, inference, data pipelines, and user-facing layers. Run ablation and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design. Own critical production-scale systems by profiling, debugging, and optimizing inference and post-training services under real workloads, driving roadmap items that require engine modification, such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. Provide technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training, and by mentoring other engineers and researchers in full-stack ML systems work and performance engineering.
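One of the techniques named above, speculative decoding, can be illustrated with a toy sketch. The draft and target "models" here are stand-in integer functions, not real LLMs; the point is the accept/correct loop:

```python
def speculative_decode(prefix, draft, target, k, steps):
    """Toy speculative decoding: the draft proposes k tokens per step;
    the target verifies them left to right, keeps the longest agreeing
    run, and emits one corrected token on the first mismatch."""
    out = list(prefix)
    for _ in range(steps):
        proposal = draft(out, k)
        for tok in proposal:
            if tok == target(out):      # target agrees: token accepted for free
                out.append(tok)
            else:                       # mismatch: take the target's token instead
                out.append(target(out))
                break
    return out

# Stand-in models: the target always counts up by 1; the draft also counts
# up but skips a number once every four tokens, so some proposals are wrong.
target = lambda seq: seq[-1] + 1

def draft(seq, k):
    toks, cur = [], seq[-1]
    for _ in range(k):
        cur = cur + 1 if (cur % 4) else cur + 2   # occasionally wrong
        toks.append(cur)
    return toks
```

Because every accepted token matched the target and every rejected one is replaced by the target's choice, the output is exactly what the target alone would have generated, only cheaper when the draft's acceptance rate is high.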
AI/ML Physical Design Flow Engineer
Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, place and route (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production physical design flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and machine learning-based techniques, collaborating closely with internal teams such as verification, extraction, timing, and design for test (DFT), as well as electronic design automation (EDA) vendors.
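As a concrete illustration of the STA step in the flow above: slack at an endpoint is required time minus the longest-path arrival time. A minimal sketch over a toy netlist (the node names, delays, and 1.0 ns clock are invented for illustration):

```python
def arrival_times(graph, delays, source):
    """Longest-path arrival times from the source through a DAG.
    graph maps node -> fanout nodes; delays maps node -> gate delay (ns)."""
    order, seen = [], set()

    def visit(n):                       # DFS postorder for a topological sort
        if n in seen:
            return
        seen.add(n)
        for m in graph.get(n, []):
            visit(m)
        order.append(n)

    visit(source)
    arrival = {n: float("-inf") for n in seen}
    arrival[source] = delays[source]
    for n in reversed(order):           # relax edges in topological order
        for m in graph.get(n, []):
            arrival[m] = max(arrival[m], arrival[n] + delays[m])
    return arrival

# Toy paths: IN -> U1 -> U2 -> OUT (slow) and IN -> U3 -> OUT (fast).
graph  = {"IN": ["U1", "U3"], "U1": ["U2"], "U2": ["OUT"], "U3": ["OUT"]}
delays = {"IN": 0.0, "U1": 0.3, "U2": 0.4, "U3": 0.2, "OUT": 0.1}
arr = arrival_times(graph, delays, "IN")
slack = 1.0 - arr["OUT"]                # required time minus worst arrival
```

Real STA adds clocks, setup/hold checks, and derating, but the worst-path-plus-slack structure is the same.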
TL, Research Inference
Design and build high-performance inference runtimes for large-scale AI models focusing on efficiency, reliability, and scalability. Own and optimize core execution paths including model execution, memory management, batching, and scheduling. Develop and improve distributed inference across multiple GPUs with attention to parallelism strategies, communication patterns, and runtime coordination. Implement and optimize inference-critical operators and kernels based on real-world workloads. Partner with research teams to ensure new model architectures are supported accurately and efficiently in inference systems. Diagnose and resolve performance bottlenecks using profiling, benchmarking, and low-level debugging. Contribute to observability, correctness, and reliability of large-scale AI systems.
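The batching and scheduling responsibilities above can be sketched with a toy continuous-batching loop. The request fields and batch limit are invented for illustration; the point is that new requests join the running batch as soon as a slot frees, rather than waiting for the whole batch to drain:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    rid: int
    remaining: int                     # decode steps left for this request
    output: list = field(default_factory=list)

def continuous_batching(requests, max_batch):
    """Toy continuous batching: admit waiting requests whenever the
    running batch has capacity, then run one decode step for everyone."""
    waiting, running, done, step = deque(requests), [], [], 0
    while waiting or running:
        while waiting and len(running) < max_batch:   # fill freed slots
            running.append(waiting.popleft())
        for r in running:                             # one decode step each
            r.output.append(step)
            r.remaining -= 1
        done += [r for r in running if r.remaining == 0]
        running = [r for r in running if r.remaining > 0]
        step += 1
    return done, step

# Request 2 joins as soon as request 0 finishes; with static batching the
# same workload would take 6 steps (max(2, 5) for the first batch, then 1).
done, steps = continuous_batching(
    [Request(0, 2), Request(1, 5), Request(2, 1)], max_batch=2)
```

Production engines layer paged KV-cache memory and preemption on top, but this admission loop is the core scheduling idea.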
Inference Technical Lead, On-Device Transformers
As a Technical Lead on the Future of Computing Research team, you will evaluate and select silicon platforms such as GPUs, NPUs, and specialized accelerators for on-device and edge deployment of OpenAI models. You will work closely with research teams to co-design model architectures that meet real-world deployment constraints including latency, memory, power, and bandwidth. You will analyze and model system performance, identifying trade-offs between model design, memory hierarchy, compute throughput, and hardware capabilities. You will partner with hardware vendors and internal infrastructure teams to bring up new accelerators and ensure efficient execution of transformer workloads. Additionally, you will build and lead a team of engineers responsible for implementing the low-level inference stack, including kernel development and runtime systems. You will also take nascent research capabilities and develop them into usable products.
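The performance-modeling work described above often starts from a simple memory-bound estimate: at batch size 1, decode speed is roughly memory bandwidth divided by the bytes streamed per token. A back-of-the-envelope sketch, with illustrative numbers not tied to any specific device or model:

```python
def decode_tokens_per_sec(params_billion, bytes_per_param, bandwidth_gb_s):
    """Memory-bound upper bound for autoregressive decode: each generated
    token must stream the full weight set through memory once, so
    tokens/s is at most bandwidth / model size in bytes."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Illustrative: an 8B-parameter model quantized to 4 bits (0.5 B/param)
# on an edge accelerator with 100 GB/s of memory bandwidth.
tps = decode_tokens_per_sec(8, 0.5, 100)   # upper bound, ignores KV cache
```

The estimate ignores KV-cache traffic and compute limits, but it is a useful first filter when comparing candidate silicon against a latency target.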
AI & IT Systems Engineer
As Jasper undergoes an agentic AI shift, the AI & IT Systems Engineer role involves ensuring the IT infrastructure is robust, secure, and fine-tuned for advanced AI workflows, spending 70-80% of time on AI enablement deployments. Responsibilities include modernizing and improving IT systems to support autonomous AI workflows, building scalable automation infrastructure to enhance efficiency and reduce manual tasks, and operationalizing AI initiatives using tools like Claude, ChatGPT, and Zapier to create intelligent, cross-platform workflows involving platforms like Google Workspace and Slack. The role also requires managing core IT systems such as Identity Providers and Mobile Device Management, streamlining identity and access operations using features like Okta Workflows, and providing cross-functional technical support across departments to implement AI enablement projects. Additionally, the engineer manages a broad SaaS ecosystem, including Google Workspace and Linear, and assists in developing training resources and playbooks to facilitate team adoption of new AI tools.
Customer Support Engineer (Inference), India
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines, jointly optimizing algorithms and systems to make inference and post-training workloads more efficient. Train, evaluate, and iterate on frontier models using these pipelines. Co-design algorithms and infrastructure so that objectives, rollout collection, and evaluation are tightly coupled to efficient inference. Identify bottlenecks across the training engine, inference engine, data pipelines, and user-facing layers. Run ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design. Profile, debug, and optimize inference and post-training services under real production workloads. Drive roadmap items requiring engine modification, such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. Provide technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training, and by mentoring other engineers and researchers on full-stack ML systems work and performance engineering.
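The metrics-and-benchmarking work mentioned above typically reduces to percentile latency and aggregate throughput over a set of timed requests. A minimal harness sketch (the workload is a simulated function call, not a real engine):

```python
import time

def percentile(samples, p):
    """Nearest-rank percentile over a list of latency samples."""
    xs = sorted(samples)
    idx = min(len(xs) - 1, max(0, round(p / 100 * len(xs)) - 1))
    return xs[idx]

def benchmark(fn, n):
    """Time n sequential calls to fn; report p50/p99 latency (seconds)
    and overall throughput (calls per second)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {"p50": percentile(latencies, 50),
            "p99": percentile(latencies, 99),
            "throughput": n / elapsed}

stats = benchmark(lambda: sum(range(1000)), 50)
```

A real framework would add warmup runs, concurrent load generation, and per-stage breakdowns (queueing vs. prefill vs. decode), but the reported quantities are the same.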
Helix AI Engineer, Agentic Systems
Design, deploy, and maintain Figure's training clusters. Architect and maintain scalable deep learning frameworks for training on massive robot datasets. Work together with AI researchers to implement training of new model architectures at a large scale. Implement distributed training and parallelization strategies to reduce model development cycles. Implement tooling for data processing, model experimentation, and continuous integration.
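The data-parallel strategies mentioned above rest on a simple invariant: workers compute gradients on different shards and then average them, which equals the gradient over the combined batch (for equal-sized shards). A plain-Python sketch of that all-reduce step, using a toy 1-D linear model rather than any real framework:

```python
def local_gradient(w, batch):
    """Gradient of mean squared error for the 1-D linear model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def allreduce_mean(grads):
    """What a data-parallel all-reduce computes: the mean of per-worker grads."""
    return sum(grads) / len(grads)

# Two workers, two equal-sized shards of the same batch.
w = 0.0
shard_a = [(1.0, 2.0), (2.0, 4.0)]
shard_b = [(3.0, 6.0), (4.0, 8.0)]
g_ddp  = allreduce_mean([local_gradient(w, shard_a),
                         local_gradient(w, shard_b)])
g_full = local_gradient(w, shard_a + shard_b)   # single-worker reference
```

The equality of `g_ddp` and `g_full` is why data-parallel training preserves single-worker semantics; the engineering work is making the all-reduce overlap with compute at scale.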
Manual Quality Assurance Engineer, Web Core Product
Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for diverse use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture that improve performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.
AI Infrastructure Engineer
Operate and maintain a large-scale GPU cluster consisting of thousands of GPUs across multiple data centers using Kubernetes and Slurm. Monitor and diagnose failures across the GPU hardware and software stacks to ensure high availability and rapid recovery. Develop automation tools and scripts using Python or Shell to streamline repetitive infrastructure management tasks and improve operational efficiency. Manage GPU resource quotas and provide technical support to ML researchers to ensure optimal utilization of computing resources. Participate in the architectural design and performance tuning of distributed training environments for large-scale autonomous driving models.
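A typical small automation of the kind described above parses `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` output and flags unhealthy devices. The thresholds and sample output below are invented for illustration:

```python
import csv, io

def unhealthy_gpus(smi_csv, max_temp_c=85, min_free_mib=1024):
    """Parse nvidia-smi CSV output (index, temperature.gpu, memory.free)
    and return the indices of GPUs that are too hot or low on memory."""
    bad = []
    for row in csv.reader(io.StringIO(smi_csv)):
        idx, temp, free = (int(v.strip()) for v in row)
        if temp > max_temp_c or free < min_free_mib:
            bad.append(idx)
    return bad

# Sample output as produced by:
#   nvidia-smi --query-gpu=index,temperature.gpu,memory.free \
#              --format=csv,noheader,nounits
sample = "0, 61, 70000\n1, 92, 70000\n2, 60, 128\n"
flagged = unhealthy_gpus(sample)       # GPU 1 is hot, GPU 2 is low on memory
```

In practice this would run via `subprocess` on each node and feed a drain/cordon decision in Kubernetes or Slurm; parsing from a string keeps the sketch self-contained.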
