AI Inference Engineer - Model Optimization & Deployment
As a Model Optimization & Deployment Engineer, you will optimize large-scale models (LLMs, VLMs) using advanced quantization techniques such as post-training quantization (PTQ), quantization-aware training (QAT), mixed-precision inference workflows, and parameter-efficient fine-tuning methods such as LoRA and QLoRA. You will architect and implement model conversion and compilation pipelines using TensorRT and TensorRT-LLM for deployment on edge devices. The role involves rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch reference models and compiled edge binaries. You are responsible for writing and optimizing custom CUDA kernels and TensorRT plugins to maximize memory bandwidth and minimize latency on AI accelerators. Furthermore, you will write production-level, highly concurrent, memory-safe C++ and Python code to ensure real-time, deterministic execution of inference on vehicle systems-on-chip (SoCs).
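As a toy illustration of the PTQ idea mentioned above, the sketch below quantizes a small weight tensor to int8 with a single symmetric per-tensor scale and measures the round-trip error. The helper names are invented for illustration; a real pipeline would use PyTorch and TensorRT quantization tooling rather than hand-rolled code.

```python
# Minimal sketch of symmetric int8 post-training quantization (PTQ).
# Function names are hypothetical; real PTQ also calibrates on activation data.

def quantize_int8(weights):
    """Map float weights to int8 using one symmetric per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.64, 0.955]
q, scale = quantize_int8(weights)
max_err = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))
```

The worst-case round-trip error of this scheme is half the quantization step, which is why parity checking against the float reference model matters after compilation.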
Deployment Lead
The Deployment Lead will work closely with Simulation Engineers, Machine Learning Engineers, and customers to understand and define engineering and physics challenges, providing technical leadership to their team. Responsibilities include leading the pre-processing and analysis of complex data for predictive modelling; establishing best practices; architecting and developing innovative deep learning models combined with optimisation methods; taking responsibility for the quality and impact of their own and their team's work; designing and testing robust, scalable data pipelines for production environments; leading cross-functional collaboration to integrate data science models seamlessly with simulations; driving internal R&D and product development to refine models and identify new applications; mentoring junior team members; leading communication and presentations to technical teams and customers; and onboarding users and co-developing solutions with customers. The role also involves representing the company as a technical authority at customer sites internationally, collaborating on-site to build solutions, influencing technical direction, shaping future solutions and products, and developing leadership skills.
Senior Platform/DevOps Engineer (Kubernetes-Linux)
Translate business requirements into requirements for AI/ML models; prepare data to train and evaluate AI/ML/DL models; build AI/ML/DL models by applying state-of-the-art algorithms, especially transformers; leverage existing algorithms from academic or industrial research when applicable; test, evaluate, and benchmark AI/ML/DL models and publish the models, data sets, and evaluations; deploy models in production by containerizing them; work with customers and internal employees to refine model quality; establish continuous learning pipelines for models using online or transfer learning; build and deploy containerized applications on cloud or on-premise environments.
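The continuous-learning pipelines mentioned above can be sketched in miniature as an online-learning loop: each incoming sample immediately updates the model rather than waiting for a batch retrain. The sketch below uses a hand-rolled logistic-regression SGD step on a toy data stream; every name is invented, and a production system would use a real training framework.

```python
import math

def sgd_online_step(w, b, x, y, lr=0.1):
    """One online-learning update for logistic regression on a single sample."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))          # predicted probability
    err = p - y                              # gradient of log loss w.r.t. z
    w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    b = b - lr * err
    return w, b

# Consume samples one at a time, as a continuous-learning pipeline would.
w, b = [0.0, 0.0], 0.0
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 0.2], 1), ([0.1, 1.0], 0)]
for _ in range(200):
    for x, y in stream:
        w, b = sgd_online_step(w, b, x, y)
```

After enough passes over this separable toy stream, the first feature drives positive predictions and the second drives negative ones, which is the behavior an online pipeline would continuously verify in production.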
Machine Learning PhDs - AI Trainer
Use machine learning expertise to create domain-relevant questions and review AI-generated responses for accuracy, rigor, and relevance to real-world physics research and practice.
Real Estate, Workplace Programs and User Experience Lead
Lead the research and development of novel deep learning algorithms that enable robots to perform complex, contact-rich manipulation tasks. Explore the intersection of computer vision and robotic control, designing systems that allow robots to perceive and interact with objects in dynamic environments. Create models that integrate visual data to guide physical manipulation, enabling sophisticated handling of diverse items. Collaborate with a multidisciplinary team of engineers and researchers to translate concepts into robust capabilities deployable on physical hardware for industrial applications. Research and develop deep learning architectures for visual perception and sensorimotor control in contact-rich scenarios. Design algorithms for high precision manipulation of complex or deformable objects. Collaborate with software engineers to optimize and deploy research prototypes on robotic hardware. Evaluate model performance in simulation and real-world environments to ensure robustness and reliability. Identify opportunities to apply advancements in computer vision and robot learning to practical industrial problems. Mentor junior researchers and contribute to the technical direction of the manipulation research roadmap.
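The perception-guided control described above can be caricatured as a visual-servoing loop: a controller repeatedly measures the error between a detected target and the end-effector and moves to reduce it. The proportional controller below is a deliberately minimal stand-in, not the team's actual method, and all names are invented.

```python
# Toy visual-servoing loop: proportional control toward a target detected
# in (hypothetical) image coordinates. Purely illustrative.

def servo_to_target(pos, target, gain=0.5, tol=1e-3, max_steps=100):
    """Drive `pos` toward `target`; return final position and steps taken."""
    for step in range(max_steps):
        error = [t - p for t, p in zip(target, pos)]
        if max(abs(e) for e in error) < tol:
            return pos, step
        # Move a fraction of the remaining error each iteration.
        pos = [p + gain * e for p, e in zip(pos, error)]
    return pos, max_steps

final, steps = servo_to_target([0.0, 0.0], [1.0, 2.0])
```

With a gain of 0.5 the error halves each iteration, so convergence to tolerance is geometric; real contact-rich manipulation replaces this fixed-gain rule with learned policies.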
Forward Deployed AI Engineer
Drive the end-to-end technical deployment of Latent Labs models into customer environments, ensuring seamless integration with existing scientific and IT infrastructure. Design and build production-grade API integrations, data pipelines and model-serving infrastructure tailored to each customer’s requirements. Work on-site or embedded with pharma and biotech partners to scope technical requirements, troubleshoot issues and deliver solutions. Ensure deployments meet enterprise standards for security, performance and reliability. Serve as the technical point of contact for assigned customers, building trusted relationships with their scientific and engineering teams, including spending time working on-site at international partner locations as needed. Gather and synthesise customer feedback, translating it into actionable insights for product, research and platform teams. Collaborate with internal teams to shape the product roadmap based on real-world deployment learnings. Create technical documentation, integration guides and best-practice resources for customers. Stay on top of the latest developments in ML infrastructure, model serving and cloud-native tooling. Gain a strong working understanding of protein and cell biology as it relates to the product. Participate in knowledge sharing, including organizing and presenting at internal reading groups.
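A small but typical piece of the customer data pipelines described above is pre-flight validation before records reach a model API. The sketch below validates and normalizes a protein-sequence record; the field names and schema are invented for illustration, since real schemas vary per deployment.

```python
# Hypothetical validation step for a customer data pipeline.
# The 20 standard amino-acid one-letter codes.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def validate_record(record):
    """Return a cleaned record, or raise ValueError with a clear message."""
    seq = record.get("sequence", "").strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    bad = set(seq) - VALID_RESIDUES
    if bad:
        raise ValueError(f"invalid residues: {sorted(bad)}")
    return {"id": record["id"], "sequence": seq}

cleaned = validate_record({"id": "cust-1", "sequence": " mkta "})
```

Failing fast with a descriptive error at the pipeline boundary keeps downstream model-serving infrastructure free of malformed inputs.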
Senior Machine Learning Scientist
The Senior Machine Learning Scientist will train, evaluate, and iterate on ML models and agentic systems for customer feedback, including owning custom fine-tuning pipelines. They will run experiments end-to-end, track results rigorously, and make recommendations on what to ship, iterate, or retire. The role involves building and maintaining LLM-powered features such as retrieval pipelines, reranking systems, insight agents, data mining agents, and automated taxonomy generation. The scientist will design and run robust evaluation frameworks including building test sets, defining metrics, evaluating non-deterministic systems, handling class imbalance, and automating checkpoint comparisons. They will improve and extend semantic search and retrieval methods, write production-quality code, and collaborate closely with Engineering on productionisation, model serving, data pipelines, and monitoring. The role includes working with Product and Commercial teams to translate business needs into practical ML solutions and supporting client evaluations and accuracy benchmarking. Additionally, the scientist will mentor team members, review code and research, and integrate relevant advances from literature into the product.
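The retrieval-then-rerank pattern mentioned above can be sketched in a few lines: a cheap first stage scores all documents by embedding similarity, and a second stage reorders only the top candidates by a richer signal. Everything below (vectors, the freshness field, function names) is invented for illustration; real systems use learned embeddings and cross-encoder rerankers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_then_rerank(query_vec, docs, k=2, rerank_fn=None):
    """First-stage retrieval by cosine similarity, then optional reranking."""
    scored = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    candidates = scored[:k]
    if rerank_fn:
        candidates = sorted(candidates, key=rerank_fn, reverse=True)
    return [d["id"] for d in candidates]

docs = [
    {"id": "a", "vec": [1.0, 0.0], "freshness": 0.2},
    {"id": "b", "vec": [0.9, 0.1], "freshness": 0.9},
    {"id": "c", "vec": [0.0, 1.0], "freshness": 0.5},
]
top = retrieve_then_rerank([1.0, 0.0], docs, k=2, rerank_fn=lambda d: d["freshness"])
```

Because the expensive reranker only sees the top-k candidates, this two-stage design trades a small recall risk for large latency savings, which is exactly what the evaluation frameworks above are meant to quantify.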
Director, Data Center Operations
Responsibilities include advancing inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference; implementing and maintaining changes in high-performance inference engines, including kernel backends and speculative decoding; and profiling and optimizing performance across the GPU, networking, and memory layers to improve latency, throughput, and cost. The role unifies inference with RL and post-training: designing and operating RL and post-training pipelines, making those workloads more efficient with inference-aware training loops, and using the pipelines to train, evaluate, and iterate on frontier models. It involves co-designing algorithms and infrastructure so that objectives, rollout collection, and evaluation are tightly coupled to efficient inference; identifying bottlenecks across the training engine, inference engine, data pipeline, and user-facing layers; and running ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding these insights back into model, RL, and system design. The engineer will own critical systems at production scale by profiling, debugging, and optimizing inference and post-training services under real production workloads, driving roadmap items that require engine modification, and establishing metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Finally, the role provides technical leadership: setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training, and mentoring other engineers and researchers on full-stack ML systems work and performance engineering.
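The speculative-decoding idea referenced above can be illustrated with a toy acceptance loop: a cheap draft model proposes a block of tokens, and the target model verifies them left to right, falling back to its own token at the first mismatch. Both "models" below are stand-in deterministic functions, not real LLMs, and the probabilistic acceptance rule used in practice is simplified to exact-match verification.

```python
# Toy acceptance loop for speculative decoding (greedy, exact-match variant).

def speculative_step(prefix, draft_fn, target_fn, block=4):
    """Accept draft tokens while they match the target; stop at first mismatch."""
    proposal = draft_fn(prefix, block)
    accepted = []
    for tok in proposal:
        if target_fn(prefix + accepted) == tok:
            accepted.append(tok)
        else:
            # First mismatch: emit the target model's own token and stop.
            accepted.append(target_fn(prefix + accepted))
            break
    return accepted

# Stand-in "target model": next token is a function of context length.
target = lambda ctx: len(ctx) % 3
# Stand-in "draft model": agrees with the target on its first two proposals.
draft = lambda ctx, n: [(len(ctx) + i) % 3 if i < 2 else 0 for i in range(n)]

out = speculative_step([7, 8], draft, target)
```

Here three tokens are committed per verification round instead of one, which is the source of speculative decoding's latency win when draft and target models agree often.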
Regional Sales Lead, Singapore
Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, placement and routing (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and machine learning-based techniques, working closely with internal teams such as verification, extraction, timing, Design for Test (DFT), and electronic design automation (EDA) vendors.
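A minimal example of the data-driven QoR modeling mentioned above is fitting a simple predictor from flow data, such as path delay as a function of fanout, and using it to estimate unseen configurations. The numbers below are invented, and production ML for EDA uses far richer features and models than this one-variable least-squares fit.

```python
# Toy data-driven QoR model: least-squares line fit on (made-up) timing data.

def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

fanout = [1, 2, 4, 8]           # hypothetical net fanouts
delay_ps = [12.0, 14.0, 18.0, 26.0]  # hypothetical path delays in picoseconds
slope, intercept = fit_line(fanout, delay_ps)
predicted = slope * 6 + intercept    # estimate delay at fanout 6
```

Even this trivial model shows the pattern: learn from signoff data once, then predict QoR for candidate floorplans or netlists without rerunning the full flow.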
