Applied AI, Forward Deployed Machine Learning Engineer, Critical and Sovereign Institutions, EMEA
The Applied AI Engineer is responsible for the technical design, implementation, and deployment of AI solutions tailored to the needs of critical infrastructure and sovereign institutions. Responsibilities include individually deploying AI solutions into production for use cases with significant operational and strategic impact; developing state-of-the-art GenAI applications specific to sovereign institutions and critical infrastructure; and collaborating closely with researchers, AI engineers, and product teams on complex projects involving advanced fine-tuning, LLM applications, and contributions to open-source codebases. The role also involves participating in pre-sales discussions to understand client needs and provide technical guidance on Mistral's products, and working with product and science teams to improve offerings with a focus on security, compliance, and performance.
AI Evaluation Engineer
Design and implement evaluation frameworks that enable Evaluation-Driven Development for AI systems deployed in customer environments. Define how system quality is measured in each domain, ensuring that evaluation signals reflect real user needs, domain constraints, and business objectives. Build and maintain golden test cases and regression suites in Python, using both human-authored and AI-assisted test generation to capture critical behaviors and edge cases. Develop and maintain evaluation pipelines, offline and online, that integrate directly into system iteration loops, where evaluation results inform prompt design, agent logic, model selection, and release readiness. Define, calibrate, and operate LLM-based graders, aligning automated judgments with expert human assessments; investigate where evaluation signals diverge from real-world outcomes; and refine grading approaches to maintain signal quality as systems and domains evolve. Work closely with Forward Deployed AI Engineers, Architects, Product Engineers, AI Strategists, and domain experts to ensure evaluation frameworks meaningfully guide system development and deployment in production.
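The grader-calibration work described above can be sketched in a few lines of Python. This is a minimal illustration only; `GoldenCase` and `grader_agreement` are hypothetical names, not part of any framework named in the listing:

```python
# Hypothetical sketch: measuring how well an LLM-based grader
# agrees with expert human judgments on a set of golden test cases.
from dataclasses import dataclass


@dataclass
class GoldenCase:
    """One golden test case: an input and its human-authored expected behavior."""
    prompt: str
    expected: str


def grader_agreement(human_labels, grader_labels):
    """Fraction of cases where the automated grader matches the human judgment.

    A low agreement rate signals that the grader needs recalibration
    before its verdicts can gate releases.
    """
    assert len(human_labels) == len(grader_labels), "label lists must align"
    matches = sum(h == g for h, g in zip(human_labels, grader_labels))
    return matches / len(human_labels)
```

In practice the agreement rate would be tracked over time, since the listing notes that grading approaches must be refined as systems and domains evolve.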
AI Factory, Value Engineer
Responsibilities include translating business requirements into requirements for AI/ML models; preparing data to train and evaluate AI/ML/DL models; building AI/ML/DL models using state-of-the-art algorithms, especially transformers; testing and evaluating models and benchmarking quality; publishing models and datasets; deploying models in production by containerizing them; working with customers and internal employees to refine model quality; establishing continuous learning pipelines with online or transfer learning; and building and deploying containerized applications on cloud or on-premise environments.
Agentic Systems Engineer
Build agents as modular, plug-and-play components that slot cleanly into the wider stack. Add memory layers (short-term, long-term, summarization, retrieval-backed) into running systems. Wire up tool integrations, MCP servers, and skills. Own the quality of the features you put out, including tests, evals, and observability. Analyze production traces to understand system behavior and implement fixes accordingly.
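The memory-layer work above can be illustrated with a small Python sketch: a short-term rolling buffer that summarizes turns as they are evicted, so older context survives in compressed form. `ShortTermMemory` and the string-joining summarizer are hypothetical stand-ins (a real system would summarize with a model), not part of any stack named in the listing:

```python
# Hypothetical sketch of a short-term memory layer with summarization on eviction.
from collections import deque


class ShortTermMemory:
    """Keep the last `max_turns` turns verbatim; fold evicted turns into a summary."""

    def __init__(self, max_turns=3, summarize=lambda turns: " | ".join(turns)):
        self.buffer = deque(maxlen=max_turns)
        self.summary = ""
        self._summarize = summarize  # stub; in practice an LLM summarization call

    def add(self, turn):
        # When the buffer is full, fold the oldest turn into the running summary
        # before the deque drops it.
        if len(self.buffer) == self.buffer.maxlen:
            evicted = self.buffer[0]
            self.summary = (
                self._summarize([self.summary, evicted]) if self.summary else evicted
            )
        self.buffer.append(turn)

    def context(self):
        """Assemble the prompt context: compressed summary first, then recent turns."""
        parts = ([f"summary: {self.summary}"] if self.summary else []) + list(self.buffer)
        return "\n".join(parts)
```

Keeping the memory behind a small interface like `add`/`context` is what makes it a plug-and-play component: the agent loop only sees a context string, and the eviction and summarization policy can change underneath it.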
Sr. Applied AI Engineer
Build and evolve shared AI Platform capabilities that serve as the foundation for teams building with machine learning and generative AI across Zapier. Work mostly in TypeScript and Python, focusing on improving LLM Ops and ML Ops capabilities including observability, monitoring, evaluation, deployment workflows, and operational guardrails. Design and implement systems to measure and improve performance, reliability, safety, and cost efficiency of AI-powered experiences. Identify tooling gaps proactively and work across teams to standardize best practices for building, deploying, and monitoring AI-driven experiences. Collaborate closely with engineers across product, infrastructure, and data teams to ensure AI components are reusable, well-documented, and easy to adopt company-wide. Evaluate emerging tools, models, and patterns in the AI ecosystem and help determine which ones should be incorporated into Zapier’s shared platform.
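Operational guardrails of the kind mentioned above, covering latency, cost, and budget limits for AI-powered features, can be sketched as a Python decorator. This is an illustrative sketch only; `with_guardrails` and its parameters are hypothetical names, not Zapier tooling:

```python
# Hypothetical sketch: a guardrail decorator that tracks per-call latency
# and cumulative cost for a model-backed function, recording violations.
import time
from functools import wraps


def with_guardrails(max_latency_s, cost_per_call, budget):
    state = {"cost": 0.0, "violations": []}

    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = fn(*args, **kwargs)
            latency = time.monotonic() - start
            state["cost"] += cost_per_call
            if latency > max_latency_s:
                state["violations"].append(("latency", latency))
            if state["cost"] > budget:
                state["violations"].append(("budget", state["cost"]))
            return result

        wrapper.state = state  # expose metrics for monitoring/alerting
        return wrapper

    return decorator
```

A platform team would typically feed `state` into real observability tooling rather than a dict, but the shape is the same: instrument at the call boundary, then alert on violations.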
Research Engineer
As a Research Engineer at Tandem, you will work on real production use cases of large language models (LLMs) and other machine learning (ML) techniques to solve business problems and create AI applications. You are responsible for developing a deep understanding of the product and its business drivers to operate and drive impact both cross-functionally and independently. Responsibilities include end-to-end project ownership from definition, design, development, and launch through to ensuring impact on user growth, operational efficiency, or revenue. You will scope and lead AI augmentation and automation projects involving classification, data extraction, content generation, search, question answering, process prediction, and LLM-powered bots. You are expected to stay updated on emerging AI methods, guide model and technique adoption decisions, and establish research strategies with rigorous experimentation and evaluation that consider accuracy, consistency, interpretability, and real-world impact. You will develop novel algorithms for natural language processing, data extraction, and autonomous reasoning, and engage actively with clients to understand requirements and deliver solutions. Collaboration with the team and CEO on business decisions balancing growth speed and profitability is also part of the role. Specific tasks include automating complex workflows for insurance and affordability programs, scaling across drug classes and markets by making data pipelines robust and building AI workflows for patients and providers, and assisting biopharma partners through data translation, predictive modeling, and enrollment in studies and programs.
AI Deployment Engineer | Codex
Serve as the primary technical subject matter expert on OpenAI Codex for a portfolio of customers, embedding deeply with them to enable their engineering teams and build coding workflows. Partner directly with customers to design and implement AI-enhanced development workflows, from rapid prototyping through scalable production rollout. Build high-quality demos, reference implementations, and workflow automations, using Codex itself as part of the development process. Lead large-format workshops, technical deep dives, and hands-on enablement sessions that help engineering organizations adopt AI coding tools effectively and safely. Contribute technical content including examples, guides, patterns, and best practices to the OpenAI Cookbook to help the broader developer community accelerate their work with Codex. Gather high-fidelity product insights from real customer deployments and translate them into clear product proposals and model feedback for internal teams. Influence customer strategy and decision-making by framing how AI coding tools fit into their software development lifecycle, technical roadmaps, and organizational workflows. Serve as a trusted advisor on solution architecture, operational readiness, model configuration, security considerations, and best-practice adoption.
Member of Technical Staff (Applied AI Engineer)
The role involves working on custom memory systems that grow and scale with platform usage, developing a custom cutting-edge agent, managing bare metal infrastructure and scaling it with high concurrency and reliability, optimizing cost and output across multiple models, and evaluating large language model (LLM) performance across a wide domain of tasks.
Forward Deployed Engineer - Singapore
Forward Deployed Engineers lead complex end-to-end deployments of frontier models in production alongside strategic customers, owning discovery, technical scoping, system design, build, and production rollout. They operate across multiple deployments from prototype to stable production and build full-stack systems to deliver customer value. They embed closely with customer teams to understand needs, guide adoption, scope work, sequence delivery, and remove blockers; make trade-offs between scope, speed, and quality; and contribute directly in code when necessary. They also codify working patterns into reusable tools and playbooks, share field feedback to help Research and Product teams improve models, and keep teams moving with clarity and follow-through.
Applied AI Inference Engineer
Develop and maintain software systems and product features using one or more general-purpose programming languages in a production-level environment, with a preference for Python due to its relevance in ML projects. Drive customer impact by designing, implementing, and deploying Baseten solutions end-to-end, working with customers’ engineering teams at every stage of the customer journey including sales, implementation, and expansion. Deliver with velocity by turning vague objectives into clear specs and well-defined PoCs to rapidly ship well-tested services and outcomes for customers. Optimize and enhance AI/ML projects, contributing to continuous improvement of the technical stack, including developing features and PRDs with other engineering and product organizations. Own products and customer projects end-to-end, functioning as an engineer, project manager, and product manager with focus on user empathy, project specification, and execution. Navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems while avoiding unnecessary complexity. Demonstrate pride, ownership, and accountability for your work.
