Machine Learning Engineer
Design, build, and maintain scalable machine learning systems including data ingestion, preprocessing, training, testing, and deployment. Develop and optimize end-to-end ML pipelines encompassing data collection, labeling, training, validation, and monitoring to ensure reliability and reproducibility. Implement robust MLOps practices such as model versioning, experiment tracking, CI/CD for machine learning, and continuous monitoring in production environments. Collaborate with product and engineering teams to integrate and deploy models into real-time products with a focus on efficiency and scalability. Ensure data quality, observability, and performance across all AI systems. Stay current with the latest AI infrastructure, tooling, and research to support ongoing innovation.
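For a concrete flavor of the experiment-tracking and model-versioning work described above, here is a minimal sketch using MLflow as an illustrative tool choice (the posting names no specific stack); the parameter names, metric names, and artifact path are placeholders.

```python
# Minimal experiment-tracking sketch with MLflow (illustrative tool choice;
# the posting does not name a specific stack). Params, metrics, and the
# artifact path are placeholders.
import mlflow

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 64)
    # ... training happens here ...
    mlflow.log_metric("val_accuracy", 0.91)
    # Version the trained model alongside the run, assuming the training
    # step wrote model.pkl to disk.
    mlflow.log_artifact("model.pkl")
```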
Engineering Manager, AI & Data Infrastructure
The Engineering Manager, AI & Data Infrastructure leads the team responsible for the data and inference systems that support agent interactions, including streaming and batch pipelines for analytics and customer telemetry, real-time databases for low-latency behavior, and GPU and model-serving platforms for LLM inference. This role involves building, leading, and developing a high-performing team of data and ML infrastructure engineers through hiring, coaching, and performance management. Responsibilities include owning the technical strategy and roadmap for AI & Data Infrastructure, staying hands-on with design and code reviews, and leading architecture for high-throughput data systems and low-latency inference. The role also involves setting reliability, quality, and cost standards, investing in developer and analyst experience, raising the bar on AI-assisted engineering practices, and partnering with Research, Product Engineering, Platform, and customer-facing teams to deliver data and inference capabilities, including enterprise deployments.
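As one illustration of the streaming-telemetry side of this stack, here is a minimal consumer sketch using kafka-python; the posting names no specific technologies, so the topic, server address, and event fields are hypothetical.

```python
# Minimal streaming-telemetry consumer sketch; kafka-python is an
# illustrative choice, and the topic, servers, and fields are hypothetical.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "agent-telemetry",                      # hypothetical topic name
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for event in consumer:
    record = event.value
    # A real pipeline would fan this out to the analytics sink and the
    # real-time store; here we just inspect a couple of fields.
    print(record.get("agent_id"), record.get("latency_ms"))
```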
Machine Learning Engineer, API Multicloud
The role involves partnering with strategic customers and internal teams to define target model behaviors, diagnose failure modes, and translate real-world needs into training, evaluation, and system requirements. The engineer will build and scale production machine learning systems for model customization, post-training, and fine-tuning-as-a-service workflows. Responsibilities include investigating whether training and customization workflows produce the intended outcomes and identifying necessary changes to data, evaluation, training, or infrastructure to improve performance. The engineer will collaborate with backend and infrastructure engineers to integrate ML capabilities into AWS-native API environments and feed learnings from partner deployments back into the platform by proposing and implementing improvements to post-training systems, tooling, APIs, and developer workflows. The role requires close work with Research and Applied teams to bring model improvements, training workflows, and evaluation best practices into production. Designing systems that allow strategic partners and enterprise customers to safely customize OpenAI models for high-value use cases is also a key responsibility. Additionally, the role involves debugging and improving complex systems spanning model behavior, training data, APIs, distributed infrastructure, and customer-facing product surfaces. The engineer must operate with high ownership in a 0 to 1 environment where requirements are ambiguous, systems are evolving quickly, and reliability matters.
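Since the role centers on fine-tuning-as-a-service for OpenAI models, a minimal sketch of launching a fine-tuning job through the OpenAI Python SDK gives a sense of the workflow; the training file name and base model below are placeholders.

```python
# Minimal sketch of launching a fine-tuning job via the OpenAI Python SDK.
# The file name and base model are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Training data is uploaded first, as JSONL chat transcripts.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # placeholder base model
)
print(job.id, job.status)
```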
AI/ML Engineer, Madrid
Develop, train, and optimize machine learning models for various mobile app features. Research and implement state-of-the-art AI techniques to improve user engagement and app performance. Collaborate with cross-functional teams to integrate AI-driven solutions into applications. Design and maintain scalable ML pipelines, ensuring efficient model deployment and monitoring. Analyze large datasets to derive insights and drive data-driven decision-making. Stay updated with the latest AI trends and best practices, incorporating them into development processes. Optimize AI models for mobile environments to ensure high performance and low latency.
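To illustrate the mobile-optimization point, here is a minimal sketch of one common technique, PyTorch dynamic quantization, applied to a stand-in model; a real mobile pipeline would also export to an on-device format such as TorchScript or Core ML.

```python
# Minimal sketch of shrinking a model for low-latency mobile inference with
# PyTorch dynamic quantization. The architecture is a stand-in.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8  # int8 weights for Linear layers
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, smaller and faster on CPU
```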
IT Engineer
Collaborate directly with the GTM team including Account Executives and Solutions Architects to ensure smooth integration and successful deployment of machine learning solutions. Build and present compelling demonstrations and proof of concepts that showcase AI technology capabilities. Design, develop, and deploy end-to-end AI-powered applications tailored to customer needs. Contribute to the internal machine learning platform by adding features and fixing bugs. Integrate and enable new machine learning models into the existing platform or client environments. Improve system performance, efficiency, and scalability of deployed models and applications. Work closely with partners to enable joint AI solutions and ensure seamless collaboration.
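As a sketch of what deploying an end-to-end AI-powered application can look like at its smallest, here is an illustrative FastAPI endpoint; the framework choice is an assumption and the prediction logic is a stand-in for a real model call.

```python
# Minimal sketch of exposing a model behind an HTTP endpoint with FastAPI
# (illustrative framework choice; the predict logic is a placeholder).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # A real deployment would invoke the hosted model here.
    score = min(len(req.text) / 100.0, 1.0)  # stand-in for model inference
    return {"score": score}

# Run with: uvicorn app:app --reload
```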
Freelance n8n Workflow Developer - AI Trainer
Design, build, and evaluate advanced workflows in self-hosted n8n environments. Architect multi-system integrations for scalable automation pipelines. Develop and optimize AI-powered workflows such as content generation, automation pipelines, and enrichment systems. Build and maintain lead generation, outreach, and data processing automation systems. Implement web scraping workflows and ensure reliable data extraction and processing. Optimize workflow execution, node sequencing, and error handling to prevent failures, delays, and API timeouts.
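The error-handling requirement above usually comes down to retries with backoff around flaky APIs; here is a minimal Python sketch of that pattern (the endpoint URL is hypothetical, and in n8n this logic could live in a Code node or an external service).

```python
# Minimal retry-with-exponential-backoff sketch for unreliable API calls.
# The URL is hypothetical.
import time
import requests

def fetch_with_retries(url: str, attempts: int = 4, timeout: float = 10.0):
    for attempt in range(attempts):
        try:
            resp = requests.get(url, timeout=timeout)
            resp.raise_for_status()
            return resp.json()
        except (requests.Timeout, requests.HTTPError):
            if attempt == attempts - 1:
                raise  # exhausted retries: surface the failure
            time.sleep(2 ** attempt)  # backoff: 1s, 2s, 4s, ...

data = fetch_with_retries("https://api.example.com/leads")
```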
Senior Machine Learning Engineer
As a Senior Machine Learning Engineer, you will lead technical scoping and architectural decisions for high-impact machine learning systems. Design and build production-grade ML software, tools, and scalable infrastructure. Define and implement best practices and standards for deploying machine learning at scale across the business. Collaborate with engineers, data scientists, product managers, and commercial teams to solve critical client challenges and capture new opportunities. Act as a trusted technical advisor to customers and partners, translating complex concepts into actionable strategies. Mentor and develop junior engineers while actively shaping the team's engineering culture and technical depth.
Staff Analytics Engineer — Data Warehouse
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines where most cost is inference, jointly optimizing algorithms and systems. Make RL and post-training workloads more efficient with inference-aware training loops, async RL rollouts, and speculative decoding. Use these pipelines to train, evaluate, and iterate on frontier models. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation with efficient inference, identifying bottlenecks across the training engine, inference engine, data pipeline, and user-facing layers. Run ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights into model, RL, and system design. Profile, debug, and optimize inference and post-training services under real production workloads. Drive roadmap items requiring engine modification such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Provide technical leadership to set direction for cross-team efforts in inference, RL, and post-training and mentor engineers and researchers on full-stack ML systems work and performance engineering.
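Speculative decoding recurs throughout this description; below is a toy sketch of its greedy-verification variant with stand-in model callables, not any particular engine's implementation. Production engines verify with probability ratios and score all drafted positions in a single batched target pass.

```python
# Toy sketch of greedy-verification speculative decoding: a cheap draft
# model proposes k tokens, the target model checks them, and the first
# mismatch truncates the accepted prefix. Both models are stand-in
# callables; real engines verify with probability ratios, not argmax
# equality, and batch the target's verification into one forward pass.
from typing import Callable, List

def speculative_step(
    prefix: List[int],
    draft_next: Callable[[List[int]], int],    # cheap model: next-token guess
    target_next: Callable[[List[int]], int],   # target model: next-token choice
    k: int = 4,
) -> List[int]:
    # 1. Draft k tokens autoregressively with the cheap model.
    drafted = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2. Verify against the target model. This toy calls the target once
    #    per position for clarity; a real engine batches these.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in drafted:
        choice = target_next(ctx)
        if choice == tok:
            accepted.append(tok)     # draft guessed right: token is free
            ctx.append(tok)
        else:
            accepted.append(choice)  # first mismatch: take target's token, stop
            break
    else:
        accepted.append(target_next(ctx))  # all k accepted: one bonus token
    return prefix + accepted

# Trivial stand-ins: draft always guesses 0, target emits context-length parity.
print(speculative_step([1, 2, 3], lambda ctx: 0, lambda ctx: len(ctx) % 2))
```

The payoff is that every accepted draft token replaces a full target-model decode step, so throughput improves whenever the draft model agrees with the target often enough to cover its own cost.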
US Sales and Partnerships Lead, Digital Diagnostics
Lead the team responsible for the AI/ML Stack infrastructure that bridges ML research and production, evolving the stack to meet the needs of large-scale ML training and inference workloads. Develop and execute a long-term vision and roadmap for the MLOps team to support ML development and deployment needs across business units, balancing short-term deliveries with long-term architectural transformation. Lead and mentor a team of 6-7+ engineers, and strategically allocate resources between support work and strategic initiatives. Collaborate cross-functionally with leaders in machine learning, data science, product engineering, and infrastructure to identify pain points, address bottlenecks, and facilitate deployment of new solutions. Architect compute and storage pipelines that manage millions of slides and complex artifacts without data fragmentation or latency. Modernize the AI product inference stack to support substantial growth in AI runs globally. Work with Site Reliability Engineering to establish comprehensive system observability metrics, including compute utilization, network bottlenecks, and cost attribution. Conduct build-versus-buy assessments and lead stack-refresh audits that benchmark proprietary tools against commercial and open-source alternatives.
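For the observability deliverable, here is a minimal sketch of exporting such metrics, using prometheus_client as an illustrative choice; the metric names and port are hypothetical.

```python
# Minimal sketch of exporting observability metrics with prometheus_client
# (illustrative choice; metric names and port are hypothetical).
from prometheus_client import Counter, Gauge, start_http_server

GPU_UTILIZATION = Gauge("gpu_utilization_ratio", "Fraction of GPU in use")
AI_RUNS_TOTAL = Counter("ai_runs_total", "Completed AI inference runs")

start_http_server(9100)  # expose /metrics for scraping

def record_run(gpu_util: float) -> None:
    GPU_UTILIZATION.set(gpu_util)
    AI_RUNS_TOTAL.inc()
```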
