Staff Software Engineer, Foundations (Managed AI)
As a Staff Software Engineer in the Foundations department, responsibilities include leading the design and implementation of highly scalable systems for the Managed AI offerings and driving the long-term technical roadmap for the Foundations team to support growth and evolving AI workloads. The role involves working cross-functionally with Cloud Engineering to align technical goals and solve integration challenges, and leading by example through high-quality code contributions and mentorship of Senior and Staff-level engineers. It also calls for championing reliability, observability, and performance by identifying and resolving systemic bottlenecks, and staying current with AI infrastructure trends so that efficient and powerful tools are used.
Product Engineer
Implement and integrate AI functionality into key product features. Craft, tune, and iterate on prompts to improve LLM reliability and usefulness for real-world use cases and developer workflows. Build AI-powered flows that feel intuitive and responsive to developers, and evaluate and test AI outputs to ensure performance and accuracy. Work alongside engineers to deliver robust, production-grade code, and stay current with LLM tools, APIs, and best practices. Translate product needs into technical AI implementations and deliver reliable, high-quality AI-powered product experiences. Collaborate closely with engineers and researchers, and contribute across frontend, backend, and integration layers.
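The prompt evaluation and testing work described above often starts with a small harness that scores model outputs against expected answers. The sketch below is illustrative: the model callable and exact-match scoring are assumptions, not a specific product's API.

```python
# Minimal sketch of a prompt-evaluation harness: run each test case through
# a model callable and score outputs by normalized exact match.
# `model_fn` is a stand-in for a real LLM API call (an assumption).

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't fail a case."""
    return " ".join(text.lower().split())

def evaluate_prompt(model_fn, cases):
    """cases: list of (input, expected) pairs; returns (pass rate, failures)."""
    failures = []
    for prompt, expected in cases:
        output = model_fn(prompt)
        if normalize(output) != normalize(expected):
            failures.append((prompt, expected, output))
    pass_rate = 1 - len(failures) / len(cases)
    return pass_rate, failures
```

In practice the exact-match scorer would be swapped for task-appropriate metrics (semantic similarity, rubric grading, or model-based judging), but the harness shape stays the same.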
Senior Platform/DevOps Engineer (Kubernetes-Linux)
Translate business requirements into requirements for AI/ML models; prepare data to train and evaluate AI/ML/DL models; build AI/ML/DL models by applying state-of-the-art algorithms, especially transformers; leverage existing algorithms from academic or industrial research when applicable; test, evaluate, and benchmark AI/ML/DL models and publish the models, data sets, and evaluations; deploy models in production by containerizing them; work with customers and internal employees to refine model quality; establish continuous learning pipelines for models using online or transfer learning; build and deploy containerized applications on cloud or on-premise environments.
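The continuous-learning pipelines mentioned above hinge on models that can be updated example-by-example. A minimal sketch, assuming plain logistic regression with SGD (the feature dimensions and learning rate are illustrative, not tied to any particular deployment):

```python
# Toy sketch of an online-learning update for continuous model refinement:
# logistic regression trained one example at a time with stochastic gradient descent.
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def online_update(weights, x, y, lr=0.1):
    """One SGD step on a single (x, y) example; mutates and returns weights."""
    pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
    error = pred - y  # gradient of log-loss with respect to the logit
    for i, xi in enumerate(x):
        weights[i] -= lr * error * xi
    return weights
```

Production pipelines would wrap an update like this with drift-aware triggers and evaluation gates before promoting a refreshed model, but the per-example update is the core of the online-learning loop.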
Tech Lead Manager, Data Infrastructure
The Tech Lead Manager, Data Infrastructure at Cartesia is responsible for defining the overall multi-modal data strategy across pre-training and post-training, including human, synthetic, and web-scale data sources. They lead, manage, and mentor a team of data engineers and specialists. They design and oversee the construction of robust, scalable data pipelines for text, audio, and video, establish and enforce rigorous standards for data quality across the organization, deeply understand how data affects model capability and proactively identify and source novel datasets, and manage relationships and budgets with external data vendors and partners.
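Rigorous data-quality standards of the kind described are commonly enforced with record-level validators at pipeline ingestion. This sketch is a hypothetical example; the field names and bounds are invented for illustration, not an actual schema.

```python
# Sketch of a record-level data-quality gate for an ingestion pipeline.
# The required fields and valid values below are illustrative assumptions.

REQUIRED_FIELDS = {"id", "modality", "duration_s"}  # hypothetical schema
VALID_MODALITIES = {"text", "audio", "video"}

def validate_record(record: dict) -> list:
    """Return a list of human-readable violations; an empty list means the record passes."""
    violations = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        violations.append(f"missing fields: {sorted(missing)}")
    if record.get("modality") not in VALID_MODALITIES:
        violations.append(f"unknown modality: {record.get('modality')!r}")
    duration = record.get("duration_s")
    if not isinstance(duration, (int, float)) or duration <= 0:
        violations.append("duration_s must be a positive number")
    return violations

def filter_batch(records):
    """Split a batch into (clean, rejected) lists for downstream pipelines."""
    clean, rejected = [], []
    for r in records:
        (rejected if validate_record(r) else clean).append(r)
    return clean, rejected
```

Keeping the rejected records (with their violation messages) rather than silently dropping them is what makes quality standards auditable across an organization.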
Director, Engineering, Proactive Offense
Lead and scale Horizon3.ai's Offensive Engineering organization, overseeing teams responsible for exploit development, offensive content, and attack automation within the NodeZero platform. Set clear technical and product direction for how NodeZero identifies, exploits, and validates vulnerabilities across large, complex environments. Partner with Product, Precision Defense, and Platform teams to define and deliver offensive capabilities that influence the roadmap and enhance customer outcomes. Drive execution from proof-of-concept through production to transform cutting-edge attack research into scalable, productized features. Stay hands-on to guide architectural decisions and evaluate exploit and automation approaches, mentoring technical leads in building resilient, modular systems. Build, mentor, and scale diverse teams of software engineers, exploit developers, and offensive researchers, fostering a culture of collaboration, creativity, and engineering excellence that bridges offensive and product software development. Collaborate across engineering, product, and GTM teams to align offensive innovation with business priorities and ensure delivery of impactful capabilities for customers. This role is central to the mission of delivering continuous, autonomous security testing at scale.
Agentic Solution Engineer
Partner with Account Executives to discover and scope customer challenges, designing high-value technical solutions that showcase the ROI of Netomi’s platform. Architect and build agentic workflows that integrate generative AI with APIs, databases, and enterprise tools to power experiences for customers' end users. Develop custom demonstrations, prototypes, and proofs of concept using the Netomi platform tailored to specific clients' use cases. Design, test, and refine prompts and AI orchestration chains to optimize performance, reasoning, and reliability across varied use cases. Communicate complex technical concepts clearly and persuasively to audiences ranging from C-level executives to hands-on engineers. Collaborate with product and engineering teams, contributing insights from customer engagements to inform roadmap priorities. Document and present solution designs, workflows, and technical configurations for both internal and client-facing reference.
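Agentic workflows like those described typically route model decisions to tool handlers in a loop. The sketch below is a toy illustration; the tool registry, action format, and policy interface are assumptions, not the actual platform API.

```python
# Toy sketch of an agentic tool-dispatch loop: a policy (standing in for an LLM)
# emits actions as ("tool", name, arg) or ("final", answer), and the orchestrator
# executes tools and feeds observations back until an answer is produced.

TOOLS = {
    # Hypothetical tools; real integrations would call APIs, databases, or enterprise systems.
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "search_faq": lambda q: f"No FAQ entry found for {q!r}",
}

def run_agent(policy, query, max_steps=5):
    """policy(query, observations) -> ("tool", name, arg) or ("final", answer)."""
    observations = []
    for _ in range(max_steps):
        action = policy(query, observations)
        if action[0] == "final":
            return action[1]
        _, name, arg = action
        observations.append(TOOLS[name](arg))
    raise RuntimeError("agent exceeded step budget")
```

The step budget matters in production: it bounds cost and latency when the model fails to converge on an answer.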
Salesforce Technical Architect
The Salesforce Technical Architect is responsible for designing and delivering scalable, enterprise-grade cloud solutions across the Salesforce ecosystem, including Sales Cloud, Service Cloud, Experience Cloud, Data Cloud, and Agentforce AI agents. The role includes architecting AI-enabled workflows and autonomous agents, designing enterprise data strategies leveraging Salesforce Data Cloud, and defining scalable enterprise integration architectures using MuleSoft or other iPaaS platforms. The architect provides technical leadership for development teams working with Apex, Lightning Web Components, Visualforce, and Salesforce automation frameworks, and may contribute hands-on when necessary. They ensure solutions are high performance, scalable, secure, and capable of handling large data volumes. The role involves partnering with Delivery Managers, consultants, and client stakeholders to translate business requirements into technical architectures, leading design and code reviews, and mentoring developers and consultants. Additionally, the architect evaluates and implements Salesforce AI capabilities, drives innovation initiatives, serves as a technical advisor to clients on architecture and AI adoption, contributes to internal architecture standards and reusable frameworks, and supports presales and solutioning efforts for complex Salesforce engagements.
US Sales and Partnerships Lead, Digital Diagnostics
Lead the team responsible for the AI/ML Stack infrastructure that bridges ML research and production, evolving the stack to meet large-scale ML training and inference workload needs. Develop and execute a long-term vision and roadmap for the MLOps team to support ML development and deployment needs across business units, managing short-term deliveries and long-term architectural transformation. Lead and mentor a team of 6-7+ engineers, and strategically allocate resources between support work and strategic initiatives. Collaborate cross-functionally with leaders in machine learning, data science, product engineering, and infrastructure to identify pain points, address bottlenecks, and facilitate deployment of new solutions. Architect compute and storage pipelines to manage millions of slides and complex artifacts without data fragmentation or latency. Modernize the AI product inference stack to support substantial growth in AI runs globally. Work with Site Reliability Engineering to establish comprehensive system observability metrics, including compute utilization, network bottlenecks, and cost attribution. Conduct build-versus-buy assessments and lead stack refresh audits to benchmark proprietary tools against commercial and open-source alternatives.
Machine Learning Engineer
Design, develop, and deploy end-to-end machine learning pipelines, ensuring efficiency in training, validation, and inference. Implement MLOps best practices, including CI/CD for ML models, model versioning, monitoring, and retraining strategies. Optimize ML models using feature engineering, hyperparameter tuning, and scalable inference techniques. Work with structured and unstructured data, leveraging Pandas, NumPy, and SQL for efficient data manipulation. Apply machine learning design patterns to build modular, reusable, and production-ready models. Collaborate with data engineers to develop high-performance data pipelines for training and inference. Deploy and manage models on cloud platforms (AWS, GCP, Azure) with containerization and orchestration tools like Docker and Kubernetes. Maintain model performance by implementing continuous monitoring, bias detection, and explainability techniques.
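Continuous monitoring of the kind mentioned above often begins with a simple drift statistic. This sketch compares a live batch's feature mean against a training-reference distribution; the z-score threshold is an illustrative choice, and real systems typically add distribution-level tests.

```python
# Minimal drift check: flag a feature when the live batch mean deviates from
# the training-reference mean by more than `threshold` standard errors.
import math
import statistics

def detect_drift(reference, live, threshold=3.0):
    """Return True if the live batch's mean is an outlier relative to the reference."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    live_mean = statistics.fmean(live)
    # Standard error of the live-batch mean, assuming the reference distribution holds
    se = ref_std / math.sqrt(len(live))
    return abs(live_mean - ref_mean) / se > threshold
```

A check like this would run per feature on each inference batch, with alerts feeding the retraining strategy the role describes.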
Lead/Manager Site Reliability Engineering Team (Amsterdam)
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems and Together's inference stack, including kernel backends, speculative decoding methods like ATLAS, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Unify inference with RL/post-training by designing and operating RL and post-training pipelines where inference constitutes the majority of the cost, optimizing algorithms and systems jointly. Enhance RL and post-training workloads with inference-aware training loops, including asynchronous RL rollouts and speculative decoding techniques, making large-scale rollout collection and evaluation more efficient. Use these pipelines to train, evaluate, and iterate on cutting-edge models based on the inference stack. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation to efficient inference, and identify bottlenecks across training engines, inference engines, data pipelines, and user-facing layers quickly. Run ablation and scale-up experiments to analyze trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design. Own critical production-scale systems by profiling, debugging, and optimizing inference and post-training services under real production workloads. Lead roadmap initiatives necessitating engine modifications such as changes to kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. 
Provide technical leadership by setting direction for cross-team efforts at the intersection of inference, RL, and post-training and mentor engineers and researchers on full-stack ML systems work and performance engineering.
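The metrics and benchmarking frameworks described above usually start from a small harness reporting latency percentiles. The sketch below times an arbitrary callable; the workload is a stand-in for a real inference request, and wall-clock timing like this ignores queueing and batching effects.

```python
# Sketch of a latency micro-benchmark: run a callable N times and report
# p50/p95/p99 latencies in milliseconds using nearest-rank percentiles.
import time

def percentile(sorted_samples, p):
    """Nearest-rank percentile over pre-sorted samples."""
    k = max(0, min(len(sorted_samples) - 1, round(p / 100 * len(sorted_samples)) - 1))
    return sorted_samples[k]

def benchmark(fn, iterations=100):
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {p: percentile(samples, p) for p in (50, 95, 99)}
```

Tail percentiles (p95/p99) matter more than means for serving systems, since a small fraction of slow requests dominates user-perceived latency.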
