Director, Engineering, Proactive Offense
Lead and scale Horizon3.ai's Offensive Engineering organization, overseeing teams responsible for exploit development, offensive content, and attack automation within the NodeZero platform. Set clear technical and product direction for how NodeZero identifies, exploits, and validates vulnerabilities across large, complex environments. Partner with Product, Precision Defense, and Platform teams to define and deliver offensive capabilities that influence the roadmap and enhance customer outcomes. Drive execution from proof-of-concept through production to transform cutting-edge attack research into scalable, productized features. Stay hands-on to guide architectural decisions and evaluate exploit and automation approaches, mentoring technical leads in building resilient, modular systems. Build, mentor, and scale diverse teams of software engineers, exploit developers, and offensive researchers, fostering a culture of collaboration, creativity, and engineering excellence that bridges offensive and product software development. Collaborate across engineering, product, and GTM teams to align offensive innovation with business priorities and ensure delivery of impactful capabilities for customers. This role is central to the mission of delivering continuous, autonomous security testing at scale.
Technical Lead Manager, Platform (India)
Lead the design and development of a low-latency, scalable, and reliable model inference and serving stack for SSM foundation models. Manage and mentor a team of platform engineers, maintaining a high technical bar and strong engineering culture. Work closely with research and product teams to translate research into products. Own the architecture and roadmap for model serving infrastructure, distributed systems, and data processing platforms. Build highly parallel, high-quality data processing and evaluation infrastructure for foundation model training. Drive execution across ambiguous, zero-to-one engineering projects and platform initiatives. Establish best practices for reliability, observability, scalability, and performance across platform systems. Help recruit, interview, and build the engineering team in India. Have significant autonomy to shape the platform and impact how AI is applied across devices and applications.
IC Agentic Engineering Manager - Stargate
Design and build agent-based systems to support infrastructure deployment and operations. Identify high-impact opportunities to apply agents across workflows such as cluster bring-up and deployment readiness, incident triage and root cause analysis, system validation and health monitoring, and capacity management and operational decision-making. Lead a small team while contributing directly as an individual contributor across system design, development, and integration. Partner with infrastructure, hardware, and networking teams to integrate agentic systems into production workflows. Develop systems that leverage telemetry, logs, and system signals to enable closed-loop automation. Define evaluation frameworks to measure system effectiveness, reliability, and operational impact. Drive iteration from prototype to production, ensuring robustness and scalability.
Senior Platform Engineer, Voice AI
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines that optimize algorithms and systems jointly, making workloads more efficient with inference-aware training loops and techniques such as async RL rollouts and speculative decoding. Use these pipelines to train, evaluate, and iterate on frontier models, and co-design algorithms and infrastructure tightly coupled to efficient inference. Run ablations and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design. Own critical systems at production scale by profiling, debugging, and optimizing inference and post-training services under real workloads. Drive roadmap items requiring engine modification, and establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. Provide technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training, and mentor other engineers and researchers on full-stack ML systems and performance engineering.
Software Engineer, Workload Enablement
Port and validate key inference and training workloads on new platforms/SKUs as they arrive, driving correctness, performance, and stability to an internal readiness bar. Build a suite of benchmarks and stress tests that capture real end-to-end behavior of workloads by exercising all aspects of a system, including CPU, GPU, memory subsystem, frontend, scale-up, and scale-out networking, storage, thermals, and other relevant parts. Conduct deep-dive performance analysis on distributed training and inference focusing on collective performance and tuning, overlap of compute/communication, kernel-level bottlenecks, memory bandwidth, and scheduling effects. Create repeatable test harnesses that run in continuous integration and lab environments producing actionable outputs such as pass/fail, performance scores, and regression detection. Partner with systems and fleet bring-up engineers to ensure the platform is stable, performant, operationally usable, and scalable through containerization, Kubernetes integration, telemetry hooks, and failure triage loops. Work cross-functionally with vendors and internal stakeholders by producing clear bug reports, minimal reproductions, and prioritized issue lists.
Senior Software Engineer, Developer Experience (DevEx)
As a Software Engineer on the Developer Experience team, you will be responsible for creating frameworks and systems that maximize the velocity and efficiency of every engineer at Harvey. You will develop and scale a world-class developer platform to accelerate Harvey's growth, boosting velocity and stability through robust CI/CD systems, effective test frameworks, and reliable development environments. You will build load testing and benchmarking infrastructure essential for evaluating and optimizing the performance of AI-native applications. You will pioneer the future of software development and site reliability engineering by integrating AI agents across the software development, deployment, and maintenance lifecycle. You will collaborate with Backend Platform teams to embed testability, reliability, and observability into the platform, ensuring services built on that foundation are robust and easy to test and maintain. You will work closely with engineering teams to gather feedback, evangelize best practices, and make the “paved road” a reality, empowering every Harvey engineer to move fast with confidence. You will also set the strategic direction and roadmap for scaling developer experience as Harvey expands, contribute strategically to team decision-making, and provide strong technical leadership and mentorship to uphold a high bar for engineering excellence across the team.
Software Engineer, Agent Architecture
Build the core systems that power agents, including the Agent SDK: the orchestration engine, runtime, and primitives that define how agents reason, take actions, and interact with users and systems. Design the agentic loop to build agents that are steerable, verifiable, conversational, and adaptive. Improve retrieval and grounding systems to ensure agents provide accurate and trustworthy responses by effectively retrieving and using knowledge. Build evaluation systems by designing frameworks that allow measurement and improvement of agent quality over time.
Senior Platform Engineer
The Senior Platform Engineer will own and advance the platform infrastructure stack that supports running autonomous agents safely in deployed customer environments. Responsibilities include managing sandboxing, isolation, monitoring, and safe operation of agent workloads at scale, covering execution environments, security boundaries, automated quality assurance, evaluation harnesses, and feedback loops to improve agent reliability. The role also involves working on core infrastructure such as Kubernetes, multi-account AWS, CI/CD, deployment strategies, observability (traces, metrics, logs, alerting, SLOs), disaster recovery, and cost management. Additionally, the engineer will handle security posture tasks including access controls, secrets management, network security, image scanning, dependency auditing, and compliance work such as SOC 2 as required by customers. All infrastructure will be defined, provisioned, and evolved through infrastructure-as-code.
Engineering Leader
As an Engineering Leader at Ema, you will build and lead a high-performance engineering organization by recruiting, hiring, and developing senior engineers across multiple sub-teams including cloud infrastructure, data platform, ML operations, and developer experience. You will establish engineering standards, a code review culture, on-call expectations, and promote a bias-toward-shipping mentality balanced with production rigor. You will coach and grow senior and staff engineers into technical leaders and manage engineering managers as the organization scales. Your responsibilities include setting the 6–18 month platform roadmap in partnership with engineering teams, making critical architectural decisions such as build versus buy and migration strategies, and driving cross-functional alignment with product, ML/AI research, and go-to-market teams. You will own production health for all platform services, including incident response, postmortems, SLO tracking, and capacity planning. Additionally, you will establish and refine engineering practices to maintain fast shipping without compromising reliability, and participate in executive-level reviews related to infrastructure spend, system health, and engineering velocity.
Senior Python Systems Developer - Functional Testing Project
Create functional black-box tests for large codebases in various source languages, create and manage Docker environments to ensure fully reproducible builds and test execution across different platforms, monitor code coverage and configure automated scoring criteria to meet industry benchmark-level standards, and leverage LLM-based tools such as Roo Code and Claude to accelerate development cycles, automate repetitive tasks, and improve overall code quality.
