Product Manager, Agent Harness & Modelling
Define and own the roadmap for North's agent harness, including the agent loop, context engineering layer, tool orchestration, sandbox execution, and sub-agent delegation. Serve as the primary interface between North engineering and Cohere's Modeling team, ensuring that new harness capabilities are validated before they are built and that neither team's choices limit future possibilities. Own North's agentic evaluation framework, ensuring evaluations are compatible with both the North harness and Modeling's training infrastructure so that the framework serves as a reliable bridge between product and research. Engage enterprise customers to identify real-world agentic failures and translate findings into product and model requirements. Stay current with the open-source and commercial agent ecosystem and drive adoption decisions that align North's architecture with emerging standards.
Data Scientist (Python & SQL) - Freelance AI Trainer
As a Data Science AI Trainer at Mindrift, you will design original computational data science problems that simulate real-world analytical workflows in various industries such as telecom, finance, government, e-commerce, and healthcare. You will create problems that require Python programming using libraries like pandas, numpy, scipy, sklearn, statsmodels, matplotlib, and seaborn, ensuring these problems are computationally intensive and cannot be solved manually within reasonable timeframes. Your tasks include developing problems that require non-trivial reasoning chains involving data processing, statistical analysis, feature engineering, predictive modeling, and insight extraction. You will create deterministic problems with reproducible answers based on real business challenges, including customer analytics, risk assessment, fraud detection, forecasting, optimization, and operational efficiency. You will design end-to-end problems spanning the entire data science pipeline from data ingestion to deployment considerations, incorporating big data processing scenarios that require scalable computational approaches. You will verify solutions using Python with standard data science libraries and statistical methods, and you will document problem statements clearly with realistic business contexts while providing verified correct answers.
Automotive Engineering & Python Expert - Freelance AI Trainer
Contributors may design graduate- and industry-level automotive engineering problems grounded in real practice; evaluate AI-generated solutions for correctness, assumptions, and engineering logic; validate analytical or numerical results using Python (NumPy, SciPy, Pandas); improve AI reasoning to align with first principles and accepted engineering standards; and apply structured scoring criteria to assess multi-step problem solving.
Freelance Machine Learning Engineer
As a Machine Learning expert at Mindrift, you will design original computational STEM problems that simulate real scientific workflows, create problems requiring Python programming to solve, ensure problems are computationally intensive and cannot be solved manually within reasonable timeframes, develop problems that require non-trivial reasoning chains and creative problem-solving approaches, verify solutions using Python with standard libraries such as numpy, pandas, scipy, and sklearn, and document problem statements clearly while providing verified correct answers. You will collaborate on projects aimed at advancing GenAI models to address specialized questions and achieve complex reasoning skills.
Machine Learning Developer (Freelance)
As a Machine Learning expert at Mindrift, you will design original computational STEM problems simulating real scientific workflows that require Python programming to solve. You are expected to create problems that are computationally intensive and cannot be solved manually within reasonable timeframes (days/weeks), develop problems requiring non-trivial reasoning chains and creative problem-solving approaches, verify solutions using Python with standard libraries such as numpy, pandas, scipy, and sklearn, and document problem statements clearly while providing verified correct answers.
Statistics Expert (Python) - Freelance AI Trainer
Design rigorous statistics problems reflecting professional practice; evaluate AI solutions for correctness, assumptions, and constraints; validate calculations or simulations using Python (NumPy, Pandas, SciPy, Statsmodels, and Scikit-learn); improve AI reasoning to align with industry-standard logic; apply structured scoring criteria to multi-step problems.
GTM Engineer
The GTM Engineer is responsible for building internal systems that power how the company identifies demand, engages buying groups, accelerates deals, and scales revenue. This includes designing and shipping signal infrastructure, agent workflows, and orchestration tooling that convert GTM data into automated actions such as account intelligence, buying-stage detection, SDR alerts, personalization, and deal acceleration. Specific tasks include building GTM signal infrastructure to score ICP fit, map buying committees, and track engagement across accounts; capturing intent and engagement signals at scale; detecting buying stages and deal health; orchestrating automated GTM actions based on live engagement signals; building internal tooling and agent workflows to automate manual GTM workflows; partnering with marketing, sales, and revenue operations to translate GTM strategy into scalable automation systems; and maintaining data quality and governance across GTM systems.
Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases that push AI coding systems to their limits. Review and refine realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources. Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks. Craft fair but hard challenges where the AI has all the context it needs but must work for it, involving information scattered across files and external sources and requiring complex reasoning. Analyze AI failures to understand what the model struggles with versus what it masters. Iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases that push AI coding systems to their limits by reviewing and refining realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources. Write comprehensive functional tests that validate actual end-to-end behavior and edge cases. Craft challenges that are fair but hard, where the AI has all the context it needs but must reason through information scattered across files and external sources. Analyze AI failures to understand the model's struggles and strengths. Iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Software Engineer, AI Platform
Design and build abstractions and platform-level systems that improve all of Harvey’s agentic products; own infrastructure for model integration, routing, and evaluation that helps Harvey choose and deploy the right foundation model for any given context; build evaluation frameworks and tooling that let every team across Harvey iterate on AI quality effectively; partner closely with product engineering teams, PMs, and design to launch cutting-edge AI products; evaluate, prototype, and integrate the latest advancements in AI and agentic systems as they emerge.
