The AI job market moves fast. We keep up so you don't have to.
Fresh roles added daily, reviewed for quality — across every corner of the AI ecosystem.
New AI Opportunities
Showing 61 – 79 of 79 jobs
Clinical AI Engineer
Heidi Health
201-500
Australia
Full-time
Remote
false
Who We Are
Healthcare needs a better rhythm: one that keeps care continuous and deeply human. Heidi is building an AI Care Partner that works alongside clinicians to make that possible.
We’re a team of doctors, engineers, designers, researchers, and creatives building tools that help clinicians stay focused on what matters most: their patients.
In just 18 months, Heidi has given back more than 18 million hours to healthcare professionals, supporting 73 million patient visits in 116 countries. Today, more than two million patient visits each week are powered by Heidi worldwide.
Backed by nearly $100 million in funding, we’re growing in the US, UK, Canada, and Europe, partnering with leading health systems including the NHS, Beth Israel Lahey Health, and Monash Health.
The Role
Working closely with the Product Lead, you will be an Engineer who operates at the intersection of core product development and clinical application.
This role requires formal medical training and real clinical experience. Your clinical background will directly inform how we design, evaluate, and ship AI features that support real-world care delivery. Experience working on clinical AI products is highly valued, as you’ll be shaping systems that must perform safely in production environments.
What you’ll do:
- Build end-to-end AI features: Architect and ship fullstack solutions (from React frontends to Python backend services) that leverage our voice AI and LLMs to automate clinical workflows.
- Operationalize Voice AI: Implement and fine-tune audio processing pipelines, ensuring our Automatic Speech Recognition (ASR) and LLM agents perform accurately in diverse, real-world medical environments.
- Bridge the gap between model and product: Translate complex feedback from clinicians into technical solutions, rapidly prototyping and deploying improvements to model behavior, prompting strategies, and audio handling.
- Optimise for real-time interaction: Tune fullstack performance to handle real-time audio streaming and token generation, minimizing latency so clinicians have a seamless conversational experience.
- Partner with implementation and clinical teams: Shorten the feedback loop by shipping critical integrations and feature requests from concept to production in days, not quarters.
What we're looking for:
- Mastery of Fullstack fundamentals: You are equally proficient in Python and modern frontend frameworks (React/TypeScript), capable of owning a feature from the database schema to the UI interaction.
- Medical degree with clinical experience, and ideally experience working on clinical AI products
- Applied AI & Voice fluency: You have a working knowledge of LLM integration (RAG, prompt engineering) and audio technologies (ASR, speech processing) and know how to build around their probabilistic nature.
- Pragmatic problem solving: You balance engineering purity with the need for speed; you know when to build a robust system and when to ship a tactical solution to unblock a customer.
- Cloud fluency (AWS or GCP): You can spin up your own infrastructure (containers, serverless functions) and manage CI/CD pipelines to get your code into the hands of users independently.
- Rigorous testing in production: You understand that "works on my machine" isn't enough; you implement observability and feedback loops to monitor how your AI features perform in the wild.
The Way We Work
1. Build to Last: We design for safety and reliability so clinicians, patients, and our teams can trust what we build every day.
2. Own Your Practice: Ideas rise on merit, not title, and everyone shares responsibility for the standards we set together.
3. Move Fast, Stay Steady: We move quickly but never at the cost of trust. Progress only matters if people can depend on what we make.
4. Make Others Better: Honest feedback, steady support, and shared growth keep our teams improving together.
Why you will flourish with us
- Flexible hybrid working environment, with 3 days in the office.
- A generous personal development budget of $500 per annum
- Learn from some of the best engineers and creatives, joining a diverse team
- Become an owner, with shares (equity) in the company; if Heidi wins, we all win
- The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups
- If you have an impact quickly, the opportunity to fast track your startup career!
Heidi is dedicated to creating an equitable, inclusive, and supportive work environment that brings people together from diverse backgrounds, experiences, and perspectives. Our strength is in our differences. We're proud to be an equal opportunity employer and welcome all applicants as we're committed to promoting a culture of opportunity for all.
2026-03-19 23:31
Senior Systems Performance Engineer
Crusoe
501-1000
$172,500 – $210,000
United States
Full-time
Remote
false
Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack (from electrons to tokens) to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.
We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.
We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved: people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.
If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.
Senior Systems Performance Engineer
San Francisco, Sunnyvale (Onsite)
Role Mission
At Crusoe, we are pioneering the future of sustainable computing. We are seeking a Senior Performance Engineer to serve as a technical lead for the end-to-end hardware evaluation, reliability, and scaling of our AI infrastructure. You will be responsible for defining the performance roadmap of our next-generation cloud, ensuring that our SOTA (State-of-the-Art) AI models run with peak efficiency across diverse hardware architectures.
What You’ll Be Working On:
- Architectural Strategy: Lead the evaluation and establishment of New Product Introduction (NPI) across varied hardware architectures, focusing on Bare Metal and VM environments.
- Full-Stack Optimization: Conduct deep-dive performance evaluations and workload characterizations across compute, memory, storage, and networking.
- Performance Modeling: Develop sophisticated multi-variable projection models and frameworks to analyze system design options through KPI tradeoffs, such as Power and TCO (Total Cost of Ownership).
- Hardware-Software Co-Design: Collaborate with external vendors to drive platform customization and optimize server/AI architectures for maximum performance-per-TCO.
- Infrastructure Scaling: Design and implement 0-to-1 performance methodologies that allow the team to scale evaluation processes for large-scale GPU/AI data centers.
- Industry Leadership: Actively engage in industry research and contribute technical insights to consortiums and standards committees to influence future hardware roadmaps.
What You’ll Bring to the Team:
- 5+ years of experience in end-to-end hardware evaluation, reliability, and scaling of AI infrastructure
- Large-Scale Systems: Proven experience in building and optimizing AI application systems for large-scale GPU infrastructure.
- Architecture & Microarchitecture: Deep knowledge of x86 and ARM architectures, including competitive analysis of microarchitecture and performance-based validation.
- Programming & Tooling: Expert-level proficiency in Python and C++. Experience with cycle-accurate simulators and hardware debuggers like Lauterbach Trace32 or ARM DS-5 is essential.
- Low-Level Systems: Ability to write and debug ARMv8 assembly, implement data synchronization protocols (MESI/MOESI), and analyze RTL via simulation waveforms.
- Security & HPC: Experience with performance modeling for secure environments (e.g., Intel SGX, TDX, VM Encryption) and high-performance computing benchmarks.
Compensation:
Compensation will be paid in the range of $172,500 - $210,000. Restricted Stock Units are included in all offers. Compensation to be determined by the applicant’s education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.
Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
2026-03-19 17:46
Full Stack Engineer
Agent
201-500
$2,800 – $3,500 / month
Argentina
Full-time
Remote
false
Role: Full Stack Engineer (Mid-Level)
Location: Latam - 100% Remote
Schedule: Flexible; US business hours overlap required.
Compensation: $2,800–$3,500/month
About the Company
Our client is a property management company building their own modern software platform to streamline how real estate owners, property managers, tenants, and service workers manage properties. From lease payments and digital keys to service requests and financial integrations, their platform simplifies the property management experience for everyone involved.
They are a small, fast-moving team with big ambitions. You’ll work closely with the CTO and other internal stakeholders, and have a direct impact on the evolution of the platform.
About the Role
This is a hands-on engineering position focused on platform development with secondary responsibility for building AI agents to automate internal organizational processes. This role offers mentorship from a top-tier developer while contributing to a platform that will serve hundreds of properties.
Key Responsibilities
Platform Development
- Build and maintain features for the web-based property management platform using TypeScript, React, Node.js, PostgreSQL, and AWS.
- Contribute to a monorepo architecture, working within two-week sprint cycles to deliver high-quality code.
- Implement critical integrations including DocuSign (lease signing), Plaid (bank verification), Stripe (payment processing), and ownership group payout systems.
- Optimize platform performance and user experience, replacing a legacy system plagued by 20+ second load times and inefficient manual workflows.
AI & Automation
- Build and integrate AI agents using Claude and other AI APIs to automate slow organizational processes.
- Develop both API integrations on top of existing tools and custom agents built from scratch.
- Collaborate with the CEO on identifying and prioritizing automation opportunities across the organization.
Problem-Solving & Initiative
- Take ownership of assigned tasks and independently research and implement solutions when faced with unknown technologies or challenges.
- Proactively identify improvements and implement them with team guidance.
- Contribute ideas and solutions to platform architecture and development priorities.
Must-Have Qualifications
- 2-4 years of professional software development experience.
- Strong proficiency in TypeScript, React, and Node.js.
- Advanced English language proficiency.
- Experience with PostgreSQL or similar relational databases.
- Familiarity with REST API design and integration.
- Exposure to Claude, n8n, or other tools for building agents or automations.
- Understanding of AWS (EC2, Lambda, RDS, or similar services).
- Experience working in agile/sprint-based environments.
- Excellent problem-solving skills and ability to learn independently.
Nice-to-Have Qualifications
- Experience with monorepo architectures (Nx, Turborepo, or similar).
- Prior exposure to payment processing (Stripe) or banking integrations (Plaid).
- Knowledge of property management, rental, or real estate technology.
What We're Looking For
A self-driven developer who can dive into unfamiliar problems and figure things out independently. You should be comfortable asking questions but equally comfortable researching solutions. We value developers who are proactive, bring their own ideas to the table, and take pride in code quality.
2026-03-19 17:01
Freelance Web Scraping Engineer (Vibe Coding)
Mindrift
1001-5000
$32 / hour
Singapore
Part-time
Remote
false
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.
Mindrift is looking for highly skilled Vibe Code specialists to join the Tendem project (https://tendem.ai/) and drive specialized data scraping workflows within our hybrid AI + human system.
In this role, as an AI Pilot – that’s how we refer to this role at Mindrift – you’ll collaborate with Tendem Agents that handle repetitive tasks, while you provide critical thinking, domain expertise, and quality control to deliver accurate and actionable results. This part-time remote opportunity is ideal for technical professionals with hands-on experience in web scraping, data extraction and processing.
What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.
About the Role
This is a freelance role for a Tendem project. As a Vibe Code specialist, you'll handle data scraping tasks requiring technical precision for web extraction and processing, utilizing various tools such as our provided Apify and OpenRouter alongside your own resourceful approaches.
Key Responsibilities
- Own end-to-end data extraction workflows across complex websites, ensuring complete coverage, accuracy, and reliable delivery of structured datasets.
- Leverage internal tools (Apify, OpenRouter) alongside custom workflows to accelerate data collection, validation, and task execution while meeting defined requirements.
- Ensure reliable extraction from dynamic and interactive web sources, adapting approaches as needed to handle JavaScript-rendered content and changing site behavior.
- Enforce data quality standards through validation checks, cross-source consistency controls, adherence to formatting specifications, and systematic verification prior to delivery.
- Scale scraping operations for large datasets using efficient batching or parallelization, monitor failures, and maintain stability against minor site structure changes.
Compensation
On this project, contributors can earn up to $32 per hour equivalent, depending on their level and pace of contribution.
Compensation varies across projects depending on scope, complexity, and required expertise. Please note that other projects on the platform may offer different earning levels based on their requirements.
How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects that match your technical skills, on your own schedule. From coding and automation to fine-tuning AI outputs, you’ll play a key role in advancing AI capabilities and real-world applications.
Requirements
- At least 1 year of relevant experience in data analysis, AI automation, data engineering, or software development is desirable.
- Bachelor's or Master’s Degree in Engineering, Applied Mathematics, Computer Science, or related technical fields is a plus.
- Python web scraping: Build reliable scraping scripts using BeautifulSoup, Selenium (or equivalents) for multi-level sites, dynamic JS content (infinite scroll, AJAX), and API endpoints via provided proxy.
- Data extraction expertise: Navigate complex hierarchical structures (regions → companies → details), handling archived pages and varied HTML formats.
- Data processing: Clean, normalize, and validate scraped data; deliver high-quality datasets in well-structured formats (CSV, JSON, Google Sheets) with clear, consistent presentation.
- Hands-on experience with LLMs and AI frameworks to enhance automation and problem-solving.
- Strong attention to detail and commitment to data accuracy.
- Self-directed work ethic with ability to troubleshoot independently.
- English proficiency: Upper-intermediate (B2) or above (required).
Benefits
Why this freelance opportunity might be a great fit for you?
- Work fully remote on your own schedule with just a laptop and stable internet connection.
- Gain hands-on experience in a unique hybrid environment where human expertise and AI agents collaborate seamlessly, a distinctive skill set in a rapidly growing field.
- Participate in performance-based bonus programs that reward high-quality work and consistent delivery.
2026-03-19 16:46
Research Engineer – Matilda
Maincode
11-50
Australia
Full-time
Remote
false
Maincode is mission-focused. That means we care about shipping Matilda, and we care about the people we do it with. Everything else is secondary.
Matilda is Australia's first publicly available conversational AI platform wholly built and run in Australia. It's a serious technical undertaking and we're building it with a small team that moves fast and takes the work seriously.
We're looking for smart, humble people who want to help build something that matters and aren't precious about how they contribute. You might end up deep in the research literature. You might end up learning how to build telemetry systems for production infrastructure at scale. Probably both, and other things we haven't thought of yet. The work will find you.
We don't care much about credentials or background. A lot of the best people we've worked with came from somewhere unexpected: hard PhDs in adjacent fields, unconventional paths, weird combinations of experience. What they had in common was that they were genuinely excellent, fast learners, and good people who showed up and got on with it.
If you want a detailed job spec, this probably isn't the right role. If you're the kind of person who reads that and feels relieved rather than concerned, let's talk.
2026-03-19 16:16
AI QA Analyst
Ryz Labs
51-100
Argentina
Contractor
Remote
false
At Ryz Labs, we are looking for an AI QA Analyst to join one of our clients, an AI-first travel and hospitality company. In this role, you will act as the bridge between model development and user experience. Your primary goal is to ensure our generative models are accurate, safe, and contextually aware by implementing rigorous evaluation frameworks and data-driven insights.
Qualifications:
-Ability to spot subtle logical inconsistencies or factual inaccuracies in AI-generated content.
-Experience managing, cleaning, or labeling datasets for machine learning or NLP projects.
-Familiarity with the AI lifecycle, Prompt Engineering, and tools like Google Sheets/Excel (SQL or Python is a significant plus).
Key Responsibilities:
-Review and annotate complex conversation traces to rate response quality based on metrics such as helpfulness, honesty, and harmlessness (HHH).
-Build and maintain high-quality "Golden Datasets" and benchmarks to stress-test the model across various domains and edge cases.
-Conduct pre-deployment testing and A/B model comparisons to identify performance regressions or improvements.
-Categorize model failures (hallucinations, logic errors, tone drift) to provide actionable feedback to the Engineering and Research teams.
-Help define and refine the rubric for "what a good response looks like" as the product evolves.
About RYZ Labs:
RYZ Labs is a startup studio founded in 2021 by two lifelong entrepreneurs. The founders of RYZ have worked at some of the world's largest tech companies and some of the most iconic consumer brands. They have lived and worked in Argentina for many years and have decades of experience in Latam. What brought them together was their passion for the early phases of company creation and the idea of attracting the brightest talents in order to build industry-defining companies in a post-pandemic world.
Our teams are remote and distributed throughout the US and Latam. They use the latest cutting-edge cloud computing technologies to create scalable and resilient applications. We aim to provide diverse product solutions for different industries and plan to build a large number of startups in the upcoming years.
At RYZ, you will find yourself working with autonomy and efficiency, owning every step of your development. We provide an environment of opportunities, learning, growth, expansion, and challenging projects. You will deepen your experience while sharing and learning from a team of great professionals and specialists.
Our values and what to expect:
- Customer First Mentality - Every decision we make should be made through the lens of the customer.
- Bias for Action - urgency is critical, expect that the timeline to get something done is accelerated.
- Ownership - Step up if you see an opportunity to help, even if it's not your core responsibility.
- Humility and Respect - Be willing to learn, be vulnerable, and treat everyone who interacts with RYZ with respect.
- Frugality - being frugal and cost-conscious helps us do more with less
- Deliver Impact - get things done most efficiently.
- Raise our Standards - always be looking to improve our processes, our team, and our expectations. The status quo is not good enough and never should be.
2026-03-19 14:16
Senior AI Engineer
Ryz Labs
51-100
Argentina
Contractor
Remote
false
Remote position only for candidates in Latam.
At Ryz Labs we are looking for an AI Engineer for one of our clients. We’re looking for someone who doesn’t write every line of code, but uses taste and judgment to determine what to build, then spins up parallel AI agents, each producing real output, while focusing on strategy and quality. The best builders today are 3–5x more productive than they were a year ago. The median builder is up 10–20%. We want someone in that top 5%.
What You’ll Build
• Agent-driven enrollment and parent communication pipelines that scale from hundreds to tens of thousands of families without linear headcount growth.
• 10,000 simulated students testing our curriculum in parallel, stress-testing content, surfacing gaps, and generating improvements before real students ever see it.
• Automated culture and community agents: building engagement, onboarding, and retention systems that feel human but run at machine scale.
• Real-time operational dashboards that give leadership visibility into every part of the business: enrollment, academic progress, parent satisfaction, campus operations.
• AI-first workflows for guides, advisors, and ops staff, freeing them from administrative burden so they can focus on students.
• Brainlifts that capture institutional knowledge into AI systems that compound over time: the competitive moat.
• Integration into Alpha’s broader AI ecosystem (EPHOR, Alpha GPTs, Fleet/Swarm infrastructure)
What We’re Looking For
Required
• Has shipped production AI systems: agents, automation, LLM-powered workflows. Not prototypes. Not research.
• Power user of Claude Code, Cursor, or equivalent. Already commanding agent fleets, not planning to “learn AI.”
• Fluent in the fleet commander model: spins up parallel agents, delegates with clear intent, reviews output with taste and judgment.
• Builds personally. This is a hands-on-keyboard role, not a people management role.
• Enough product sense to know what to automate vs. what stays human in an education context.
• Moves before certainty. Sees problems nobody assigned and fixes them.
• Comfortable in a startup environment: ambiguity, speed, resource constraints.
Nice to Have
• Experience scaling ops at an education company or high-growth consumer startup.
• Familiarity with Alpha/Trilogy ecosystem and tooling.
• Background in curriculum or content systems at scale.
• Experience building real-time dashboards and business intelligence automation.
2026-03-19 14:16
Data Strategy Associate
Figure AI
201-500
$150,000 – $250,000
Full-time
Remote
false
Figure is an AI Robotics company developing a general purpose humanoid. Our humanoid robot is designed for commercial tasks and the home. We are based in San Jose, CA and require 5 days/week in-office collaboration. It’s time to build.
Figure’s vision is to deploy autonomous humanoids at a global scale. Our Helix team is seeking an experienced AI Tooling Engineer to enhance our internal, web-based data and AI training tools. This role focuses on developing intuitive web interfaces that support key AI research functions, including robot data annotation, training dataset visualization, and experiment tracking. The ideal candidate has experience building rich, interactive web interfaces using React and TypeScript.
Responsibilities
Design and build intuitive web interfaces for robot data annotation, dataset visualization, and experiment tracking
Utilize data-driven techniques to optimize interfaces for efficiency and fast iteration cycles
Integrate AI models to automate manual tasks
Work together with AI researchers, robot operators, and annotators to support new user experiences
Requirements
Strong software engineering fundamentals
Bachelor's or Master's degree in Computer Science, Robotics, Engineering, or a related field
Minimum of 4 years of professional, full-time experience building rich, interactive web interfaces
Proficiency in React and TypeScript
Bonus Qualifications
Experience using data stores (Postgres, MySQL, ElasticSearch, Redis, etc.)
Experience managing cloud infrastructure (AWS, Azure, GCP)
Experience with Tailwind CSS
Experience building data annotation and dataset management tools.
The US base salary range for this full-time position is between $150,000 - $250,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
2026-03-19 9:46
Technical Director of AI Safety
Faculty
501-1000
United Kingdom
Full-time
Remote
false
Why Faculty?
We established Faculty in 2014 because we thought that AI would be the most important technology of our time. Since then, we’ve worked with over 350 global customers to transform their performance through human-centric AI. You can read about our real-world impact here.
We don’t chase hype cycles. We innovate, build and deploy responsible AI which moves the needle - and we know a thing or two about doing it well. We bring an unparalleled depth of technical, product and delivery expertise to our clients who span government, finance, retail, energy, life sciences and defence.
Our business, and reputation, is growing fast and we’re always on the lookout for individuals who share our intellectual curiosity and desire to build a positive legacy through technology.
AI is an epoch-defining technology. Join a company where you’ll be empowered to envision its most powerful applications, and to make them happen.
About the Team
Faculty’s Research team conducts critical red teaming and builds evaluations for misuse capabilities in sensitive areas, such as CBRN, cybersecurity and international security, for several leading frontier model developers and national safety institutes; notably, our work has been featured in OpenAI's system card for o1.
Our commitment also extends to conducting fundamental technical research on mitigation strategies, with our findings published in peer-reviewed conferences and delivered to national security institutes. Complementing this, we design evaluations for model developers across broader safety-relevant fields, including the societal impacts of increasingly capable frontier models, showcasing our expertise across the safety landscape.
About the role
This is a brand new senior leadership role to provide technical leadership of Faculty's work on AI safety for the Foundation Labs - and presents a unique opportunity to shape how AI safety is done globally.
Faculty is one of the world's leading applied AI companies, helping many of the organisations that shape our world to adopt AI successfully and safely. We play an important role in the emerging AI safety ecosystem. We already have many of the key Frontier Labs as clients, including OpenAI and Anthropic, for whom we provide third-party red teaming, technical testing and other AI safety services. And we work with the UK government and other international governments on AI safety, including helping set up the AI Security Institute and delivering technical work which catalysed the first global AI Safety Summit at Bletchley Park in 2023.
With the recent announcement of Faculty's acquisition by Accenture, we are investing to take our work on AI safety to global scale, and this role will be key to shaping that. This will include:
- The opportunity to hire and build a world-class AI safety technical team - of calibre unmatched outside of the Labs themselves
- The opportunity to design and lead an AI safety R&D programme - creating the advances which will enable AI safety at scale to keep pace with model advances
- The opportunity to build our work with the Frontier Labs to scale - helping to test and assure new frontier models ahead of public release
- The opportunity to contribute to and shape the international debate on AI safety, including with governments and other key bodies, working closely with Marc Warner, Faculty's founder & CEO.
This role will suit someone with a deep passion and commitment to AI safety, and represents a unique opportunity to contribute to this agenda globally.
What you'll be doing:
- Owning the technical strategy for AI Safety by determining research directions and building technologies that mitigate risks from alignment to societal harms.
- Leading a high-performing R&D team through intentional hiring, mentorship, and the cultivation of a culture defined by technical excellence and high output.
- Driving academic impact by guiding complex machine learning projects and securing top-tier publications that cement Faculty’s reputation in the safety domain.
- Shaping market-leading offerings for frontier labs and security institutes, translating cutting-edge R&D into practical, groundbreaking safety solutions.
- Overseeing technical delivery of AI safety and security projects, ensuring scientific rigor and high-quality outputs across evaluations and red-teaming.
- Representing Faculty externally as a primary technical voice, delivering influential thought leadership and speaking at major global industry events.
- Collaborating cross-functionally with business unit directors and commercial teams to align research investment with strategic growth and client needs.
Who we're looking for:
- You have a proven track record of designing and leading high-performing technical teams, with the ability to manage R&D budgets and mentor senior technical staff.
- You bring deep expertise in AI safety research, specifically regarding alignment, interpretability, and robustness in large language models (LLMs) or safety-critical systems.
- You possess a strong scientific background evidenced by high-impact machine learning publications and a comprehensive understanding of transformer architectures.
- You are a strategic visionary capable of setting research priorities that align with long-term organisational goals while remaining at the cutting edge of field developments.
- You are a compelling communicator who can synthesise complex technical concepts into narratives that influence both C-suite executives and the broader research community.
- You exhibit strong commercial acumen and stakeholder management skills, allowing you to navigate complex organisations and accelerate the delivery of high-value projects.
Interview Process
- Talent Team Screen (45 mins)
- Principles and Experience interview (60 mins)
- Research Proposal (90 mins)
- Leadership Interview (60 mins)
- Meet with CEO (30 mins)
Our Recruitment Ethos
We aim to grow the best team - not the most similar one. We know that diversity of individuals fosters diversity of thought, and that strengthens our principle of seeking truth. And we know from experience that diverse teams deliver better work, relevant to the world in which we live. We’re united by a deep intellectual curiosity and desire to use our abilities for measurable positive impact. We strongly encourage applications from people of all backgrounds, ethnicities, genders, religions and sexual orientations.
Some of our standout benefits:
- Unlimited Annual Leave Policy
- Private healthcare and dental
- Enhanced parental leave
- Family-Friendly Flexibility & Flexible working
- Sanctus Coaching
- Hybrid Working
If you don’t feel you meet all the requirements, but are excited by the role and know you bring some key strengths, please don't hesitate in applying as you might be right for this role, or other roles. We are open to conversations about part-time hours.
2026-03-19 9:16
Senior Software Engineer, Agents
Decagon
101-200
$250,000 – $350,000
United States
Full-time
Remote
false
About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.
Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.
We’re building a future where customer experiences are being redefined from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.
We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values (Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle) shape how we work and grow as a team.
About the Team
The Agent Engineering team at Decagon deploys mission-critical AI agents to our customers that impact millions of users and directly drive Decagon’s growth. You will build on our industry-leading AI agent platform, collaborate directly with customers and use your own creativity to devise long-term, scalable solutions.
Our mission is to deliver magical support experiences: AI agents working alongside human agents to help users resolve their issues.
About the Role
On the Agent Engineering team, you’ll have complete ownership and autonomy in building and shipping best-in-class AI agents, from initial implementation through continuous iteration. You’ll work directly with leaders across industries like finance, healthcare and hospitality, solving their users’ needs with reliable and intuitive AI agents.
Engineers here own their work end-to-end and are trusted to make a real impact. This role is for someone who dives deep into complex system challenges and builds elegant solutions that scale to millions of users.
In this role, you will
Design and build AI agents that outperform human agents in managing complex customer interactions and driving customer retention
Identify cross-customer trends that guide the evolution of Decagon’s agent building platform and research efforts
Experiment with and run evaluations on the latest text and voice models, then integrate them at scale with large enterprise-grade customers
Your background looks something like this
Have 5+ years of industry experience in software engineering
Proficiency with Python, TypeScript and asynchronous programming
A high degree of comfort digging into system failures within deep technology stacks using any tool necessary
Even better
Prior experience working with multi-modal models
Benefits
Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners and snacks in the office to keep you at your best
Compensation
$250K – $350K + Offers Equity
2026-03-19 9:02
Senior Software Engineer, Agents
Decagon
101-200
$250,000 – $330,000
United States
Full-time
Remote
false
About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.
Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.
We’re building a future where customer experiences are being redefined from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.
We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values (Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle) shape how we work and grow as a team.
About the Team
The Agent Engineering team at Decagon deploys mission-critical AI agents to our customers that impact millions of users and directly drive Decagon’s growth. You will build on our industry-leading AI agent platform, collaborate directly with customers and use your own creativity to devise long-term, scalable solutions.
Our mission is to deliver magical support experiences: AI agents working alongside human agents to help users resolve their issues.
About the Role
On the Agent Engineering team, you’ll have complete ownership and autonomy in building and shipping best-in-class AI agents, from initial implementation through continuous iteration. You’ll work directly with leaders across industries like finance, healthcare and hospitality, solving their users’ needs with reliable and intuitive AI agents.
Engineers here own their work end-to-end and are trusted to make a real impact. This role is for someone who dives deep into complex system challenges and builds elegant solutions that scale to millions of users.
In this role, you will
Design and build AI agents that outperform human agents in managing complex customer interactions and driving customer retention
Identify cross-customer trends that guide the evolution of Decagon’s agent building platform and research efforts
Experiment with and run evaluations on the latest text and voice models, then integrate them at scale with large enterprise-grade customers
Your background looks something like this
Have 5+ years of industry experience in software engineering
Proficiency with Python, TypeScript and asynchronous programming
A high degree of comfort digging into system failures within deep technology stacks using any tool necessary
Even better
Prior experience working with multi-modal models
Benefits
Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners and snacks in the office to keep you at your best
Compensation
$250K – $330K + Offers Equity
2026-03-19 9:01
Staff Applied AI Engineer - Pre-Sales
Snorkel AI
501-1000
$172,000 – $300,000
United States
Full-time
Remote
false
About Snorkel
At Snorkel, we believe meaningful AI doesn’t start with the model, it starts with the data.
We’re on a mission to help enterprises transform expert knowledge into specialized AI at scale. The AI landscape has gone through incredible changes from 2015, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
About Snorkel
We’re on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has gone through incredible change from 2016, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
As an Applied AI Engineer, you’ll research and utilize state-of-the-art Gen AI and machine learning (ML) techniques to successfully deliver solutions to our customers. You will work directly with our customers to understand their business and technical needs and design and deliver AI solutions to solve them - either by leveraging Snorkel Flow or developing custom approaches when needed. You will also help define Snorkel’s Applied AI tooling by translating repeatable real-world challenges into reusable solution recipes, workflows, best practices, and platform-level capabilities that become part of Snorkel Flow’s next generation of AI tooling. We move fast and are constantly prototyping and innovating new ways to deliver value to our customers. This position is ideal for someone who enjoys solving complex problems, bridging the gap between AI technology and business value, working directly with customers, keeping up to date with AI research, standardizing bespoke solutions into internal recipes, and staying naturally curious about the infrastructure that underpins the Applied AI stack end-to-end.
Main Responsibilities
Partner with customers to build and deploy impactful Gen AI and machine learning solutions, from use case scoping and data exploration to model development and deployment. This may involve leveraging Snorkel Flow or designing custom approaches using state-of-the-art tools, with the goal of delivering real business value and informing the evolution of the Snorkel platform.
Develop and implement state-of-the-art AI systems such as retrieval-augmented generation (RAG), fine-tuning pipelines, prompt engineering recipes and agentic workflows.
Create augmented real-world datasets and comprehensive evaluation workflows to ensure model reliability, transparency, and stakeholder trust. A data- and evaluation-first mindset is essential for success in this role.
Forge and manage relationships with our customers’ leadership and stakeholders to ensure successful development and deployment of AI projects with Snorkel Flow.
Collaborate closely with pre-sales Solutions and Product teams to map customer needs to existing capabilities, prioritize roadmap gaps, and guide successful project setup.
Work with other Applied AI Engineers to standardize solutions and contribute to internal tooling and best practices.
Lead stakeholder education on quantitative capabilities, helping them to understand the strengths and weaknesses of different approaches and what problems are best-suited for Snorkel AI.
Serve as the voice of our customers on new AI paradigms and data science workflows, and share customer feedback with product teams.
Conduct one-to-few and one-to-many enablement workshops to transfer knowledge to customers considering or already using Snorkel AI.
Annual travel up to 25%.
Preferred Qualifications
B.S. degree in a quantitative field such as Computer Science, Engineering, Mathematics, Statistics, or comparable degree/experience.
3+ years of customer-facing experience in the design and implementation of AI/ML solutions.
Proficiency in Python, including strong grounding in software engineering fundamentals (e.g., modular design, testing, profiling, packaging) and experience with modern Python constructs and libraries for type validation and typed data modeling (e.g., pydantic), building type-safe systems (e.g., mypy), testing (e.g., pytest), packaging and environment configuration (e.g., poetry), API and service frameworks (e.g., FastAPI), serialization and structured data handling (e.g., msgspec), and orchestration tooling relevant to ML deployment (e.g., Ray, Airflow).
Expertise across the Applied AI stack, spanning classical ML libraries (e.g., scikit-learn), deep learning frameworks (e.g., PyTorch), foundation-model ecosystems (e.g., Hugging Face Transformers), vector/embedding tooling (e.g., FAISS), data processing frameworks (e.g., pandas, Spark), retrieval/RAG tooling (e.g., Chroma, Weaviate), synthetic dataset curation, evaluation workflows, and LLM orchestration, workflow, and agent-authoring tools (e.g., LlamaIndex, LangGraph, CrewAI).
Experience leading strategic, customer-facing initiatives and collaborating with business stakeholders to ensure ML solutions drive successful business outcomes, with a strong focus on teaching and enablement.
Outstanding presentation skills to technical and executive audiences, whether impromptu on a whiteboard or using presentations and demos.
Ability to work in a fast-paced environment and balance priorities across multiple projects at once.
Compensation range for Tier 1 locations of San Francisco Bay Area $172K - $300K OTE. All offers also include equity in the form of employee stock options. Our compensation ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
Locations
Redwood City, CA - Hybrid; San Francisco, CA - Hybrid - US; New York, NY - Hybrid
Salary Range: $172,000 - $300,000 USD
Be Your Best at Snorkel
Joining Snorkel AI means becoming part of a company that has market-proven solutions and robust funding and is scaling rapidly, offering a unique combination of stability and the excitement of high growth. As a member of our team, you’ll have meaningful opportunities to shape priorities and initiatives, influence key strategic decisions, and directly impact our ongoing success. Whether you’re looking to deepen your technical expertise, explore leadership opportunities, or learn new skills across multiple functions, you’re fully supported in building your career in an environment designed for growth, learning, and shared success.
Snorkel AI is proud to be an Equal Employment Opportunity employer and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. Snorkel AI embraces diversity and provides equal employment opportunities to all employees and applicants for employment. Snorkel AI prohibits discrimination and harassment of any type on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local law. All employment is decided on the basis of qualifications, performance, merit, and business need.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
2026-03-19 8:46
Expansion Account Executive
Arize AI
101-200
Argentina
Full-time
Remote
false
About Arize
AI is rapidly transforming the world. As generative AI reshapes industries, teams need powerful ways to monitor, troubleshoot, and optimize their AI systems. That’s where we come in. Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to ship high-performing, reliable agents and applications. From first prototype to production scale, Arize AX unifies build, test, and run in a single workspace—so teams can ship faster with confidence.
We’re a Series C company backed by top-tier investors, with over $135M in funding and a rapidly growing customer base of 150+ leading enterprises and Fortune 500 companies. Customers like Booking.com, Uber, Siemens, and PepsiCo leverage Arize to deliver AI that works.
Note: The nature of this role requires candidates to be based in the Buenos Aires area, though there isn't an in-office requirement.
The Opportunity
We’re looking for an Application Engineer who thrives on solving hard problems with code. In this role, you'll have the opportunity to work at the cutting edge of generative AI in a high-impact role with autonomy and ownership.
What You’ll Do
Debug and fix issues in our platform (and ship PRs with your fixes).
Build internal tools and copilots powered by generative AI to supercharge our team.
Rapidly prototype proof-of-concepts for customer use cases.
Work across Engineering, Product, and Solutions to unblock customers and push the boundaries of AI adoption.
What We’re Looking For
You have 2-5 years of experience in software.
Strong in Python and Golang; comfortable shipping fixes in production systems.
Hands-on with generative AI (LLM APIs, frameworks, building copilots or automations).
Hands-on with OpenTelemetry and deep familiarity with distributed tracing concepts.
Familiarity with AI frameworks (CrewAI, LangChain, LangGraph, Dify, LiteLLM, etc.).
Familiarity or eagerness to learn JavaScript/TypeScript.
Great debugger, creative problem solver, and fast learner.
Independent and resourceful. You create solutions, not dependencies.
Bonus Points (but not required!)
Experience in a customer-facing role
Built copilots, plugins, or custom GenAI-powered applications.
Open-sourced or contributed PRs to real codebases.
Startup or fast-moving environment experience.
Actual compensation is determined based upon a variety of job-related factors that may include transferable work experience, skill sets, and qualifications. Total compensation also includes unlimited paid time off, a generous parental leave plan, and other benefits for mental health and wellness support.
More About Arize
Arize’s mission is to make the world’s AI work—and work for people.
Our founders came together through a shared frustration: while investments in AI are growing rapidly across every industry, organizations face a critical challenge—understanding whether AI is performing and how to improve it at scale.
Learn more about what we're doing here:
https://techcrunch.com/2025/02/20/arize-ai-hopes-it-has-first-mover-advantage-in-ai-observability/
https://arize.com/blog/arize-ai-raises-70m-series-c-to-build-the-gold-standard-for-ai-evaluation-observability/
Diversity & Inclusion @ Arize
Our company's mission is to make AI work, and to make it work for people. We hope to make an industry-wide impact on bias, and that's a big motivator for people who work here. We actively hope that individuals contribute to a good culture.
Regularly have chats with industry experts, researchers, and ethicists across the ecosystem to advance the use of responsible AI
Culturally conscious events such as LGBTQ trivia during pride month
We have an active Lady Arizers subgroup
2026-03-19 8:17
DevOps Engineer, Infrastructure & Security
Scale AI
5000+
United States
Full-time
Remote
false
Role Overview
Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of:
Creating custom AI applications that will impact millions of citizens
Generating high-quality training data for national LLMs
Upskilling and advisory services to spread the impact of AI
As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners.
At Scale, we’re not just building AI solutions—we’re enabling the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you.
You will:
Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies.
Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment.
Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability.
Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks.
Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again.
Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials.
Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases.
Ideally, you have:
Experience: 6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector.
Global perspective: Familiarity with international government security standards and the complexities of deploying sovereign AI.
System architecture proficiency: Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core.
Modern AI Stack expertise: Proficiency in coding and the modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools.
Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them.
Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy.
Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it.
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at accommodations@scale.com. Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants’ needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.
2026-03-19 8:16
Copy of Member of Technical Staff - ML Engineering
Talent Labs
11-50
United Kingdom
Full-time
Remote
false
The opportunity
We are looking for a highly skilled machine learning research engineer with significant experience in training and implementing large-scale generative models. In this role you will manage our high-performance computing environment and our model serving initiatives. You will join an interdisciplinary team of machine learners, protein engineers and biologists, jointly working to change the way that we control biology and cure diseases.
Who we are
At Latent Labs, we are building frontier models that learn the fundamentals of biology. We pursue ambitious goals with curiosity and are committed to scientific excellence. Before building Latent Labs, our team co-developed DeepMind’s Nobel Prize-winning AlphaFold, invented latent diffusion, and built pioneering lab data management systems as well as high-throughput protein screening platforms. At Latent Labs you will be working with some of the brightest minds in generative AI and biology.
Our team is committed to interdisciplinary exchange, continuous learning and collaboration. Team offsites help us foster a culture of trust across our London and San Francisco sites.
We’re looking for innovators passionate about tackling complex challenges and maximizing positive global impact. Join us on our moonshot mission.
Who you are
Deep experience with Kubernetes and containerized workflows
Experience with major cloud platforms (AWS, GCP, Azure)
Knowledge of DevOps and related tools (Terraform, etc.)
Knowledge of HPC frameworks (Slurm, Ray, etc.)
Production engineering & reliability experience
PyTorch & distributed computing experience
Your Responsibilities
Deploy, maintain, and optimize production and research compute clusters
Design and implement scalable and efficient ML inference solutions
Develop dynamic / heterogeneous compute solutions for balancing research and production needs
Contribute to productizing model APIs for external use
Develop infrastructure observability and monitoring solutions
We offer strongly competitive compensation and benefits packages, including:
Private health insurance
Pension/401(K) contributions
Generous leave policies (including gender neutral parental leave)
Hybrid working
Travel opportunities and more
We also offer a stimulating work environment, and the opportunity to shape the future of synthetic biology through the application of breakthrough generative models.
We welcome applicants from all backgrounds and we are committed to building a team that represents a variety of backgrounds, perspectives, and skills.
2026-03-19 8:16
Machine Learning and State Estimation Intern
Harmattan AI
51-100
Switzerland
Intern
Remote
false
About Us
Harmattan AI is a next-generation defense prime building autonomous and scalable defense systems. Following the close of a $200M Series B, valuing the company at $1.4 billion, we are expanding our teams and capabilities to deliver mission-critical systems to allied forces.
Our work is guided by clear values: building technologies with real-world impact, pursuing excellence in everything we do, setting ambitious goals, and taking on the hardest technical challenges. We operate in a demanding environment where rigor, ownership, and execution are expected.
About the Role
We are developing advanced autonomous systems that rely on robust state estimation and sensor fusion to operate in complex, dynamic environments. Our platforms integrate multiple sensors (e.g., IMU, GNSS, vision, barometer, magnetometer) and require accurate, real-time estimation of system states (position, velocity, attitude, etc.).
Classical approaches such as Kalman filtering are powerful but rely on modeling assumptions that often break down in real-world conditions. To push performance beyond these limits, we are exploring hybrid approaches that combine model-based estimation and control with modern machine learning techniques.
Your mission
The goal of this internship is to explore and apply machine learning-based sensor fusion and state estimation methods to improve performance in dynamic environments.
Responsibilities
Literature review: Conduct a comprehensive review of existing ML methods for state estimation and sensor fusion.
Algorithm implementation: Develop and implement various algorithms based on the literature review and project requirements using simulated and real-world flight data.
Performance evaluation: Assess and compare the performance and computational overhead of the developed algorithms with classical baselines.
Documentation: Document all work performed, including methodologies, results, and conclusions.
Flight test participation: Actively participate in flight test sessions to gather real-world data and validate the effectiveness of the developed algorithms in operational conditions. Contribute to real-time deployment.
Candidate Requirements
Educational background: A strong academic record in applied mathematics (especially machine learning). Knowledge of sensor fusion/state estimation is a strong plus.
Technical skills: Strong understanding of ML fundamentals. Experience with state estimation, drones, or control theory is a major plus.
Mindset: You are curious to learn, autonomous and able to take initiative.
We look forward to hearing how you can help shape the future of autonomous defense systems at Harmattan AI.
2026-03-19 8:16
Product Manager, Agent Harness & Modelling
Cohere
501-1000
Canada
Full-time
Remote
false
Who are we?
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.
Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.
Join us on our mission and shape the future!
About Cohere and North
Cohere is revolutionizing enterprise AI with North, an agentic AI platform designed to securely deploy AI agents and automations within organizations' infrastructure. North empowers employees to streamline workflows, automate repetitive tasks, and unlock actionable insights while ensuring data privacy and compliance. North combines cutting-edge generative and search models with customizable integrations to drive productivity and innovation at scale.
Role Overview
We are seeking an Agent Harness Product Manager to own the execution layer that makes North agents reliable, capable, and production-ready. This is a role that sits at the intersection of three domains:
Agent Loop and Execution: Own the core agent runtime: tool orchestration, parallel execution, sub-agent delegation, sandbox code execution, and failure recovery. You will define how North agents plan and act across long, multi-step workflows and ensure the execution environment is robust enough for the most demanding enterprise tasks. You are expected to engage at the implementation level, contributing to architecture decisions alongside engineering rather than simply handing off requirements.
Context Engineering: Own how our Agents manage the context window as a deliberately controlled resource. This includes progressive disclosure of tools and skills, context compaction and summarization, offloading of large payloads to a persistent filesystem, and the instrumentation that keeps agents oriented across extended trajectories.
Model-Scaffolding Co-evolution: Own the feedback loop between North's harness and the Modeling Team. This PM is the connective tissue that makes that possible: ensuring harness design decisions are validated by Modeling before they are built, that evals are the shared bridge between both teams, and that as the harness evolves the model evolves with it.
Responsibilities
Define and own the roadmap for North's agent harness, including the agent loop, context engineering layer, tool orchestration, sandbox execution, and sub-agent delegation
Serve as the primary interface between North engineering and Cohere's Modeling team, ensuring new harness capabilities are validated before being built and that neither team paints itself into a corner
Own North's agentic evaluation framework, ensuring evals are compatible with both the North harness and Modeling's training infrastructure, and that they serve as a reliable bridge between product and research
Engage enterprise customers to surface real-world agentic failures and translate findings into concrete product and model requirements
Stay current with the open-source and commercial agent ecosystem and drive adoption decisions that keep North's architecture aligned with emerging standards.
Requirements
5+ years of product management experience in agentic AI systems, developer infrastructure, or applied ML products
Deep understanding of modern LLM agent architectures, including multi-agent systems, tool-augmented reasoning, memory and retrieval, programmatic orchestration, RAG, and long-horizon execution
Strong grasp of agentic evaluation design, including how to measure task completion, failure recovery, and long-horizon reliability, and how to diagnose model vs. scaffolding gaps
Technically deep enough to contribute to architecture decisions at the implementation level: comfortable reviewing and shaping design docs, reasoning about async execution patterns, sandboxed environments, filesystem design, and the tradeoffs that come with building harness capabilities into a production platform
Ability to flex between ML research conversations and engineering architecture discussions with equal fluency
Track record of shipping platform-layer products with demonstrated impact on reliability, performance, or capability.
Nice-to-Haves
An active practitioner of agent frameworks who regularly builds with and follows the latest developments in open-source harnesses, coding agents, and orchestration tools in both professional and personal work
Hands-on experience with enterprise agentic deployments: multi-tenant orchestration, tool permissioning, audit trails, and compliance requirements
Familiarity with infrastructure constraints relevant to enterprise deployments: on-premises environments, scalability challenges, and the operational tradeoffs of running complex agent workloads in restricted or air-gapped settings
Prior work at the intersection of research and product, translating nascent model capabilities into shipped product features
Background working within or closely alongside an ML research or post-training team
Why Join Cohere?
Impact: Shape how Canada's most important public institutions adopt and deploy frontier AI.
Innovation: Work alongside leading researchers and engineers solving complex ML challenges.
Growth: Competitive compensation, equity options, and opportunities for professional development.
Flexibility: Hybrid work model with offices in key global locations (Toronto, Montreal, New York, San Francisco, London,
Paris, and Korea)If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.Full-Time Employees at Cohere enjoy these Perks:🤝 An open and inclusive culture and work environment 🧑💻 Work closely with a team on the cutting edge of AI research 🍽 Weekly lunch stipend, in-office lunches & snacks🦷 Full health and dental benefits, including a separate budget to take care of your mental health 🐣 100% Parental Leave top-up for up to 6 months🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend✈️ 6 weeks of vacation (30 working days!)
2026-03-19 8:16
Research Engineer – Benchmarking, Evals & Failure Analysis
Mercor
1001-5000
$130,000 – $500,000
United States
Full-time
Remote
About Mercor
Mercor is defining the future of work. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development.
Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $2 million a day.
Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced and deeply committed team. You’ll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society.
Mercor is a profitable Series C company valued at $10 billion. We work in-person five days a week in our new San Francisco headquarters.
About the Role
As a Research Engineer at Mercor, you’ll work at the intersection of engineering and applied AI research. You’ll own benchmarking pipelines, evaluation systems, and failure analysis workflows that directly inform how we train and improve frontier language models.
Your work will define how we measure tool use, agentic behavior, and real-world reasoning. You’ll design and run evals, build rubrics and scorers, and turn failure analysis into actionable improvements for post-training, RLVR, and data pipelines.
What You’ll Do
Benchmarking: Design, implement, and maintain benchmarks and metrics for tool use, agentic behavior, and real-world reasoning; ensure benchmarks scale with training and stay aligned with product and research goals.
Evaluation systems: Build and operate LLM evaluation systems end-to-end (runs, scoring, dashboards, and reporting), so researchers and applied AI teams can track model performance and compare runs at scale.
Failure analysis: Run systematic failure analysis on model outputs (e.g., wrong tool use, reasoning errors, safety/alignment issues); categorize failure modes, quantify prevalence, and feed findings into reward design, data curation, and benchmark design.
Rubrics and evaluators: Create and refine rubrics, automated evaluators, and scoring frameworks that drive training and evaluation decisions; balance rigor with scalability (human vs. model-as-judge, calibration, agreement).
Data quality and usability: Quantify data usability, quality, and impact on key benchmarks; use evals and failure analysis to guide data generation, augmentation, and curation.
Cross-team collaboration: Work with AI researchers, applied AI teams, and data producers to align evals with training objectives and to prioritize benchmarks and failure analyses that matter most.
Ownership in a fast-paced environment: Operate in a high-iteration research setting with strong ownership of benchmarks, evals, and failure-analysis workflows.
What We’re Looking For
Strong applied research background, with focus on model evaluation, benchmarking, and/or failure analysis.
Strong coding skills and hands-on experience with ML models and evaluation code.
Solid grasp of data structures, algorithms, and backend systems.
Comfort with APIs, SQL/NoSQL, and cloud platforms for running and storing eval results.
Ability to reason about model behavior, experimental results, and data quality from evals and failure analyses.
Excitement to work in person in San Francisco five days a week in a high-intensity, high-ownership environment.
Nice To Have
Industry experience on a post-training or evaluation/benchmarking team (highest priority).
Publications at top-tier venues (NeurIPS, ICML, ACL), especially in evaluation or benchmarking.
Experience building or running LLM evaluations, benchmarks, or failure-analysis pipelines.
Experience with synthetic data generation, rubric design, or RL-style workflows that use evals for reward shaping.
Work samples or code (e.g., eval frameworks, benchmark suites, failure-analysis reports or tooling) that demonstrate relevant skills.
Benefits
Generous equity grant vested over 4 years
A $10K housing bonus (if you live within 0.5 miles of our office)
A $1.5K monthly stipend for meals
Free Equinox membership
Health insurance
2026-03-19 8:16
Field Engineering Manager, Public Sector
Scale AI
5000+
United States
Full-time
Remote
Role Overview
Scale’s rapidly growing Global Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of:
Creating custom AI applications that will impact millions of citizens
Generating high-quality training data for national LLMs
Upskilling and advisory services to spread the impact of AI
As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners.
At Scale, we’re not just building AI solutions—we’re enabling the public sector to transform their operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you.
You will:
Own the production outcome: Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies.
Ensure Full-Stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment.
Scale the feedback loop: Build automated systems to monitor model performance and data drift across geographically dispersed environments, ensuring the right levels of reliability.
Navigate global compliance: Manage the technical lifecycle within diverse regulatory frameworks.
Incident command: Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again.
Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials.
Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases.
Ideally, you have:
Experience: 6+ years in a high-impact technical role (SRE, FDE or MLOps) with experience in the public sector.
Global perspective: Familiarity with international government security standards and the complexities of deploying sovereign AI.
System architecture proficiency: Proven experience maintaining production-grade applications with a deep understanding of the full request lifecycle, connecting frontend/API layers to the backend and AI core.
Modern AI Stack expertise: Proficiency in coding and the modern AI infrastructure, including Kubernetes, vector databases, agentic development, and LLM observability tools.
Ownership: You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them.
Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy.
Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it.
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at accommodations@scale.com. Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants’ needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.
2026-03-19 8:01
Mechanical Engineer & Python Expert - Freelance AI Trainer
Mindrift
1001-5000
$12 / hour
India
Part-time
Remote
Please submit your CV in English and indicate your level of English proficiency.
Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project-based, not permanent employment.
What this opportunity involves
While each project involves unique tasks, contributors may:
Design graduate- and industry-level mechanical engineering problems grounded in real practice.
Evaluate AI-generated solutions for correctness, assumptions, and engineering logic.
Validate analytical or numerical results using Python (NumPy, SciPy, Pandas).
Improve AI reasoning to align with first principles and accepted engineering standards.
Apply structured scoring criteria to assess multi-step problem solving.
What we look for
This opportunity is a good fit for mechanical engineers with Python experience who are open to part-time, non-permanent projects. Ideally, contributors will have:
Degree in Mechanical Engineering or related fields, e.g. Thermodynamics, Fluid Mechanics, Mechanical Design, Computational Mechanics, etc.
3+ years of professional mechanical engineering experience
Strong written English (C1/C2)
Strong Python proficiency for numerical validation
Stable internet connection
Professional certifications (e.g., PE, CEng, PMP) and experience in international or applied projects are an advantage.
How it works
Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid
Project time expectations
For this project, tasks are estimated to require around 10–20 hours per week during active phases, based on project requirements. This is an estimate, not a guaranteed workload, and applies only while the project is active.
Payment
Paid contributions, with rates up to $12/hour*
Fixed project rate or individual rates, depending on the project
Some projects include incentive payments
*Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.
2026-03-18 11:32
