AI Applied Research Scientist Jobs

Discover the latest remote and onsite AI Applied Research Scientist roles across top active AI companies. Updated hourly.

Check out 59 new AI Applied Research Scientist opportunities posted on AI Chopping Block

Researcher, Context - Agent Post-Training

New
Top rated
OpenAI
Full-time

The Context Researcher on the Agent Post-Training team designs and runs experiments to improve the scaling of compute on context. The researcher will own end-to-end improvements to the post-training stack, including reinforcement learning, data pipelines, graders, reward signals, evaluations, diagnostics, and model-behavior analysis. Responsibilities include building evaluations and environments to identify model failures and turning those failures into training data, product fixes, or new research directions. The researcher will partner with Codex and ChatGPT product teams to translate product signals into model improvements and work on early-training and alignment interventions such as data mixtures, objectives, synthetic data, and evaluation loops to shape downstream agent behavior. The role also involves deciding which integrations, capabilities, and fixes are ready for major model runs and improving the machinery for large-scale training and launch, including experiment velocity, reliability, observability, reproducibility, cost, latency, and production readiness. The researcher will take on cross-functional projects involving model training, product infrastructure, and the production agent harness, and will debug failures in shipped or near-shipped models by turning qualitative behaviors into hypotheses, experiments, and fixes.

$250,000 – $380,000 per year (USD)

San Francisco, United States
Maybe global
Remote

Researcher, Connectors - Agent Post-Training

New
Top rated
OpenAI
Full-time

As a member of Agent Post-Training, Connectors, you will teach models how to interface with professional software using code, helping train agents to use code, APIs, tools, and structured integrations to operate across applications like Slack, Google Workspace, GitHub, Notion, Linear, Salesforce, and other core systems. You will design and run experiments to improve agentic model behavior for complex software and plugins, own end-to-end improvements to the post-training stack including RL, data pipelines, graders, reward signals, evaluations, diagnostics, and model behavior analysis, and build evaluations and environments that expose model failures to turn those failures into training data, product fixes, or new research directions. You will partner with product teams to understand user needs and translate product signals into model improvements, work on early-training and alignment interventions such as data mixtures, objectives, synthetic data, and evaluation loops, and decide which integrations and capabilities to include in major model runs. Additionally, you will improve large-scale training and launch infrastructure for experiment velocity, reliability, observability, reproducibility, cost, latency, and production readiness, take on cross-functional projects touching model training, product infrastructure, and the production agent harness, and debug failures in shipped or near-shipped models to develop concrete hypotheses, experiments, and fixes.

$250,000 – $380,000 per year (USD)

San Francisco, United States
Maybe global
Remote

Researcher, Computer Use - Agent Post-Training

New
Top rated
OpenAI
Full-time

As a member of Agent Post-Training, Computer Use, you will teach models to operate computers, helping to train models that can navigate browsers and desktops, use tools and applications, reason through complex workflows, collaborate with users and other agents, and complete long-horizon tasks with reliability and judgment. Responsibilities include designing and running experiments to improve agentic model behavior for complex computer use and owning end-to-end improvements to the post-training stack, including reinforcement learning, data pipelines, graders, reward signals, evaluations, diagnostics, and model-behavior analysis. You will build evaluations and environments to identify model failures and convert those failures into training data, product fixes, or research directions. The role involves partnering with product teams to understand user needs and translate product signals into model improvements, working on early-training and alignment interventions, deciding on suitable integrations and fixes for major model runs, and improving large-scale training and launch machinery for experiment velocity, reliability, observability, reproducibility, cost, latency, and production readiness. You will also handle cross-functional projects involving model training, product infrastructure, and the production agent harness, debug failures in shipped or near-shipped models, and turn qualitative model behavior into concrete hypotheses, experiments, and fixes.

$250,000 – $380,000 per year (USD)

San Francisco, United States
Maybe global
Onsite

Researcher, Artifacts - Agent Post-Training

New
Top rated
OpenAI
Full-time

As a member of Agent Post-Training, Artifacts, you will train frontier models to produce polished, useful work products such as documents, spreadsheets, slide decks, dashboards, reports, analyses, and other interactive or editable artifacts. Responsibilities include designing and running experiments to improve agentic model behavior for complex software and plugins and owning end-to-end improvements to the post-training stack, including reinforcement learning, data pipelines, graders, reward signals, evaluations, diagnostics, and model-behavior analysis. The role involves building evaluations and environments to identify new model failures and converting these failures into training data, product fixes, or new research paths. Collaboration with Codex and ChatGPT product teams to translate product signals into model improvements is required. Other duties include working on early-training and alignment interventions, deciding integration and capability readiness for major model runs, improving machinery for large-scale training and launch with respect to experiment velocity, reliability, observability, reproducibility, cost, latency, and production readiness, and undertaking cross-functional projects that involve model training, product infrastructure, and production agent systems. Debugging hard failures in shipped or near-shipped models and transforming qualitative behaviors into hypotheses, experiments, and fixes are also part of the role.

$250,000 – $380,000 per year (USD)

San Francisco, United States
Maybe global
Remote

Psychometrician (Internship)

New
Top rated
MakiPeople
Full-time

As a psychometrician intern, you will help design, validate, and improve AI-powered assessments, working at the crossover between psychometrics and machine learning. Your day-to-day duties include building and validating psychometric assessments designed for AI integration, running analyses on item quality, fairness, reliability, and scoring accuracy, helping develop and refine automated scoring algorithms, exploring how large language models (LLMs) can be used to generate and evaluate assessment content, translating findings into clear insights for clients and internal teams, and contributing to research reports and potentially academic publications.

Undisclosed


Paris, France
Maybe global
Remote

Senior Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)

New
Top rated
Mindrift
Part-time
Full-time

Build realistic consulting project environments by creating detailed project scenarios grounded in real engagement dynamics including industry context, financials, constraints, conflicting inputs, and incomplete information. Design structured consulting tasks for AI agents by breaking projects into discrete tasks that mirror real consulting work such as market sizing, commercial due diligence, cost optimization, growth strategy, operational diagnosis, and benchmarking. Define evaluation criteria and quality standards by developing grading frameworks, evaluation rubrics, and golden-answer solutions for each task to train and calibrate an LLM-based grading system that evaluates AI outputs at scale. This role is remote, project-based, and focused on analytical design and evaluation as an individual contributor.

$60 / hour (USD)

Czechia
Maybe global
Remote

Senior Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)

New
Top rated
Mindrift
Part-time
Full-time

Build realistic consulting project environments including detailed project scenarios grounded in real engagement dynamics such as industry context, financials, constraints, conflicting inputs, and incomplete information. Design structured consulting tasks for AI agents by breaking projects into discrete tasks that mirror real consulting work including market sizing, commercial due diligence, cost optimization, growth strategy, operational diagnosis, benchmarking, and more. Define evaluation criteria and quality standards by developing grading frameworks, evaluation rubrics, and golden-answer solutions for each task to train and calibrate an LLM-based grading system that evaluates AI outputs at scale. This role is remote, project-based, and focused on analytical design and evaluation as an individual contributor.

$60 / hour (USD)

United States
Maybe global
Remote

Senior Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)

New
Top rated
Mindrift
Part-time
Full-time

Build realistic consulting project environments by creating detailed project scenarios grounded in real engagement dynamics such as industry context, financials, constraints, conflicting inputs, and incomplete information. Design structured consulting tasks for AI agents by breaking projects into discrete tasks that mirror real consulting work including market sizing, commercial due diligence, cost optimization, growth strategy, operational diagnosis, benchmarking, and more. Define evaluation criteria and quality standards by developing grading frameworks, evaluation rubrics, and golden-answer solutions for each task, which are used to train and calibrate an LLM-based grading system that evaluates AI outputs at scale. This is a remote, project-based, individual-contributor role focused on analytical design and evaluation.

$60 / hour (USD)

United States
Maybe global
Remote

Senior Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)

New
Top rated
Mindrift
Part-time
Full-time

Build realistic consulting project environments by creating detailed project scenarios grounded in real engagement dynamics, such as industry context, financials, constraints, conflicting inputs, and incomplete information. Design structured consulting tasks for AI agents that mirror real consulting work, including market sizing, commercial due diligence, cost optimization, growth strategy, operational diagnosis, and benchmarking. Define evaluation criteria and quality standards by developing grading frameworks, evaluation rubrics, and golden-answer solutions for each task to train and calibrate an LLM-based grading system that evaluates AI outputs at scale. This is a remote, project-based, individual-contributor role focused on analytical design and evaluation.

$60 / hour (USD)

United States
Maybe global
Remote

Senior Consultant - AI Training & Evaluation (MBB & Top-Tier Firms)

New
Top rated
Mindrift
Part-time
Full-time

Build realistic consulting project environments by creating detailed project scenarios grounded in real engagement dynamics including industry context, financials, constraints, conflicting inputs, and incomplete information. Design structured consulting tasks for AI agents by breaking projects into discrete tasks that mirror real consulting work such as market sizing, commercial due diligence, cost optimization, growth strategy, operational diagnosis, benchmarking, and more. Define evaluation criteria and quality standards by developing grading frameworks, evaluation rubrics, and golden-answer solutions for each task, which are used to train and calibrate an LLM-based grading system that evaluates AI outputs at scale. This is a remote, project-based, individual-contributor role focused on analytical design and evaluation.

$60 / hour (USD)

United Kingdom
Maybe global
Remote

Want to see more AI Applied Research Scientist jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI Applied Research Scientist jobs?


What does an AI Applied Research Scientist do?

AI Applied Research Scientists lead research initiatives to develop new AI methodologies and algorithms. They design experiments, build prototypes, and create proof-of-concepts to test innovative AI systems. Their work involves implementing cutting-edge techniques in areas like computer vision or NLP, collaborating with engineers to transition research into production, and publishing findings in academic journals. These researchers bridge the gap between theoretical AI advancements and practical applications for specific domains.

What skills are required for an AI Applied Research Scientist?

Essential skills for this role include expertise in machine learning frameworks, proficiency in Python with libraries like PyTorch, LangChain, and Streamlit, and the ability to implement algorithms from scratch. Strong research design capabilities and problem-solving skills are crucial. Experience with deep learning, computer vision, or NLP is highly valued. Additionally, excellent communication abilities for interdisciplinary collaboration and technical documentation are necessary in AI research positions.

What qualifications are needed for an AI Applied Research Scientist role?

Most employers require a Master's degree at minimum, with a PhD preferred, in Computer Science, Electrical Engineering, or related technical fields. Candidates typically need at least 3 years of hands-on experience in AI/ML research and deep learning algorithms. Demonstrated expertise in specific domains like computer vision is often expected. Beyond academic credentials, the ability to handle ambiguous research areas and collaborate effectively across teams is essential.

What is the salary range for an AI Applied Research Scientist job?

Specific salary figures aren't available here, but AI Applied Research Scientist positions generally command premium compensation due to their specialized expertise and advanced education requirements. Salaries typically vary based on factors including location (with tech hubs paying more), years of research experience, publication history, domain specialization (like computer vision or NLP), and whether the role is in industry or academia.

How long does it take to get hired as an AI Applied Research Scientist?

The hiring process for AI Applied Research Scientist positions typically takes 1-3 months. It often involves multiple interview rounds including technical assessments, research presentations, and discussions with cross-functional teams. The timeline may extend if the role requires specialized domain expertise or if candidates need to demonstrate their research capabilities through sample projects. Educational requirements (PhD preferred) also lengthen the career preparation timeline considerably.

Are AI Applied Research Scientist jobs in demand?

Yes, AI Applied Research Scientist jobs are in high demand across industries as organizations seek experts who can translate theoretical AI advancements into practical applications. The specialized skill set combining deep technical expertise with implementation capabilities makes qualified candidates particularly valuable. While exact numbers aren't available, the position's critical role in developing new AI methodologies and bridging research-to-production gaps drives consistent hiring needs.