Research Intern – Reinforcement Learning (RL)
Design and build reinforcement learning environments that model real-world customer interaction workflows. Design reinforcement learning agents that learn from these environments using real-world interaction data, rewards, and feedback loops. Define reward models and feedback loops using real-world signals (outcomes and human feedback). Enable learning from production data by structuring interaction traces into training-ready datasets for offline and online learning. Experiment with multi-agent systems and simulation frameworks for complex coordination and decision-making. Collaborate with engineering and product teams to deploy, evaluate, and iterate on learning systems in production at scale.
Recruiting Programs & Operations Manager
Lead the research and development of novel deep learning algorithms that enable robots to perform complex, contact-rich manipulation tasks. Explore the intersection of computer vision and robotic control by designing systems that allow robots to perceive and interact with objects in dynamic environments. Create models integrating visual data to guide physical manipulation beyond simple grasping to sophisticated handling of diverse items. Collaborate with a multidisciplinary team to translate concepts into robust capabilities deployable on physical hardware for industrial applications. Research and develop deep learning architectures for visual perception and sensorimotor control in contact-rich scenarios. Design algorithms enabling robots to manipulate complex or deformable objects with high precision. Collaborate with software engineers to optimize and deploy research prototypes onto physical robotic hardware. Evaluate model performance in simulation and real-world environments to ensure robustness and reliability. Identify opportunities to apply state-of-the-art advancements in computer vision and robot learning to practical industrial problems. Mentor junior researchers and contribute to the technical direction of the manipulation research roadmap.
Research Scientist, Agent Robustness
As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications for international government partners: supporting end-to-end system reliability, ensuring real-time inference observability, handling sovereign data orchestration, integrating high-security software, and maintaining resilient cloud infrastructure. You will take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies. You will oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and full-stack components (APIs to UI) to maintain a responsive, production-ready environment. You will build automated systems to monitor model performance and data drift across geographically dispersed environments, manage the technical lifecycle within diverse regulatory frameworks, and lead incident response for production issues in mission-critical environments, with rapid resolution and prevention guardrails. You will also translate technical performance metrics into clear insights for senior international government officials and partner with Engineering and ML teams to incorporate field lessons into the technical architecture and decisions of future use cases.
Safety Research Internship (Spring/Summer 2026)
As a Cohere Research Intern, you will conduct cutting-edge machine learning research, training and evaluating production large language models. You will focus on research projects aimed at making models better understood, safer, more reliable, more inclusive, and more beneficial for the world. You will disseminate your research results through the production of publications, datasets, and code. Additionally, you will contribute to research initiatives that have practical applications in Cohere's product development. The internship involves collaborating with the Modelling Safety team on implementing novel research ideas related to fairness, safety (including for multiple languages, dialects, and cultural contexts), robustness, generalisation, interpretability, safety for agents with complex read/write actions, and safety for codegen. The project details and topic will be designed collaboratively between the intern and the team, with a goal to publish a paper in a top venue and contribute to open science. The internship may be remote or onsite, with no relocation or housing provided.
AI Researcher
You will work across the model development loop, from research questions to training runs to evaluation. This includes designing and testing architecture changes and training regimes for large language models; running controlled experiments at scale and isolating causal effects; studying failure modes in reasoning, generalisation, robustness, and representation; shaping objectives, data mixtures, and optimisation choices that influence model behaviour; building and refining evaluations that measure capability and reliability; analysing training dynamics using logs, metrics, and model outputs; collaborating with ML systems engineers on distributed training and training operations; and writing clear internal notes that turn experimental results into design decisions. You will spend substantial time in code, training runs, logs, and evaluation outputs, with the goal of clarity about what improves the model and why. You will work hands-on, with code as a primary tool for thinking: moving between theory and implementation quickly and precisely, preferring controlled experiments over broad sweeps, using logs, metrics, and model behaviour to guide decisions, and working closely with engineering counterparts to scale and validate ideas.
Senior AI Researcher - Reinforcement Learning (f/m/d)
As a Senior AI Researcher for reinforcement learning, you will shape and improve the underlying reinforcement learning methodology, maintain a high-quality training codebase, and conduct large-scale experiments to improve performance benchmarks. Your responsibilities include conducting large-scale LLM training runs, analyzing evaluation scores, and proposing and implementing improvements; staying at the forefront of reinforcement learning research by identifying and iterating on novel approaches; optimizing RL training loops to scale training infrastructure; and collaborating cross-functionally with other post-training teams to convert feedback into actionable training signals for measurable improvements in performance.
Research Scientist, PhD
Conduct original research to advance the state of the art in machine learning and artificial intelligence. Design, implement, and evaluate novel algorithms, models, or training approaches at large scale. Collaborate with researchers and engineers to translate research insights into production systems and real-world applications.
Machine Learning Researcher, Audio
As a Machine Learning Researcher at Bland, you will carry out foundational research and development across core components of the voice stack: speech-to-text, large language models, neural audio codecs, and text-to-speech. The role involves building and scaling next-generation text-to-speech (TTS) systems by designing and training large-scale TTS models for expressive, controllable, human-sounding output; developing neural audio codec-based TTS architectures for efficient, high-fidelity generation; improving prosody modeling, question inflection, emotional expression, and multi-speaker robustness; and optimizing real-time, low-latency inference in production. It also includes advancing speech-to-text modeling by building and fine-tuning large-scale automatic speech recognition (ASR) systems robust to accents, noise, telephony artifacts, and code-switching; leveraging self-supervised pretraining and large-scale weak supervision; and improving transcription accuracy for enterprise scenarios. Responsibilities extend to pioneering neural audio codecs by researching and implementing codecs that achieve extreme compression with minimal perceptual loss, exploring discrete and continuous latent representations, and designing codec architectures for downstream generative modeling and controllable synthesis. The role also involves developing scalable training pipelines by curating and processing massive audio datasets, designing staged training curricula and data filtering strategies, and scaling training across distributed GPU clusters with a focus on cost, throughput, and reliability. Finally, it involves running rigorous experiments: designing ablation studies to isolate architectural impacts, measuring improvements via objective metrics and perceptual evaluations, and validating ideas quickly with focused experiments.
Senior Research Engineer
As a Senior Research Engineer at Decagon, you will be responsible for building industry-leading conversational AI models, taking them from idea to production. Your role includes leading research and engineering efforts to improve core conversational capabilities in production such as instruction following, retrieval, memory, and long-horizon task completion. You will build and iterate on end-to-end models and pipelines focusing on quality, efficiency, and user experience. Collaboration with platform and product engineers to integrate new models into production systems is essential. Additionally, you are expected to break down ambiguous research ideas into clear, iterative milestones and roadmaps.
Member of Technical Staff - Alignment Lead
Drive the entire alignment stack, including instruction tuning, RLHF, and RLAIF, to push the model toward high factual accuracy and robust instruction following. Lead research efforts to design next-generation reward models and optimization objectives that improve human preference performance. Curate high-quality training data and design synthetic data pipelines addressing complex reasoning and behavioral gaps. Optimize large-scale reinforcement learning pipelines for stability and efficiency, enabling rapid model iteration cycles. Collaborate closely with pre-training and evaluation teams to create feedback loops that translate alignment research into generalizable model improvements.
