Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases to push AI coding systems to their limits by reviewing and refining realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources. Write comprehensive functional tests that validate actual end-to-end behavior and edge cases. Craft challenges that are fair but hard, where the AI has all the context it needs, requiring complex reasoning with information scattered across files and external sources. Analyze AI failures to understand the model's struggles and strengths. Iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases that push AI coding systems to their limits by reviewing and refining realistic coding tasks based on provided production codebases. Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, craft fair but hard challenges requiring complex reasoning and scattered information, analyze AI failures to understand model strengths and weaknesses, and iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Freelance AI Evaluation Engineer (Python/Full-Stack)
You will create challenging coding test cases to push AI coding systems to their limits by reviewing and refining realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources. You will write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks. You are to craft "fair but hard" challenges where the AI has all the necessary context but must work through scattered information and complex reasoning. Additionally, you will analyze AI failures to understand what the model struggles with versus what it masters, and iterate your work based on feedback from expert QA reviewers who score your work on seven quality criteria.
Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases that push AI coding systems to their limits; review and refine realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources; write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks; craft "fair but hard" challenges where the AI has all the context it needs but has to work for it (information scattered across files and external sources, complex reasoning required); analyze AI failures to understand what the model struggles with versus what it masters; iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Freelance AI Evaluation Engineer (Python/Full-Stack)
Create challenging coding test cases that push AI coding systems to their limits by reviewing and refining realistic coding tasks based on provided production codebases with realistic scope, requirements, and information sources. Write comprehensive functional tests that validate actual end-to-end behavior and edge cases, not just superficial checks. Craft "fair but hard" challenges where the AI has all the context it needs but must work for it, with information scattered across files and external sources requiring complex reasoning. Analyze AI failures to understand what the model struggles with versus what it masters. Iterate based on feedback from expert QA reviewers who score work on seven quality criteria.
Enterprise Account Executive (New York City)
Debug and fix issues in the platform and ship PRs with fixes. Build internal tools and copilots powered by generative AI to enhance team capabilities. Rapidly prototype proof-of-concepts for customer use cases. Work collaboratively across Engineering, Product, and Solutions teams to unblock customers and advance AI adoption.
Associate Forward Deployed Engineer
The Associate Forward Deployed Engineer works alongside senior engineers and directly with customers, who are leading AI labs, to solve pressing technical challenges. The role involves supporting the design and deployment of solutions that impact customer workflows and model performance, working across the stack, shipping quickly, and iterating based on real-world feedback. Tasks include partnering with AI labs and internal teams to identify needs and gather requirements; supporting solution design; prototyping and deploying tailored solutions for custom customer requests; tackling complex technical problems with support from senior engineers; taking increasing ownership from concept through deployment; rapidly prototyping, testing, and iterating on tools in response to real-time feedback; contributing to architectural discussions; helping establish best practices for reliability, scalability, and security; and documenting solutions to create technical playbooks for repeatable, scalable deployment processes.
Full Stack Engineer - AI-Driven Development
The Full Stack Engineer is responsible for configuring and developing "skills" for foundation models such as Claude and Gemini to automate content generation, marketing flows, and business processes. They develop and scale applications within the JavaScript/TypeScript ecosystem using React, integrating AI agents and automation. They create automated content generation systems for various platforms, including Substack, blogs, social media, and personalized email marketing. They design and implement sophisticated workflow automations using platforms like Make.com or n8n to bridge data between systems such as EHR tools like Cerbo. The role includes managing PostgreSQL databases (Neon/AWS) and deployment within a Linux/Bash-heavy environment. Additionally, they help implement and refine voice AI solutions for call overflows and appointment scheduling, for example via GoHighLevel integrations. They manage and deploy applications on AWS (EC2) with a focus on security and HIPAA compliance, configure and implement AI skills across various foundation models, collaborate with AI automation experts to design scalable, efficient solutions, and establish coding standards and best practices for AI-driven development.
Deployed Engineer (Boston)
Co-architect and co-build production AI agents with customer engineering teams; own the technical win in pre-sales by designing POCs, answering deep technical questions, and guiding evaluations; help customers deploy and operate agent-based applications such as conversational agents, research agents, and multi-step workflows; advise customers post-sale on architecture, best practices, and roadmap-level decisions; run technical demos, trainings, and workshops for developer audiences; surface field feedback and contribute reusable patterns, cookbooks, and example code that scale across customers; occasionally contribute code upstream when it meaningfully improves customer outcomes.
Deployed Engineer (Southeast)
Co-architect and co-build production AI agents with customer engineering teams; own the technical win in pre-sales by designing POCs, answering deep technical questions, and guiding evaluations; help customers deploy and operate agent-based applications such as conversational agents, research agents, and multi-step workflows; advise customers post-sale on architecture, best practices, and roadmap-level decisions; run technical demos, trainings, and workshops for developer audiences; surface field feedback and contribute reusable patterns, cookbooks, and example code that scale across customers; and occasionally contribute code upstream when it meaningfully improves customer outcomes.
