AI Data Engineer Jobs

Discover the latest remote and onsite AI Data Engineer roles across top active AI companies. Updated hourly.


Check out 208 new AI Data Engineer opportunities posted on AI Chopping Block


Senior Data Engineer

New

Top rated

HackerOne – Full-time – Posted Mar 26, 2026 14:10

The Senior Data Engineer at HackerOne is responsible for leading the end-to-end design and delivery of scalable, secure, and intelligent data products and solutions to support the company's transformation into an AI-first organization. This role involves partnering across business and engineering teams to identify opportunities for automation, integration, and system modernization, driving the architecture and execution of platform-level capabilities by leveraging AI and modern tooling to reduce manual effort, improve decision-making, and increase system resilience. The engineer will provide technical leadership to internal engineers and external development partners to ensure design quality, operational excellence, and long-term maintainability, shape and contribute to incident and on-call response strategies, playbooks, and processes to build systems that fail gracefully and recover quickly, mentor other engineers and advocate for technical excellence, and promote a culture of innovation and continuous improvement. Additionally, the role includes championing effective change management to ensure systems are successfully launched, adopted, understood, and evolved.

₹3,672,000 – ₹4,131,000

Undisclosed

Per year (INR)

Pune, India

Maybe global

Onsite


Staff Data Warehouse Engineer

New

Top rated

Together AI – Full-time – Posted Mar 12, 2026 8:14

As an AI Infrastructure Engineer at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You participate in the on-call rotation (PagerDuty) to respond to production incidents. You build and run infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users. You build monitoring systems to ensure the highest quality of service for customers. You design and implement operational processes such as deployments and upgrades. You debug production issues across all services and levels of the stack. You identify improvements to the product architecture from the reliability, performance, and availability perspectives. You plan the growth of Together AI's infrastructure.

$190,000 – $270,000

Undisclosed

Per year (USD)

San Francisco

Maybe global

Onsite


Senior AI Data Pipeline Engineer

New

Top rated

42dot – Full-time – Posted Feb 9, 2026 15:53

Design and build high-performance, scalable data pipelines to support diverse AI and Machine Learning initiatives across the organization. Architect and implement multi-region data infrastructure to ensure global data availability and seamless synchronization. Develop flexible pipeline architectures that allow for complex branching and logic isolation to support multiple concurrent AI projects. Optimize large-scale data processing workloads using Databricks and Spark to maximize throughput and minimize processing costs. Maintain and evolve the containerized data environment on Kubernetes, ensuring robust and reliable execution of data workloads. Collaborate with AI researchers and platform teams to streamline the flow of high-quality data into training and evaluation pipelines.

Undisclosed


Pangyo, South Korea

Maybe global

Remote


Member of Technical Staff - Data Ingestion Engineer

New

Top rated

Reflection – Full-time – Posted Jan 15, 2026 0:34

The role involves building and operating large-scale data ingestion systems for pre-training, including web crawling, extraction, and dataset delivery. The engineer will run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs. They will analyze ingested data to identify gaps, redundancy, and areas for improvement. Responsibilities also include building ingestion pipelines that scale reliably across large data campaigns, developing specialized crawlers for high-priority data sources, reviewing code, debugging production issues, and continuously improving the ingestion infrastructure. The role requires close collaboration with pre-training and data quality teams and working directly with researchers to link data collection to model performance.

Undisclosed


San Francisco, United States

Maybe global

Onsite


Software Engineer, Distributed Data Systems

New

Top rated

Exa – Full-time – Posted Dec 19, 2025 14:24

As a Data Engineer, you will architect and build the data infrastructure that powers all company operations, including crawling billions of pages, training embedding models, and serving real-time search. You will have autonomy in designing systems that scale to hundreds of petabytes. Responsibilities include designing lakehouse architectures, building and operating large-scale distributed data processing pipelines, creating streaming pipelines for real-time indexing, architecting data layers for embedding training infrastructure, and scaling deployments to handle analytical queries across petabytes of data.

$150,000 – $300,000

Undisclosed

Per year (USD)

San Francisco, United States

Maybe global

Onsite


Data Engineer – Spark Specialist

New

Top rated

Dataiku – Full-time – Posted Dec 12, 2025 17:55

Help users discover and master the Dataiku platform through user training, office hours, demos, and ongoing consultative support. Analyse and investigate various kinds of data and machine learning applications across industries and use cases. Provide strategic input to the customer and account teams that help make customers successful. Scope and co-develop production-level data science projects with customers. Mentor and help educate data scientists and other customer team members to aid in career development and growth.

Undisclosed


Maybe global

Hybrid


Data Engineer

New

Top rated

Artisan AI – Full-time – Posted Dec 4, 2025 1:10

The Data Engineer will design, build, and maintain data pipelines, manage data ingestion, and develop reliable data models to support AI and ML workflows. The role also involves close collaboration with ML and product teams to ensure clean, structured, and high-quality data delivery for analytics and product features.

Undisclosed


Maybe global

Onsite


AI Pilot Vibe Coding Assistant (Freelance)

New

Top rated

Mindrift – Part-time, Full-time – Posted Dec 1, 2025 6:00

AI Pilot Vibe Coding Assistants collaborate with AI-driven systems to generate, refine, and submit accurate, well-structured outputs based on complex prompts. They handle coding, automation, data processing, troubleshooting technical issues, and improving AI output quality across diverse domains.

Undisclosed

Per hour (USD)

Maybe global

Remote only


Data Engineer

New

Top rated

Replit – Full-time – Posted Nov 27, 2025 6:01

The Data Engineer will design, build, and maintain scalable data pipelines to support analytics and data-driven decision making at Replit. They will collaborate across teams to deliver ETL/ELT workflows, ensure data quality, and build unified data models for in-depth analysis.

Undisclosed

Per year (USD)

Maybe global

Hybrid


Data Operations Manager

New

Top rated

Greenlite AI – Full-time – Posted Nov 25, 2025 6:35

Build and scale data and financial operations to support deployment and growth of AI agents for major institutional clients. Take ownership of billing, collections, data infrastructure, dashboards, and cross-functional operations to provide actionable, real-time visibility to business leaders.

Undisclosed

Per year (USD)

Maybe global

Onsite

Want to see more AI Data Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.

Join our community

(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI Data Engineer jobs?


What does an AI Data Engineer do?

AI Data Engineers build and manage data pipelines specifically for AI and machine learning models. They design architectures that process diverse data types such as text, images, and videos for model consumption. Their daily work includes implementing data validation systems, ensuring quality, and integrating large-scale datasets from multiple sources. They create real-time data workflows, work with vector search tools like FAISS or Milvus, and optimize the performance of AI data infrastructure. Using tools like Python, SQL, Apache Spark, and Airflow, they collaborate with data scientists and ML engineers to transform raw data into formats that support model training and deployment.

What skills are required for AI Data Engineer jobs?

Strong programming skills in Python and SQL form the foundation for AI Data Engineer roles. Proficiency with data engineering frameworks like Apache Spark, Airflow, and Ray is essential for building robust pipelines. Experience with cloud platforms (AWS, GCP, Azure) and vector databases enables handling of AI-specific data needs. Skills in data quality assurance, monitoring, and error handling ensure reliable AI systems. Engineers should understand embedding techniques for unstructured data processing and have experience with ETL processes at scale. Soft skills like cross-functional collaboration are valuable as these roles bridge technical teams with AI scientists and business stakeholders.

What qualifications are needed for AI Data Engineer jobs?

Most AI Data Engineer positions require a bachelor's degree in computer science, data engineering, or related technical fields, with many employers preferring master's degrees for senior roles. Hands-on experience building data pipelines for machine learning applications is crucial. Employers look for demonstrated expertise with cloud data services like Redshift, BigQuery, or Snowflake, and familiarity with MLOps practices. Knowledge of data preprocessing techniques for unstructured data (text, images, videos) sets successful candidates apart. Professional certifications in cloud platforms or data technologies can strengthen qualifications, especially when combined with proven experience integrating large-scale datasets for AI workflows.

What is the salary range for AI Data Engineer jobs?

Compensation for AI Data Engineers varies based on several key factors. Location significantly impacts pay, with tech hubs like San Francisco and New York offering higher salaries than smaller markets. Experience level creates substantial differences, with senior engineers commanding significantly more than entry-level positions. Specialized skills in emerging AI tools, vector databases, and specific cloud platforms can increase earning potential. Company size also matters: large tech companies and well-funded AI startups often pay premium rates. The specialized nature of preparing data for AI applications typically positions these roles at higher compensation levels than traditional data engineering positions with similar years of experience.

How long does it take to get hired as an AI Data Engineer?

The hiring timeline for AI Data Engineers typically spans 4-8 weeks from application to offer. The process usually includes an initial resume screening, followed by a technical phone interview covering Python, SQL, and data pipeline concepts. Candidates then face 1-3 rounds of technical interviews focusing on data engineering problems, system design for AI workflows, and coding exercises. Some companies add take-home assignments demonstrating pipeline building for AI data. Final rounds often include discussions with potential team members and hiring managers. Specialized skills in AI data preprocessing and experience with vector databases can accelerate the process, especially for candidates with proven experience in similar roles.

Are AI Data Engineer jobs in demand?

AI Data Engineer positions show strong demand as organizations build infrastructure for AI initiatives. This specialized role bridges traditional data engineering and AI needs, with job postings appearing at major institutions like Stanford and companies like OpenAI. The role is gaining recognition as essential for AI implementation success, particularly as companies scale their machine learning operations. Demand stems from the unique requirements of AI data pipelines, which differ significantly from traditional analytics infrastructure. Organizations need engineers who understand the specific data preprocessing needs of machine learning models and can build robust pipelines for handling diverse data types including text, images, and videos.

What is the difference between AI Data Engineer and Data Engineer?

While both roles build data pipelines, AI Data Engineers specifically focus on preparing data for machine learning and AI systems rather than business analytics. They work extensively with unstructured data (text, images, videos), implementing specialized preprocessing techniques that traditional Data Engineers rarely handle. AI Data Engineers commonly use vector search tools like FAISS and embedding libraries that aren't typical in standard data engineering. They must understand model training data requirements and build infrastructure supporting model deployment. Traditional Data Engineers concentrate on structured data flows, data warehousing, and analytics support, while AI Data Engineers create pipelines optimized for machine learning with features like data versioning, lineage tracking, and real-time AI-ready data delivery.