Salary
Undisclosed
Date posted
April 10, 2026
Job type
Full-time
Experience level
Mid Level

Job Description

Your Charter

  • Data at Scale: Own the pipelines and storage systems that feed petabyte-scale multimodal datasets into model training.

  • Sustainable Platforms: Build tooling and systems that are automated and efficient, enabling processing at scale and handling many small heterogeneous datasets.

Required Skillsets

  • Data Engineering: Knowledge of Python ETL pipelines and supporting infrastructure, data formats, and storage systems at scale.

  • ML Data Ops: Experience managing datasets, annotations, and data versioning for model training.

  • Basic ML Knowledge: A solid grasp of ML fundamentals is essential for collaborating effectively with researchers and making sound data platform decisions.

  • Agentic Engineering: Skilled at writing high-quality specifications for AI agents, while maintaining effective human review of AI-generated work.

Responsibilities

  • Design, automate, maintain, and optimize Python ETL pipelines (Spark/Ray) for large-scale multimodal data.

  • Build and maintain data cataloging, lineage, quality tooling, integrity verification, access controls, and lifecycle management systems.

  • Provide guidance, internal tools, and documentation to colleagues on data best practices.

  • Serve as a custodian of the company’s datasets, ensuring overall data health, quality, and discoverability.

Challenges You'll Tackle

  • Implement high-performance, multimodal data pipelines capable of processing petabyte-scale datasets on tens of thousands of CPUs and hundreds of GPUs.

  • Evolve data formats, storage, and processing to keep pace with cutting-edge AI advancements, while maintaining backward compatibility.

  • Scale data infrastructure to handle the next order of magnitude in growth.

  • At the same time, keep the data platform flexible enough to quickly handle many small heterogeneous datasets and ad hoc analytics queries.

Traits of the Ideal Candidate

  • High agency and ownership: proactively picks up new work according to priority, manages their own backlog, and escalates early when priorities are unclear or deadlines are at risk.

  • Takes responsibility for validating inputs end-to-end: spot-checks data, understands upstream preprocessing, and speaks up when something doesn't add up.

  • Takes responsibility for ensuring outputs are correct and handed over: actively seeks sign-off from downstream consumers, communicates caveats, and ensures relevant stakeholders are aware of changes and breaking impacts.

  • Cares about continuously improving pipelines, tooling, and processes so that each iteration makes the next one faster, more reliable, and easier for the team.

  • Comfortable with rapid, pragmatic solutions when needed, but committed to high-quality, long-term solutions.

What we offer (compensation & benefits)

  • Competitive salary and equity

  • Private health coverage

  • Pension contribution (UK, Canada, US)

  • Fully-distributed, async-first culture

  • Hardware setup of your choice

  • Stipends for phone, internet, and meals

On our team, we approach our work with a dedication similar to that of Olympic athletes. Expect occasional late nights and weekends dedicated to our mission. We understand this level of commitment may not suit everyone, and we communicate this expectation openly.

If you're motivated by deeply technical problems, a seemingly never-ending uphill battle, and the opportunity to build (and own) a generational technology company, we can give you what you're looking for.

All business roles at Moonvalley are hybrid positions by default, with some fully remote depending on the job scope. As a company, we meet a few times every year, usually in London, UK, or North America (LA, Toronto).

If you're excited about the opportunity to work on cutting-edge AI technology and help shape the future of media and entertainment, we encourage you to apply. We look forward to hearing from you!

The statements contained in this job description reflect general details as necessary to describe the principal functions of this job, the level of knowledge and skill typically required, and the scope of responsibility. They should not be considered an all-inclusive listing of work requirements. Individuals may perform other duties as assigned, including work in other functional areas to cover absences, to equalize peak work periods, or to otherwise balance organizational work.

Moonvalley AI is proud to be an equal opportunity employer. We are committed to providing accommodations. If you require accommodation, we will work with you to meet your needs.

Please be assured we'll treat any information you share with us with the utmost care, use it only for recruitment purposes, and never sell it to other companies for marketing purposes. Please review our privacy policy and job applicant privacy policy located here for further information.

Moonvalley is hiring a Member of Technical Staff (Data): World Models. Apply through The AI Chopping Block and make the next move in your career!
Company size
101-200 employees
Founded in
2023
Headquarters
London, United Kingdom
Country
United Kingdom
Industry
Computer Software
