Location
San Francisco, United States
Salary
Undisclosed
Date posted
June 10, 2026
Job type
Full-time

Job Description

The Role

We're seeking a Data Engineer to transform large-scale geospatial datasets into structured, reliable, and accessible formats that power Mach9's ML and product pipelines. You'll work with high-volume data sources — laser scan point clouds, imagery, and a long tail of geospatial formats — and own the systems that get them ingested, standardized, stored, and made available for training, perception, and production use in a consistent and efficient way.

This role sits at the front of everything we do: our models are only as good as the data feeding them, and you'll be the one making that data trustworthy at scale.

Responsibilities

  • Develop and maintain scalable, reproducible workflows for ingesting and processing large volumes of point cloud, imagery, and geospatial data.

  • Convert datasets from various sensor providers into Mach9's standardized internal formats.

  • Build CI/CD pipelines and automated checks that guarantee the correctness and consistency of data pipelines, including regression detection on dataset processing.

  • Optimize processing performance, query speed, and storage efficiency across large geospatial datasets.

  • Work closely with the customer success team to efficiently resolve issues and unblock customer projects.

    • Build and maintain an agentic harness for automated dataset triage and code patching. It should automatically propose or apply fixes and escalate when human judgment is needed.

  • Work closely with ML and product teams to make data readily usable for training, inference and visualization.

  • Work closely with customers and data-provider partners to facilitate data integration (with occasional travel).

  • Puzzle-hunting: work with data formats that have sparse or missing documentation.

Requirements

  • Strong software development, problem-solving, and debugging skills, with hands-on experience building production systems in Python.

  • Solid foundation in distributed systems and parallel computing.

  • Comfort operating with ambiguity — able to dig into undocumented or messy data formats, reverse-engineer how they work, and make steady progress without a clear spec.

  • Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching.

  • Strong communication and collaboration skills, with the ability to work across ML, product, and customer-facing teams.

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.

Bonus qualifications

  • Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching.

  • Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf).

  • Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3).

  • Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch).

  • Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms).

  • Experience building data versioning, lineage, or artifact-tracking systems.

  • Experience operating data pipelines that feed ML training and inference.

  • Familiarity with C++.

Mach9 is hiring a Data Engineer. Apply through The AI Chopping Block and make the next move in your career!
Company size
11-50 employees
Founded in
2021
Headquarters
San Francisco, CA, United States
Country
United States
Industry
Computer Software
