AI Infrastructure Engineer Jobs

Discover the latest remote and onsite AI Infrastructure Engineer roles at top AI companies. Updated hourly.

Check out 16 new AI Infrastructure Engineer opportunities posted on AI Chopping Block

Mixed-Signal IC Layout Design Engineer

New
Top rated
Tenstorrent
Full-time

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, P&R, STA, signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and ML-based techniques, collaborating closely with verification, extraction, timing, DFT, and EDA vendors.

$100,000 – $500,000 per year (USD)

Santa Clara, Austin, or Fort Collins, United States · Hybrid

AI Factory, Value Engineer

New
Top rated
Armada
Full-time

Translate business requirements into requirements for AI/ML models. Prepare data to train and evaluate AI/ML/DL models, and build models using state-of-the-art algorithms, especially transformers. Test and evaluate models, benchmark quality, and publish models and datasets. Deploy models in production by containerizing them, and work with customers and internal teams to refine model quality. Establish continuous-learning pipelines using online or transfer learning, and build and deploy containerized applications in cloud or on-premises environments.

$154,560 – $193,200 per year (USD)

United States · Remote

SOC Architect

New
Top rated
OpenAI
Full-time

Define the architecture and technical roadmap for custom SoCs targeted for edge applications. Drive system-level tradeoff analysis across compute, memory, interconnect, power, thermal, and cost constraints. Architect energy-efficient ML compute subsystems optimized for inference workloads and real-world deployment environments. Collaborate with internal hardware, software, systems, and product teams to align architecture with platform needs. Partner with external silicon vendors, IP providers, and manufacturing partners to execute development plans. Lead hardware/software co-design efforts to maximize performance per watt and end-to-end system efficiency. Guide implementation teams through microarchitecture, RTL development, validation, and bring-up phases. Operate effectively in agile development environments and help teams deliver against aggressive schedules and milestones.

$266,000 – $445,000 per year (USD)

San Francisco, United States · Onsite

Director of Customer Engineering

New
Top rated
Tenstorrent
Full-time

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, P&R, STA, signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and QoR. Optimize EDA tools and custom CAD flows using data-driven and ML-based techniques, in close collaboration with verification, extraction, timing, DFT, and EDA vendors.

$100,000 – $500,000 per year (USD)

Santa Clara, Austin, or Fort Collins, United States · Hybrid

Defense / Edge Tech Lead

New
Top rated
Deepgram
Full-time

As the Defense / Edge Tech Lead, you will own the technical direction for deploying Deepgram's speech-to-text (STT) and text-to-speech (TTS) models to edge and embedded environments. Your responsibilities include leading the technical strategy for edge deployment, defining the architecture for on-device, on-premises, and air-gapped inference across diverse hardware targets. You will optimize models for edge and embedded platforms through quantization, pruning, distillation, and runtime optimization to meet latency, memory, and power constraints. You will partner with hardware vendors like Qualcomm and Motorola for SDK integration, performance benchmarking, and joint go-to-market efforts. Supporting defense customer requirements through AWS NatSec partnerships by translating mission requirements into engineering deliverables is also part of your role.

You will design and build edge runtime infrastructure such as model packaging, deployment pipelines, OTA update mechanisms, and telemetry for devices in low- or no-connectivity environments. Deployments must be hardened for security-sensitive environments with features like secure boot chains, encrypted model storage, tamper detection, and audit logging. You will benchmark and validate performance across hardware platforms, establishing test suites for latency, accuracy, power consumption, and resource utilization.

Collaboration with Research and Engine teams to influence model architectures toward edge-friendly designs is expected. Furthermore, you provide technical leadership to cross-functional teams on defense and edge projects, set engineering standards, review designs, and mentor engineers on systems and optimization practices.

$185,000 – $245,000 per year (USD)

United States · Remote

Performance Modeling Lead

New
Top rated
OpenAI
Full-time

Build and own a performance modeling framework and toolchain to evaluate AI systems across multiple levels of abstraction. Analyze and quantify architectural tradeoffs across compute, memory, networking, storage, and system topology. Develop performance models to guide decisions on scale-up vs. scale-out architectures, interconnect and network design, and memory hierarchy and system balance. Translate modeling outputs into clear recommendations for internal teams and external hardware vendors. Influence reference designs and vendor roadmaps through data-driven insights. Partner closely with machine learning, systems, and hardware teams to understand workload characteristics and requirements. Lead and grow a small team of 2–3 engineers, setting technical direction and maintaining high standards for modeling rigor. Continuously improve modeling fidelity by validating against real system behavior and measurements.

$342,000 – $555,000 per year (USD)

San Francisco, United States · Hybrid

AI Performance Simulation Architect

New
Top rated
Tenstorrent
Full-time

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, placement and routing (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and ML-based techniques, in close collaboration with verification, extraction, timing, design for test (DFT), and EDA vendors.

$100,000 – $500,000 per year (USD)

Santa Clara, Austin, or Fort Collins, United States · Hybrid

Regional Sales Lead, Singapore

New
Top rated
Tenstorrent
Full-time

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, placement and routing (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and machine learning-based techniques, working closely with internal teams such as verification, extraction, timing, Design for Test (DFT), and electronic design automation (EDA) vendors.

$100,000 – $500,000 per year (USD)

Santa Clara, Austin, or Fort Collins, United States · Remote

Customer Support Engineer (GPU Cluster)

New
Top rated
Together AI
Full-time

Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems, including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost.

Design and operate reinforcement learning (RL) and post-training pipelines, such as RLHF, RLAIF, GRPO, DPO-style methods, and reward modeling, where most of the cost is inference, optimizing algorithms and systems jointly. Make RL and post-training workloads more efficient with inference-aware training loops, and use these pipelines to train, evaluate, and iterate on frontier models on the inference stack. Co-design algorithms and infrastructure to couple objectives, rollout collection, and evaluation tightly with efficient inference, and identify bottlenecks across training, inference, data pipelines, and user-facing layers. Run ablation and scale-up experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into model, RL, and system design.

Own critical production-scale systems by profiling, debugging, and optimizing inference and post-training services under real workloads, driving roadmap items that require engine modification such as changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to validate improvements rigorously. Provide technical leadership by setting direction for cross-team efforts at the intersection of inference, RL, and post-training, and mentor other engineers and researchers in full-stack ML systems work and performance engineering.

$200,000 – $280,000 per year (USD)

San Francisco, United States · Onsite

AI/ML Physical Design Flow Engineer

New
Top rated
Tenstorrent
Full-time

Lead and contribute to cross-functional efforts solving complex physical design challenges across IPs, projects, and advanced technology nodes. Develop and enhance RTL-to-GDS methodologies, including floorplanning, synthesis, place and route (P&R), static timing analysis (STA), signoff, and assembly. Architect and deploy AI/ML-driven solutions in production physical design flows to improve engineering efficiency, turnaround time, and quality of results (QoR). Optimize EDA tools and custom CAD flows using data-driven and machine learning-based techniques, collaborating closely with verification, extraction, timing, design for test (DFT), and electronic design automation (EDA) vendors.

$100,000 – $500,000 per year (USD)

Austin, Fort Collins, or Santa Clara, United States · Hybrid

Want to see more AI Infrastructure Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI Infrastructure Engineer jobs?

What does an AI Infrastructure Engineer do?

AI Infrastructure Engineers design and build the systems that power machine learning workloads. They optimize performance by resolving bottlenecks, implement scaling solutions through load balancing and redundancy, and deploy cloud infrastructure for AI applications. These specialists build fault-tolerant systems for serving large language models, maintain continuous integration pipelines, and collaborate with AI teams to translate research needs into production-ready infrastructure.

What skills are required for an AI Infrastructure Engineer?

Key skills include proficiency with cloud platforms (AWS SageMaker, Azure ML, Vertex AI), infrastructure-as-code tools like Terraform, and containerization technologies such as Docker and Kubernetes. Strong programming ability in Python, Go, or C++ is essential, with CUDA knowledge for GPU optimization. Experience with monitoring tools (Prometheus, Grafana), distributed systems, deep learning frameworks, and Linux/UNIX environments is highly valued in candidates.

What qualifications are needed for an AI Infrastructure Engineer role?

Employers typically require a bachelor's degree in Computer Science, AI, Machine Learning, or a related technical field. Most positions demand 4+ years of experience in cloud infrastructure, large-scale systems, or software engineering with an infrastructure focus. Practical expertise in cloud computing, Linux administration, network architecture, and container technologies is essential. Specialized knowledge of GPU programming, distributed systems, and LLM serving considerably strengthens an application.

What is the salary range for an AI Infrastructure Engineer job?

Compensation varies based on location, experience level, company size, and the specific technical skills required; the roles listed on this page post ranges from roughly $100,000 to $555,000 per year. Because this role combines specialized AI knowledge with infrastructure expertise, salaries generally reflect the high demand for professionals who can build and optimize systems for machine learning workloads at scale.

How long does it take to get hired as an AI Infrastructure Engineer?

Hiring timelines vary by company, and the process often includes technical assessments of cloud architecture knowledge, infrastructure-as-code experience, and machine learning operations skills. Given the specialized nature of AI infrastructure roles and their typical requirement of 4+ years of relevant experience, candidates should expect a thorough evaluation of their technical capabilities and problem-solving abilities.

Are AI Infrastructure Engineer jobs in demand?

Yes, AI Infrastructure Engineer positions show strong demand. Major companies like Accenture, Scale AI, and Zoom are actively recruiting for these specialized roles, and the increasing deployment of large language models and AI applications across industries creates a consistent need for professionals who can build optimized infrastructure. The intersection of cloud platforms, containerization, GPU optimization, and machine learning operations makes qualified candidates particularly valuable in today's job market.