Full-stack Developer (Full-Time/Intern) - Shanghai
As a Full-Stack Engineer at Flowith, you will independently or collaboratively lead the full-stack development of Flowith's core modules, crossing front-end and back-end boundaries to deliver highly available, scalable system code. You will deeply integrate advanced AI algorithms and complex models into the product flow to create intelligent interactive experiences; work closely with product managers, designers, and AI engineers in a creative environment to implement innovative AI concepts; and automate deployments and manage continuous integration on mainstream cloud infrastructure while monitoring and optimizing system performance and resource usage. Additionally, you will participate in the evolution of the core architecture, conduct in-depth code reviews, and help build up shared technical components and best practices to raise the team's engineering standards.
Full Stack Software Engineer - OpenAI for Finance
The responsibilities include owning the end-to-end development lifecycle for new enterprise products, collaborating closely with product, design, and external customers to understand problems and implement effective solutions, and working with the research team to improve the next generation of models.
Senior Product Designer, Mobile
Own the observability and lifecycle management of AI features across the organization. Build tools and infrastructure to enable teams to develop, monitor, and optimize LLM-powered features. Design and implement closed-loop evaluation pipelines that automatically validate prompt changes. Develop comprehensive metrics and dashboards to track LLM usage, including cost per feature, token patterns, and latency. Create systems that tie user feedback to specific prompts and LLM calls. Establish best practices and processes for the full lifecycle of prompts, including development, testing, deployment, and monitoring. Collaborate with engineering teams across the organization to ensure they have the tools and visibility needed to build high-quality AI features.
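The cost-per-feature and token-pattern tracking described above can be made concrete with a small sketch. A minimal, self-contained example (all class names, field names, and prices are illustrative assumptions, not any provider's actual API or pricing):

```python
from dataclasses import dataclass, field

# Hypothetical per-call usage record; field names are illustrative.
@dataclass
class LLMCall:
    feature: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float

@dataclass
class UsageTracker:
    # Assumed flat per-1k-token prices; real pricing varies by model.
    prompt_price_per_1k: float = 0.01
    completion_price_per_1k: float = 0.03
    calls: list = field(default_factory=list)

    def record(self, call: LLMCall) -> None:
        self.calls.append(call)

    def cost_per_feature(self) -> dict:
        """Aggregate estimated spend per product feature."""
        totals = {}
        for c in self.calls:
            cost = (c.prompt_tokens * self.prompt_price_per_1k
                    + c.completion_tokens * self.completion_price_per_1k) / 1000
            totals[c.feature] = totals.get(c.feature, 0.0) + cost
        return totals

tracker = UsageTracker()
tracker.record(LLMCall("summarize", prompt_tokens=800, completion_tokens=200, latency_ms=950.0))
tracker.record(LLMCall("summarize", prompt_tokens=400, completion_tokens=100, latency_ms=480.0))
tracker.record(LLMCall("search", prompt_tokens=1000, completion_tokens=50, latency_ms=300.0))
print(tracker.cost_per_feature())
```

A production system would persist these records and join them against prompt versions and user-feedback events, as the role description outlines.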
Software Engineer, Inference Platform
Own inference deployments end-to-end, including initial configuration, performance tuning, production SLA maintenance, and incident response. Drive measurable improvements in throughput, time-to-first-token (TTFT), and cost-per-token across diverse model families and customer workload patterns. Build and operate KV cache and scheduling infrastructure to maximize utilization across concurrent requests. Implement and validate disaggregated prefill/decode pipelines and the Kubernetes-based orchestration that supports them at scale. Profile and resolve bottlenecks at the compute, memory, and communication layers, and instrument deployments for end-to-end observability. Partner with customers to translate model architectures, access patterns, and latency requirements into deployment configurations and platform improvements. Contribute to the inference platform architecture and roadmap, focusing on reducing deployment complexity, improving hardware utilization, and expanding support for new model classes and accelerators. Participate in an on-call rotation to maintain production reliability and SLA commitments.
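The two headline metrics above, TTFT and decode throughput, fall out of three timestamps per request. A minimal sketch (timestamps here are synthetic; a real harness would take them from the serving stack's instrumentation):

```python
def ttft_and_throughput(request_start: float, first_token_at: float,
                        last_token_at: float, tokens_generated: int):
    """Compute time-to-first-token (seconds) and decode throughput (tokens/sec).

    TTFT covers queueing plus prefill; throughput covers only the decode
    phase, which is why the two are tuned (and reported) separately.
    """
    ttft = first_token_at - request_start
    decode_time = last_token_at - first_token_at
    tps = tokens_generated / decode_time if decode_time > 0 else float("inf")
    return ttft, tps

# Synthetic request: arrives at t=0, first token at 0.25 s,
# 128 tokens completed at 2.25 s.
ttft, tps = ttft_and_throughput(0.0, 0.25, 2.25, 128)
print(ttft, tps)  # 0.25 64.0
```

Separating prefill-dominated TTFT from decode throughput is also what motivates the disaggregated prefill/decode pipelines the role mentions: the two phases have different compute and memory profiles and can be scheduled on different hardware.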
Field Events Marketing Manager
Debug and fix issues in the platform and ship pull requests with fixes. Build internal tools and copilots powered by generative AI to enhance the team. Rapidly prototype proof-of-concepts for customer use cases. Collaborate across Engineering, Product, and Solutions teams to unblock customers and advance AI adoption.
Software Engineer, Agent
Design and deliver production-grade AI agents that are highly performant, reliable, and intuitive, central to driving revenue and used in production environments across various industries such as finance, healthcare, and commerce. Have complete ownership and autonomy over the Agent Development Life Cycle (ADLC) from initial pilot through deployment and continuous iteration, including building, tuning, and evolving AI agents while defining ADLC best practices. Partner with large enterprises and startups to understand business challenges and build AI agents that transform operations at scale. Build and evolve Sierra's core platform by surfacing unmet needs, prototyping new tools and features, and collaborating with research, product, and platform teams to shape the future of AI agent development and Sierra's products.
Staff Product Designer, Go Enterprise
Senior Software Engineer, Managed AI - AI Platform
Lead the design and implementation of core AI services including resilient fault-tolerant queues, model catalogs, and scheduling mechanisms optimized for cost and performance. Architect and scale infrastructure capable of handling millions of API requests per second. Implement robust monitoring and alerting to ensure system health and 24/7 availability. Collaborate closely with product management, business strategy, and other engineering teams to define the AI platform roadmap. Influence the long-term vision and architectural decisions of the platform. Contribute to open-source AI frameworks and participate in the AI community. Prototype and iterate on emerging technologies and new features.
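The "resilient fault-tolerant queues" named above typically mean tasks that are retried on transient failure and dead-lettered after a bounded number of attempts. A toy in-memory sketch of that behavior (illustrative only; a production system would use durable storage and distributed workers):

```python
import collections

class RetryQueue:
    """Sketch of a fault-tolerant task queue: failed tasks are requeued
    until max_attempts, then moved to a dead-letter list."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.pending = collections.deque()
        self.dead_letter = []

    def submit(self, task):
        self.pending.append((task, 0))

    def run(self, handler):
        """Drain the queue, retrying transient failures."""
        done = []
        while self.pending:
            task, attempts = self.pending.popleft()
            try:
                done.append(handler(task))
            except Exception:
                attempts += 1
                if attempts < self.max_attempts:
                    self.pending.append((task, attempts))  # retry later
                else:
                    self.dead_letter.append(task)  # give up, keep for inspection
        return done

# A handler that fails the first two times it sees task "b".
failures = {"b": 2}
def flaky(task):
    if failures.get(task, 0) > 0:
        failures[task] -= 1
        raise RuntimeError("transient failure")
    return task.upper()

q = RetryQueue(max_attempts=3)
for t in ["a", "b", "c"]:
    q.submit(t)
results = q.run(flaky)
print(results, q.dead_letter)  # ['A', 'C', 'B'] []
```

Requeueing at the tail rather than retrying inline is a deliberate choice: it lets healthy tasks make progress instead of blocking behind a flapping one.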
Engineering Manager, Managed AI
As an Engineering Manager on the Managed AI team at Crusoe, you will lead and scale a team of engineers building next-generation platform infrastructure for Large Language Models (LLMs). Responsibilities include guiding the team through the design and implementation of highly scalable, fault-tolerant infrastructure; leading a team of software engineers; defining and executing the AI roadmap; cultivating a high-performance engineering culture; overseeing architecture and development of core AI services such as fault-tolerant task queues and model management systems; ensuring delivery of scalable systems capable of handling millions of API requests per second; delivering an AI platform capable of handling varied AI loads from training to agentic execution infrastructure; working cross-functionally with product, infrastructure, and GTM stakeholders; representing engineering in strategic discussions; promoting knowledge sharing, mentorship, and evolving engineering processes. This role requires in-office presence in San Francisco or Sunnyvale, CA.
Senior Staff Software Engineer, Model LifeCycle
The Senior Staff Software Engineer on the Model LifeCycle team at Crusoe is responsible for building a comprehensive managed platform for the entire application development lifecycle, with a focus on Machine Learning models including Large Language Models (LLMs). Responsibilities include managing fine-tuning systems for large foundation models, covering methods such as SFT, PEFT, LoRA, and adapters, with multi-node orchestration, checkpointing, failure recovery, and cost-efficient scaling. They implement and maintain end-to-end training pipelines for LLMs, as well as distillation and reinforcement learning pipelines including preference optimization, policy optimization, and reward modeling, and they manage agent execution infrastructure. They also handle dataset, model, and experiment management, including versioning, lineage, evaluation, and reproducible fine-tuning at scale. Additionally, they work closely with product, business, and platform teams to shape core abstractions and APIs, influence architectural decisions around training runtimes, scheduling, storage, and model lifecycle management, contribute to and engage with the open-source LLM ecosystem, and take ownership in designing and building core systems from first principles.
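Of the fine-tuning methods listed above, LoRA is the easiest to state precisely: instead of updating a full weight matrix W, it trains a low-rank pair (B, A) and merges the scaled product back in, W' = W + (alpha / r) * B @ A. A toy sketch of the merge step (all shapes and values illustrative, pure Python to stay self-contained):

```python
def matmul(X, Y):
    """Naive matrix multiply, sufficient for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha, r):
    """Merge a LoRA adapter into a base weight: W' = W + (alpha / r) * B @ A.

    B is d x r and A is r x k, so their product matches W's d x k shape
    while the adapter itself stores only r * (d + k) parameters.
    """
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy 2x2 base weight with a rank-1 adapter (values are made up).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]        # d x r = 2 x 1
A = [[0.5, 0.5]]          # r x k = 1 x 2
merged = lora_merge(W, A, B, alpha=2.0, r=1)
print(merged)  # [[2.0, 1.0], [2.0, 3.0]]
```

The low-rank factorization is what makes the checkpointing and multi-node orchestration in this role tractable: adapters are small enough to version, store, and hot-swap independently of the frozen base model.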
