About Your Role:
At any given moment, businesses running on Aircall are fielding calls their teams can't handle alone: outside business hours, during peak volume, across languages, around the clock. Our AI Voice Agent is what answers. It's not a simple IVR or a canned-response bot. It's a real-time AI voice agent that understands context, talks to customers naturally, executes actions mid-conversation, and knows when to hand off to a human.
As a Software Engineer on this team, you'd work at the intersection of real-time systems, production LLM applications, and voice infrastructure. The problems are hard in a specific way: milliseconds matter when someone's on the line, latency shapes whether a conversation feels natural or robotic, and the quality of your work is immediately audible to thousands of customers.
This isn't AI prototyping. It's production AI at scale, and one of the fastest-moving product areas at Aircall.
What you'll work on:
- Real-time Speech Pipeline: You'll work on live audio systems: buffering, streaming, latency optimization, and the integration points with speech providers. When latency creeps up, you investigate and fix it. When a new provider looks promising, you evaluate and potentially integrate it.
- Conversation Intelligence: Conversation quality depends on how well we manage the LLM layer: prompt construction, context management, function calling, and instruction ordering. You'll build and improve the systems that make conversations feel natural: dynamic knowledge injection, action execution with branching logic, and the dialogue management that turns a phone call into something useful.
- Actions and Integrations: Businesses configure AI Voice Agent to execute tasks during calls, such as looking up account data, creating tickets, and checking order status. You'll work on the action framework: configurable API calls, success/failure branching, auth management, and the runtime execution engine.
- Knowledge and Memory: A voice agent is only as good as what it knows. You'll work on how agents ingest, store, and retrieve knowledge at runtime. There's also the question of memory: how agents retain information across a conversation, learn from prior interactions, and use context to give better answers over time.
- Agent Lifecycle and Configuration: Customers need to create, configure, test, and deploy voice agents easily and intuitively. You'll work with our designers to translate cutting-edge AI concepts and features into smooth, reliable, and easy-to-use product experiences.
- Evaluations and Quality: Voice AI quality is measurable. You'll help build evaluation frameworks for each model involved, post-call analytics, call quality metrics, and Datadog instrumentation. You'll also be part of the on-call rotation.
What you'll bring:
- You have experience building production backend systems, ideally in Python or TypeScript (all levels welcome, 5+ years for senior roles).
- You write async code naturally, understand how to design for performance and reliability, and have shipped systems that real users depend on.
- You understand what makes real-time systems different from request/response APIs. You can talk about the tradeoffs around buffering, streaming, connection management, and latency.
- You've worked with LLMs or AI/ML systems in production. You understand concepts like prompt engineering, context management, function calling, and what it takes to make AI behavior reliable for customers.
- You know evaluation matters and you've built or used tooling around it.
- Event-driven architectures, message queues, caching layers, and async task processing feel natural to you.
- You've debugged distributed failures and designed systems that handle partial failures gracefully.
- Testing strategy, observability, API design, and code review are things you do thoughtfully. You write code that other engineers can understand and extend.
- You communicate clearly. You can explain a complex technical problem to someone without your context, write a design doc that prevents misalignment, and ask good questions before diving into implementation.
- You are familiar with AI coding tools and have already embedded them into your day-to-day engineering processes.
Nice to have:
- Experience with voice, audio, or telephony systems
- Prior work with STT or TTS providers and models
- Familiarity with LLM evaluation frameworks and quality metrics for AI outputs
- AWS experience, particularly DynamoDB, Lambda, AppSync, or SQS
- GraphQL API design experience
- Kubernetes and infrastructure as code (Terraform, Helm)




