Looking for experts on Voice AI over telephony #153618
Replies: 2 comments
-
I just rename the default branch to main in each new repo right after creating it. It’s a bit annoying, but only takes a second and works fine for now.
-
Interesting problem space. If you want a conversational Voice AI that can originate/receive calls over SIP trunks (or a VoIP API) without Twilio, you basically need four layers to play nicely together: telephony and call control, media I/O, a conversation engine, and glue/ops/compliance.
Below is a practical stack/architecture that is battle-tested and vendor-agnostic. I'll give you two variants: a minimal PoC and a production-ready baseline.
0) SIP Trunk Providers (US)
If you're not using Twilio, common US options with good SIP support:
1) Telephony & Call Control
Option A - Self-hosted PBX/B2BUA (flexible, OSS)
Why: You own the signaling and can script call flows (IVRs, transfers, conferencing), inject media, and fork streams to your AI.
Option B - Hosted programmable telephony (less ops)
2) Media I/O (the "real-time" bit)
You want bidirectional, low-latency audio between the PSTN call and your AI:
Implementation choices:
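To make the media side a bit more concrete, here is a rough sketch of what the bridge has to do with call audio before an AI service can use it. It assumes the trunk negotiates G.711 mu-law at 8 kHz with 20 ms packets, which is common on PSTN-facing trunks but is an assumption here, so check what your provider actually offers. The function names (`ulaw_decode`, `ulaw_encode`, `split_frames`) are illustrative and not taken from any particular library.

```python
# Sketch: G.711 mu-law <-> 16-bit linear PCM, plus 20 ms framing.
# Assumption: 8 kHz mu-law audio, 1 byte per sample, 20 ms per packet.

SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
FRAME_BYTES = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 bytes per frame

BIAS = 0x84   # standard mu-law bias (132)
CLIP = 32635  # largest magnitude that fits after adding the bias


def ulaw_decode(byte: int) -> int:
    """Convert one mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                  # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def ulaw_encode(sample: int) -> int:
    """Convert one signed 16-bit PCM sample to a mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 14 down to 7.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def split_frames(payload: bytes) -> list[bytes]:
    """Cut a mu-law byte stream into whole 20 ms frames, dropping any remainder."""
    last_start = len(payload) - FRAME_BYTES
    return [payload[i:i + FRAME_BYTES] for i in range(0, last_start + 1, FRAME_BYTES)]


def decode_frame(frame: bytes) -> list[int]:
    """Decode one mu-law frame into PCM16 samples for a streaming ASR."""
    return [ulaw_decode(b) for b in frame]
```

In a real bridge you would decode inbound frames before handing them to the ASR and encode TTS output on the way back to the caller. Many ASR/TTS services expect 16 kHz or higher, so a resampling step usually sits next to this; it is left out here.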
3) Conversation Engine
ASR (streaming)
NLU / Orchestrator
TTS (fast + natural)
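The three pieces above (streaming ASR, an orchestrator, TTS) chain together as a per-turn loop. The sketch below shows only the shape of that loop with `asyncio`. Every component in it (`fake_asr`, `fake_orchestrator`, `fake_tts`, `caller_audio`) is a hypothetical stand-in I made up for illustration, not a real vendor API, and it ignores barge-in and endpointing.

```python
import asyncio
from typing import AsyncIterator


async def caller_audio(n_frames: int) -> AsyncIterator[bytes]:
    """Hypothetical inbound audio: n_frames of 160-byte silence."""
    for _ in range(n_frames):
        yield b"\xff" * 160


async def fake_asr(frames: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stand-in streaming ASR: emits one final transcript per 3 frames."""
    buffered = 0
    async for _ in frames:
        buffered += 1
        if buffered == 3:
            yield "hello"
            buffered = 0


async def fake_orchestrator(transcript: str) -> str:
    """Stand-in NLU/orchestrator: decides what to say next."""
    return f"you said: {transcript}"


async def fake_tts(text: str) -> AsyncIterator[bytes]:
    """Stand-in streaming TTS: yields one 'audio' chunk per word."""
    for word in text.split():
        yield word.encode()


async def run_turns(frames, asr, respond, tts) -> list[bytes]:
    """For each final transcript, get a reply and stream its audio out."""
    outbound: list[bytes] = []
    async for transcript in asr(frames):
        reply = await respond(transcript)
        async for chunk in tts(reply):
            outbound.append(chunk)  # in a real bridge: send to the call leg
    return outbound


def demo() -> list[bytes]:
    return asyncio.run(
        run_turns(caller_audio(6), fake_asr, fake_orchestrator, fake_tts)
    )
```

Keeping each stage behind a plain async callable like this is one way to swap ASR or TTS vendors later without touching the turn loop.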
4) Glue, Ops, Compliance
Minimal PoC Stack (works in a weekend)
Call flow sketch:
Production-Ready Baseline
Concrete Tech Choices
Checklist of what to avoid
"What stack would you recommend?"
Happy to sketch sample code for the media bridge (GStreamer or
-
Body
I am working on building a conversational Voice AI that can use SIP trunking or a VoIP API to initiate calls (not using Twilio, as it will not operate in the US; instead using a virtual PBX or, as mentioned, direct SIP trunking to phone numbers assigned by local telephony providers). What kind of tech stack would you recommend?