What happens when someone who has been running ViciDial clusters since 2008 rethinks an AI telephony architecture from the ground up? You don't get another SaaS product. You get a Call Control Layer that will fundamentally change the future of work in call centers.
We deployed LiveKit and AI voice agents in production call centers in early 2024, months before OpenAI and LiveKit announced their "Advanced Voice" partnership in October 2024. We did not just deploy earlier what OpenAI markets as groundbreaking today; we advanced it with our self-developed Call Control Layer. Warm transfer, multi-tenant isolation, and 7-emotion TTS existed in this combination nowhere else in the world, not even at OpenAI.
GoFonIA is not a startup that just "discovered" AI telephony as a trend. It is the result of 17 years of operational call center experience — compressed into an architecture that knows everything a call center needs before the first call comes in.
Setup and operation of production ViciDial installations. Predictive dialing, agent scripting, campaign management — the full spectrum of classic call center technology on an open-source foundation.
Operation of multi-carrier setups with SIP trunking via Telekom, Plusnet, dus.net, voip2gsm. Asterisk tuning at the kernel level. Development of proprietary monitoring and reporting tools. The knowledge of where call center technology hits its limits — and why.
First experiments with language models in telephony contexts. The realization: no existing framework can do what a real call center needs. Neither the US cloud providers nor the European alternatives.
Start of in-house development. The core question: How do you build a telephony controller that doesn't just "answer calls and respond", but maps the complete logic of a call center — including warm transfer, hold queues, tenant isolation, and SIP orchestration?
Completion of the single-room architecture with 5-phase transfer logic, 7-emotion TTS, multi-tenant DID routing, and watchdog engine. Production deployment with first customers.
The heart of GoFonIA is not an AI model. It is a self-developed control layer that operates between the telephone network and the AI — orchestrating the entire call logic. This layer exists nowhere else in the DACH region.
Because standard telephony frameworks were not built to keep four participants in one room and switch audio streams between them in real time. Because conventional AI telephony only knows "call → answer" — but not "agent introduces, colleague listens, music plays, caller waits, everyone in the same room". Because tenant isolation at the DID level, runtime tool registry, and SIP orchestration with fallback strategies are not provided in any SaaS toolkit in the world.
The Control Layer operates on five levels simultaneously:
DID-based tenant recognition during SIP handshake. Dynamic participant creation via outbound SIP. BYE management at connection end.
Single-room architecture: all participants in the same room. Subscription matrix controls who hears whom. Phase-controlled audio switching without connection drop.
5-phase state machine with timeouts and fallbacks. Tenant-specific MOH (music on hold, 8 GB royalty-free library). The agent briefs the target colleague before the handover. If the colleague is unavailable, the agent returns to the caller.
Per tenant: own API keys, prompts, voices, emotion matrix, knowledge base, tools, SIP credentials. Redis-based session engine. Zero cross-tenant leakage.
Autonomous monitoring of all active rooms. Timeout detection, room deletion via API, Redis state cleanup, email transcript delivery. Fully automated in < 2 seconds.
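To make the single-room idea more concrete, here is a minimal Python sketch of a phase-driven subscription matrix with timeout fallbacks. It is an illustration under stated assumptions, not GoFonIA's actual implementation: the phase names, the event names, and the participant labels (`caller`, `agent`, `moh`, `colleague`) are hypothetical, since the text only states that there are five phases and that a subscription matrix controls who hears whom.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names; the text only says there are five phases.
    TALK = "talk"          # caller and AI agent talk to each other
    HOLD = "hold"          # caller hears music while the colleague is dialed
    BRIEFING = "briefing"  # agent briefs the colleague, caller stays on hold
    BRIDGE = "bridge"      # caller and colleague talk to each other
    RETURN = "return"      # colleague unavailable, agent is back with the caller


# Subscription matrix per phase: listener -> set of audio sources it hears.
SUBSCRIPTIONS = {
    Phase.TALK: {"caller": {"agent"}, "agent": {"caller"}},
    Phase.HOLD: {"caller": {"moh"}},
    Phase.BRIEFING: {
        "caller": {"moh"},
        "colleague": {"agent"},
        "agent": {"colleague"},
    },
    Phase.BRIDGE: {"caller": {"colleague"}, "colleague": {"caller"}},
    Phase.RETURN: {"caller": {"agent"}, "agent": {"caller"}},
}

# Allowed transitions: (current phase, event) -> next phase.
# Timeouts fall back to RETURN instead of dropping the call.
TRANSITIONS = {
    (Phase.TALK, "transfer_requested"): Phase.HOLD,
    (Phase.HOLD, "colleague_answered"): Phase.BRIEFING,
    (Phase.HOLD, "timeout"): Phase.RETURN,
    (Phase.BRIEFING, "briefing_done"): Phase.BRIDGE,
    (Phase.BRIEFING, "timeout"): Phase.RETURN,
    (Phase.RETURN, "resumed"): Phase.TALK,
}


def next_phase(phase, event):
    """Return the follow-up phase; unknown events leave the phase unchanged."""
    return TRANSITIONS.get((phase, event), phase)


def hears(phase, listener):
    """Return the set of audio sources the listener is subscribed to."""
    return SUBSCRIPTIONS[phase].get(listener, set())
```

Because every participant stays in the same room, a phase change in this sketch only swaps the subscription sets. No connection is torn down or re-established, which is the property the single-room design relies on.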
Every call passes through a decision chain in milliseconds, operating on seven independent levels in parallel:
SIP participant in the room
LLM + 7 emotion voices
WebRTC hold queue
Outbound SIP in the room
GoFonIA doesn't use generic TTS. Each tenant receives an emotion matrix of seven voice profiles that can be calibrated per tenant, per campaign, and per call type. The voice doesn't just react semantically; it reacts paraverbally.
Baseline. Factual, information-dense. For status queries and fact communication.
Warm, approachable, open. For greetings, small talk, service calls.
Understanding, patient, de-escalating. For complaints and sensitive topics.
Formal, precise, politely distanced. For B2B, banking, insurance, government context.
Driving, solution-oriented, energetic. For sales calls and conversion-oriented campaigns.
Quiet, deep, trust-building. For first-level support, hold queues, technical hotlines.
Clear, direct, boundary-setting. For collections, compliance checks, escalation.
The emotion matrix operates on two paraverbal axes: speech rate (0.6×–2.4×) and voice timbre (frequency shift ±18%). On top of that comes context-dependent pause logic: the agent knows when silence is a more powerful tool than speech.
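The two axes can be pictured as a small, bounded parameter set per voice profile. The following Python sketch assumes a hypothetical `VoiceProfile` structure; only the two ranges (0.6× to 2.4× speech rate, ±18% frequency shift) come from the text, while the field names, the profile name, and the example values are illustrative.

```python
from dataclasses import dataclass

# Ranges taken from the text above.
RATE_MIN, RATE_MAX = 0.6, 2.4   # speech rate multiplier
SHIFT_MAX_PCT = 18.0            # frequency shift in percent, symmetric


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class VoiceProfile:
    # Hypothetical structure: field names are assumptions for illustration.
    name: str
    speech_rate: float
    freq_shift_pct: float

    def calibrated(self):
        """Return a copy with both axes forced into the allowed ranges."""
        return VoiceProfile(
            name=self.name,
            speech_rate=clamp(self.speech_rate, RATE_MIN, RATE_MAX),
            freq_shift_pct=clamp(self.freq_shift_pct, -SHIFT_MAX_PCT, SHIFT_MAX_PCT),
        )


# Example with made-up values outside the allowed ranges.
profile = VoiceProfile("empathetic", speech_rate=0.5, freq_shift_pct=-25.0).calibrated()
```

In this example, `profile.speech_rate` ends up at 0.6 and `profile.freq_shift_pct` at -18.0, so a misconfigured tenant profile can never leave the calibrated range.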
No AWS. No Google Cloud. No Azure. GoFonIA runs on dedicated Hetzner root servers in Frankfurt am Main and Nuremberg — virtualized via Proxmox, orchestrated in isolated LXC containers.
Per tenant: isolated LXC container with its own Redis store, own API keys, own prompt versions, and own SIP registration. No shared memory. No cross-tenant data flow. Each container is a self-contained telephony system that can be individually backed up, migrated, and scaled.
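As a rough illustration of how a dialed number (DID) could select the tenant before any tenant-specific resource is touched, consider the Python sketch below. Everything in it is an assumption made for illustration: the `Tenant` fields, the placeholder phone numbers, and the key format are hypothetical, and an in-memory dictionary stands in for whatever lookup the production system uses. With one Redis store per container, a tenant prefix in the session key is an additional safeguard rather than a requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    # Hypothetical fields; a real tenant would also carry prompts, API keys, etc.
    tenant_id: str
    dids: frozenset
    voice: str


# Placeholder tenants with obviously fake numbers.
TENANTS = [
    Tenant("tenant-a", frozenset({"+49000000001"}), "voice-a"),
    Tenant("tenant-b", frozenset({"+49000000002"}), "voice-b"),
]

# Reverse index: dialed number -> tenant.
DID_INDEX = {did: tenant for tenant in TENANTS for did in tenant.dids}


def tenant_for_did(did):
    """Resolve the tenant for an incoming call, or None for unknown numbers."""
    return DID_INDEX.get(did)


def session_key(tenant, call_id):
    """Build a session key that always carries the tenant prefix."""
    return f"{tenant.tenant_id}:session:{call_id}"
```

A call to an unknown number resolves to `None` in this sketch, so it can be rejected before any tenant configuration is loaded.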
For sensitive sectors we use exclusively European AI providers: Mistral AI (Paris) and Infomaniak (Switzerland) for LLM, TTS and STT. On request, the stack runs fully on-premise: LLM, TTS, STT and Call Control Layer all run on your own hardware. Designed for banks, insurance companies, government agencies, law firms and healthcare. Operation is fully GDPR-compliant on Hetzner Online GmbH servers with a data processing agreement under Art. 28 GDPR.
Every byte GoFonIA processes stays on servers of Hetzner Online GmbH in Frankfurt am Main and Nuremberg. The AI providers used, Mistral AI (Paris) and Infomaniak (Switzerland), have their legal domicile in Europe (France and Switzerland). US-free by default. For sensitive industries (banking, insurance, government, legal, healthcare) we deliver the entire stack on-premise on customer hardware.
No US Cloud Act. No data transfer outside Europe. No silent data leakage. §203 StGB compliant. GDPR-audited. Unique in the DACH region.
The GoFonIA Call Control Layer is not designed to replace human agents. It is designed to redefine the division of labor between humans and machines.
Routine calls — appointment scheduling, status queries, simple FAQs — are handled completely autonomously by the AI agent. Complex cases — complaints, negotiations, consulting — are handed over to a human colleague with full context and structured briefing. The colleague takes over the call with zero ramp-up time, because the agent has already clarified and documented everything.
The result: call centers don't get smaller. They get better. Repetitive work disappears. Demanding work remains — and is relieved by perfect preparation. This is not automation. This is augmentation.
We don't build AI that replaces people. We build a control layer that orchestrates 17 years of call center knowledge in real time — so that the people in the call center can finally do what they're actually there for: solve complex problems. Not fill out forms.