Cartesia
Cartesia sells a real-time generative voice API to developers building voice agents. The API is powered by state-space models (SSMs) that deliver sub-50ms latency and can run on-device without a cloud connection.
Evidence notes
Who uses it: Developers and enterprises integrating voice into applications such as customer service agents, debt collection, interview screening, voiceovers, and interactive characters; named enterprise customers include Quora, Cresta, Rasa, EliseAI, Replicant, Retell... [1]
Product: Developers call the Sonic API (or run the model on-device); Sonic streams the first audio byte in under 45ms, supports advanced controllability (emotion, speed, voice prompting), and can be fine-tuned to custom voices.
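The "first audio byte" figure above is a time-to-first-chunk measurement on a streaming response. As a rough illustration of how a developer might check such a number, here is a minimal Python sketch. It does not call the Sonic API: `fake_audio_stream` is a hypothetical stand-in for any streaming text-to-speech response, and `time_to_first_chunk`, the 10 ms delay, and the 320-byte chunk size are assumptions made for the example.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_chunk(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (seconds until the first non-empty chunk arrived, that chunk)."""
    start = time.perf_counter()
    for chunk in stream:
        if chunk:  # ignore empty keep-alive chunks
            return time.perf_counter() - start, chunk
    raise ValueError("stream ended before any audio arrived")


def fake_audio_stream(delay_s: float = 0.01, chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response; NOT the Sonic API."""
    for _ in range(chunks):
        time.sleep(delay_s)   # simulated network + synthesis delay (assumed)
        yield b"\x00" * 320   # placeholder audio bytes (assumed chunk size)


latency_s, first = time_to_first_chunk(fake_audio_stream())
print(f"first chunk after {latency_s * 1000:.1f} ms ({len(first)} bytes)")
```

In practice the clock would start when the request is sent, and the result would include network round-trip time, so a client-side measurement may differ from a vendor-reported model latency.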