We believe in the magic of combining technology and storytelling to create rich characters and delightful experiences.
Try out our preview here:
sesame.com/voicedemo
Lastly...
We could do with fewer screens in our lives.
We're building comfortable, all-day wearable eyewear: the most natural way for a personal companion to see, hear and respond.
Doing this right is tough, but we've made solid strides. I'll be sharing more on this soon.
We will be open-sourcing the contextual TTS base model (without this character's voice fine-tuning).
This will let anyone build voice experiences locally, without external APIs.
This is something I would have loved for previous demos and so am personally passionate about.
The demo you can try runs on our contextual TTS, which conditions on both conversation text and audio to deliver natural voice generation.
Here is a real example of this in action (that you can try), where Maya's delivery starts matching the context after a few lines.
We're focused on making voice feel real, natural and delightful, so it becomes the most intuitive interface for collaborating with AI.
It's not just about words, but about pacing, expressivity and cues. We're working on full end-to-end duplex models to capture these humanlike dynamics.
Excited to share a peek of what I’ve been working on
We at Sesame believe voice is key to unlocking a future where computers are lifelike
Here’s an early preview you can try! 👇
We’ll be open sourcing a model, and yes…
we’re building hardware! 🧵