
Question: can anyone recommend a TTS that automatically recognizes emotion from the text itself?


Gradium (https://gradium.ai/), a commercial offshoot of Kyutai (an open-source lab), is focusing on emotion (both recognising emotion and understanding which emotion to use depending on context). I don't think any of their existing public models does that yet, but they demoed it pretty impressively at the ai-Pulse conference.


Chatterbox does something like that. For example, if the input is

"so and so," he <verb>

and the verb is not just "said", but "chuckled", or "whispered", or "said shakily", the output is modified accordingly, or if there's an indication that it's a woman speaking it may pitch up during the quotation. It also tries to guess emotive content from textual content, so if a passage reads as angry it may try to make it sound angry. That's more hit-and-miss, but when it hits, it hits really well. A very common failure case: imagine someone is trying to psych themselves up and says internally "come on, Steve, stand up and keep going"; it'll read it in a deeper voice, as if it were being spoken by a WW2 sergeant to a soldier.


Thank you!



