Hello, Gabriel from Kyutai here. Maybe it's related to the way we chunk the text? Can you post an issue on GitHub with the exact text and voice? I'll take a look.
Gabriel from Kyutai here. We do support outputting wav to stdout. We don't support reading text from stdin, but that should be easy enough to add. Feel free to drop a pull request!