GLM-TTS: The 3-Second Voice Cloner That Outperforms Commercial Systems (But Has a Contraction Problem)
Z.ai’s GLM-TTS combines zero-shot voice cloning, multi-reward RL emotion control, and bilingual streaming synthesis. The open-source model beats closed alternatives on accuracy, but real-world testing reveals sharp edges, including trouble with contractions.