Meng Li

Overall, Moshi's advantages are clear: compared to other voice dialogue bots, it feels more human-like, responding quickly and with rich expressiveness.

However, unlike GPT-4o, Moshi cannot handle multiple languages.

Currently, Moshi's core generation capabilities are not as strong as those of Llama 3 8B, but it could potentially be paired with RAG or fine-tuned for specific tasks.

In summary, Moshi has shown me the true potential of natural communication between AI and humans.

Supporting more voices and languages may just be a matter of time, and its potential as a coach or companion, or in various role-playing applications, is very exciting.
