X (Twitter)

Rahul (@sairahul1)

MICROSOFT OPEN SOURCED A 7B PARAMETER MODEL THAT TRANSCRIBES 60 MINUTES OF AUDIO IN A SINGLE PASS

and it's completely free

VIBEVOICE ASR

no chunking, no context loss, full speaker diarization baked in

not just speech to text..not a basic wrapper

who spoke, when they spoke, exactly what they said..all in one shot

and it handles the hard stuff too..50+ languages, custom hotwords, long form audio that breaks every other tool

the model doesn't know what "context window" means apparently

Av...

