Whisper is a general-purpose speech recognition model.
It is trained on a large dataset of diverse audio and is a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
Whisper authors: Alec Radford, Jong Wook Kim
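
The tasks described above can be selected from Python. The following is a minimal sketch, assuming the `openai-whisper` package is installed and `ffmpeg` is available. The helper name `run_whisper` and its default arguments are hypothetical and chosen for illustration; `whisper.load_model` and `model.transcribe` are the package's documented entry points.

```python
def run_whisper(audio_path, task="transcribe", model_name="base"):
    """Run one Whisper task on an audio file (illustrative helper, not part of the library).

    task="transcribe" returns text in the spoken language;
    task="translate" returns an English translation.
    The detected language is reported in both cases.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")

    # Imported lazily so the helper can be defined without the package installed.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    # With no explicit language, Whisper identifies it from the start of the audio.
    result = model.transcribe(audio_path, task=task)
    return {"language": result["language"], "text": result["text"]}
```

For example, `run_whisper("speech.mp3", task="translate")` would return the detected language code together with an English rendering of the speech, where `speech.mp3` stands in for any local audio file.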