6 votes

Whispers of AI’s modular future

2 comments

  1. skybrian
    From the article:

    Despite being one of the more sophisticated programs ever to run on my laptop, Whisper.cpp is also one of the simplest. […] It was written in five days by Georgi Gerganov, a Bulgarian programmer who, by his own admission, knows next to nothing about speech recognition. Gerganov adapted it from a program called Whisper, released in September by OpenAI, the same organization behind ChatGPT and DALL-E. Whisper transcribes speech in more than ninety languages. In some of them, the software is capable of superhuman performance—that is, it can actually parse what somebody’s saying better than a human can.

    […]

    A friend of mine, a filmmaker and software developer, has written a thin wrapper around the tool that transcribes all of the audio and video files in a documentary project to make it easier for him to find excerpts from interviews. Others have built programs that transcribe Twitch streams and YouTube videos, or that work as private voice assistants on their phones. A group of coders is trying to teach the tool to annotate who’s speaking. Gerganov, who developed Whisper.cpp, has recently made a Web-based version, so that users don’t have to download anything.

    4 votes
  2. FlippantGod
    Oh hey I was working with that a while back. I can attest to good code quality.

    2 votes