Real-time speech-to-speech translation

Has anyone used a free, offline, open-source, real-time speech-to-speech translation app on under-powered devices (i.e., older smartphones)? There are a few libraries that purportedly can do or help with local speech-to-speech:

I'm looking for a simple app that can listen for English, translate into Korean (and other languages), then perform speech synthesis on the translation. Although real-time would be great, a short delay would work.

RTranslator is awkward (couldn't get it to perform speech-to-speech using a single phone). 3PO sprouts errors like dandelions and requires an online connection.

Any suggestions?

1 comment

  1. creesch

    The combination of things you are asking for is a challenge, especially combining offline use with underpowered devices. The under-powered device bit in particular makes it a next-to-impossible ask, I think. For accurate speech-to-text and accurate translation you likely need to run two models: a good speech-to-text model that supports both languages, and an LLM trained reasonably well in both languages, ideally one trained to do translation. There are other options, but the result will likely be much more crude.

    As far as RTranslator goes, I did a quick Google search myself and came across their repo. The online remark makes me think you used v1.0; v2.0 does everything on your phone. That's according to their GitHub, anyway, but they also make it clear there are minimum hardware requirements for it to have any reasonable chance of working:

    I have optimized the AI models a lot to minimize RAM consumption and execution time, despite this however to be able to use the app without the risk of crashing you need a phone with at least 6GB of RAM, and to have a good enough execution time you need a phone with a fast enough CPU.

    Edit:

    I might not be entirely right. I also came across LibreTranslate, which for translation uses an offline library called argos-translate. It still uses neural machine translation but might have slightly lower hardware requirements? Not entirely sure about that last bit, but figured I'd throw it in there.
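
For anyone who wants to experiment with gluing the pieces together themselves, here is a rough sketch of the three-stage chain described above (speech-to-text, then translation, then speech synthesis). The `SpeechTranslator` class and the `argos_translator` helper are hypothetical names made up for illustration, not part of any app mentioned in this thread. The only real library call is `argostranslate.translate.translate(text, from_code, to_code)`, and the sketch assumes an English-to-Korean argos-translate language package exists and is already installed. The speech-to-text and synthesis stages are left as plug-in callables because the right engine depends on the device.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain callable, so any offline engine can be plugged in.
SpeechToText = Callable[[bytes], str]   # audio in the source language -> text
Translate = Callable[[str], str]        # source text -> target text
TextToSpeech = Callable[[str], bytes]   # target text -> synthesized audio


@dataclass
class SpeechTranslator:
    """Hypothetical glue class: chains STT -> translation -> TTS."""
    stt: SpeechToText
    translate: Translate
    tts: TextToSpeech

    def run(self, audio: bytes) -> bytes:
        text = self.stt(audio).strip()
        if not text:
            # Nothing was recognized, so there is nothing to translate or speak.
            return b""
        return self.tts(self.translate(text))


def argos_translator(from_code: str, to_code: str) -> Translate:
    """Wrap argos-translate as a translation stage.

    The import is deferred so the rest of the sketch runs without the
    library. Assumes the matching language package is already installed.
    """
    def _translate(text: str) -> str:
        import argostranslate.translate  # pip install argostranslate
        return argostranslate.translate.translate(text, from_code, to_code)
    return _translate


if __name__ == "__main__":
    # Stand-in stages so the chain can be exercised without any models.
    demo = SpeechTranslator(
        stt=lambda audio: audio.decode("utf-8"),
        translate=str.upper,
        tts=lambda text: text.encode("utf-8"),
    )
    print(demo.run(b"hello"))  # prints b'HELLO'
```

Swapping `translate=str.upper` for `argos_translator("en", "ko")` would route the middle stage through argos-translate. Whether that is fast enough on an older phone is exactly the open question here, so treat this as a way to test the idea on a desktop first.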