27 votes

Spotify (with OpenAI) is going to clone podcasters’ voices — and translate them to other languages

20 comments

  1. [10]
    drannex

    Ignoring the obvious immediate moral quandaries, this is an insanely cool application of technology. Pure science fiction coming right at you.

    31 votes
    1. [9]
      Jordan117

      Ignoring the obvious immediate moral quandaries

      What's the problem? It sounds like it's opt-in.

      I'd be interested in seeing how it handles laughter, cross-talk, and other non-verbal cues. From what I've seen, AI voice systems like ElevenLabs are spooky-good at intonation and realism, but get them outside their comfort zone of ad copy and audiobook narration and they can get pretty weird-sounding.

      17 votes
      1. [8]
        yosayoran

        The problem is that it takes away work from voice artists and translators.

        I honestly kinda doubt this will be a very profitable endeavour for the vast majority of podcasts. I mean, even if the translation and voice over are perfect it's just adding to an already flooded market.

        For example, even in my small native language (Hebrew, less than 10 mil native speakers) you can find quality podcasts about pretty much anything.

        11 votes
        1. [3]
          sparksbet

          There are probably languages with even fewer native speakers than Hebrew that might benefit from something like this -- but such languages very rarely have machine learning models trained on them and when they do the results tend to be a lot worse. So it doesn't actually help those who might genuinely struggle to find content in their native language.

          11 votes
          1. [2]
            yosayoran

            Yup that's also a great point.

            Language models in Hebrew suck ass, which is kind of ironic because a lot of the development centers for these companies are in Israel.

            A friend of mine worked on a Google product, I think their keyboard but can't remember (it was like 5 years ago), and they had to finish development on like 7 languages before he was allowed to start planning for Hebrew.

            3 votes
            1. sparksbet

              Oh yeah I work as a data scientist on language data and my boss is from Israel so Hebrew's always one of the ones we test on. Hebrew's in exactly the right middle-ground where it's got enough speakers and development that it gets attempted, but not enough that the model gets enough data and effort to be properly good. Plus the RTL factor is always fun to see frontend stuff deal with.

              4 votes
        2. [3]
          burkaman

          The problem is that it takes away work from voice artists and translators.

          Agreed in general, but I don't think I've ever heard of a podcast being translated before, and they've been around for a while. This specific application is not taking work away from anyone - podcasts have been around for 20 years and nobody yet has been willing to pay voice actors and translators to do this work.

          7 votes
          1. yosayoran

            I certainly have seen it at least a couple of times:

            One, from Hebrew to English
            Two, from English to Spanish

            This is for informational history type of podcasts, not the "just talking" or talk-show types.

            Honestly I'm very doubtful AI could recreate the tone and vibe of a conversation in a way that would keep the listening experience of those types of podcasts interesting, but I suppose time will tell.

            5 votes
          2. boredop

            My company (WNYC) did one called La Brega, with versions in English and Spanish released simultaneously.

            1 vote
        3. skybrian

          Yes, for the average podcast, maybe nobody would care, but I imagine some famous podcasters might gain a worldwide audience somehow? Could that happen for the best Hebrew-language podcasters? Or maybe not the best, but the ones that are more interesting to an international audience?

          I’m reminded of how big Hollywood movies are released worldwide and that’s a significant part of their revenue. It doesn’t mean local movies don’t exist, but it changes things. Other countries have entertainment industries that do well internationally, too.

          For current events it might matter more, since AI could do a quicker translation. I don’t think there’s as much interest now, but at one time there might have been a lot of interest in a Ukrainian podcast?

          I imagine there will be talented producers who know how to get the best results out of the tools, as we see in music. Possibly, if done well enough, it might be seen as “more authentic,” closer to what the original speaker would have said. I could also see these algorithms being messed with creatively for weird special effects, sort of like auto tune.

  2. [9]
    thefilmslayer

    Not sure how I feel about this. Of course there's the moral/legal aspect of it, but something I don't think they've really considered is localization. It's the same problem people translating stuff like anime have; what do you do when idioms/sayings/etc don't translate into other languages because of a cultural filter?

    20 votes
    1. [5]
      redwall_hp
      (edited )

      In the case of anime, leave it alone and put a translation note so we can learn something new. I despise localization. Part of the appeal of anime is it's a separate cultural context, instead of more of the same Hollywood machine. It's also super cringy when translators inject random slang and memes. I also dislike it when translators omit things like honorifics, which is stripping out a layer of social metadata that is interesting and sometimes funny, simply because some people don't care to learn.

      Machine translation is incredibly bad with Japanese though, in both directions. It can get you the gist of something that's not too idiomatic, but it wouldn't be pleasant to listen to...and Google Translate famously insults people. I'd give a service like this a day or two before there are news articles about podcast hosts being incredibly offensive in translated versions.

      17 votes
      1. [3]
        thefilmslayer

        I agree 100%. The best translations are the ones where the language is translated verbatim, and they have a handy explanation below the subtitles in a different font. They don't change the meaning, but instead explain what it means to people not familiar with the culture.

        6 votes
        1. smiles134
          (edited )

          I strongly disagree with this, but it also depends on what's being translated. For example, literal/verbatim translations of Homer into English suffer because those Greek epics are written in dactylic hexameter, but in English it would completely lose the meter or rhythm, and wouldn't make much sense since word order doesn't matter nearly as much in Ancient Greek as it does in English. Translations like Robert Fitzgerald's, which capture the essence/spirit of the original and recast it in a verse much more familiar to the target audience (i.e., modern English speakers), are far more successful and readable.

          Of course there are entire fields of study devoted to this conversation, but I do think there's something lost in literal word-for-word translations.

          16 votes
        2. redwall_hp

          That sort of setup is very useful for comedies, since jokes can be complex and not only hinge on idioms or cultural differences, but may even be multilingual puns. Even if something isn't immediately funny in a translation, I'd rather see the original joke and an explanation than some hamfisted replacement.

          3 votes
      2. lou
        (edited )

        I am the one person who loves localization and prefers anime and manga to flow in the target language. Footnotes are informative, but they are no fun.

        I don't need to know that the characters are eating a bean cake invented by the Samurai in the Edo period. I wanna follow a story, and the mini-Wikipedia on the subtitles breaks immersion.

        The Japanese watch it for the story as well, and without subtitles, so localization helps the source material to be experienced as it should -- an engaging story, not a list of nitpicks about Japanese culture.

        I understand I'm the only one, but, well, here I am!

        4 votes
    2. [3]
      stu2b50

      Of course there's the moral/legal aspect of it

      What's the issue? It's presented as a creator tool, so that if you use the platform you can use this tool to make translated versions of your work in different languages. If you don't consent to that, you presumably... don't use the tool.

      something I don't think they've really considered is localization

      Why is that a problem? Like you said, this is present in translation at large, and you just deal with it as a consumer of translated works. Does this prevent people from translating Japanese in anime? Certainly not.

      I sometimes use YouTube's translated CC - it has a lot of issues, as information is lost not only in translation but also in speech-to-text, but it's what's available. I understand that and its faults, but it is nonetheless useful. You don't need to let perfect be the enemy of good.

      5 votes
      1. [2]
        thefilmslayer

        I quite literally stated the issue with localization in my previous post. You appear to have completely misunderstood what I wrote. Nowhere did I say it prevents people translating anime, but often the meaning of the translation is different because some things don't exist in other cultures and have to be substituted. It's not as simple as just running stuff through a machine translator.

        5 votes
        1. stu2b50

          I think you’re missing my point. Yeah, it’ll translate some things poorly. It is what it is. As machine translation improves it’ll gradually become better but otherwise you just deal with it. For specific translations, it’ll depend on what machine translator you use whether it tries to localize or not. That’s something that can be controlled, although it’ll never be perfect.

          It is better than nothing. Just like how machine translation is used today. But more accessible now that we can synthesize fairly pleasing voices.

          2 votes
  3. EarlyWords

    As both a podcaster and audiobook narrator, the future sure is coming at me fast. At least I can count on being unique a little longer by depending on my character voices and method acting. I figure those kinds of deep emotional contexts will be the most difficult to reproduce.

    5 votes