Articles repérés par Hervé Le Crosnier

Je prend ici des notes sur mes lectures. Les citations proviennent des articles cités.

  • Google’s Translatotron can translate speech in the speaker’s voice
    https://www.engadget.com/2019/05/15/google-translatotron-direct-speech-translation/?guccounter=1
    https://o.aolcdn.com/images/dims?thumbnail=1200%2C630&quality=80&image_uri=https%3A%2F%2Fo.aolcdn.com%

    Speaking another language may be getting easier. Google is showing off Translatotron, a first-of-its-kind translation model that can directly convert speech from one language into another while maintaining a speaker’s voice and cadence. The tool forgoes the usual step of translating speech to text and back to speech, which can often lead to errors along the way. Instead, the end-to-end technique directly translates a speaker’s voice into another language. The company is hoping the development will open up future developments using the direct translation model.

    According to Google, Translatotron uses a sequence-to-sequence network model that takes a voice input, processes it as a spectrogram — a visual representation of frequencies — and generates a new spectrogram in a target language. The result is a much faster translation with less likelihood of something getting lost along the way. The tool also works with an optional speaker encoder component, which works to maintain a speaker’s voice. The translated speech is still synthesized and sounds a bit robotic, but can effectively maintain some elements of a speaker’s voice. You can listen to samples of Translatotron’s attempts to maintain a speaker’s voice as it completes translations on Google Research’s GitHub page. Some are certainly better than others, but it’s a start.

    Model architecture of Translatotron

    Google has been fine-tuning its translations in recent months. Last year, the company introduced accents in Google Translate that can speak a variety of languages in region-based pronunciations and added more langauges to its real-time translation feature. Earlier this year, Google Assistant got an “interpreter mode” for smart displays and speakers that can between 26 languages.

    #Intelligence_artificielle #Traduction_automatique #Google