Azure AI Speech Service: Breaking Language Barriers with Video Translation

Azure AI Speech Service has evolved steadily since its launch, letting users build voice-enabled, multilingual generative AI applications with fast transcription and natural-sounding voices. It converts text to speech and speech to text, translates videos into multiple languages, supports real-time interactions through APIs, and much more, as detailed in the official product documentation.

In May of this year, Microsoft announced two major updates to the Azure AI Speech Translation product suite: Video Translation and an enhanced Real-time Speech Translation API.

In this blog, we will focus on the Video Translation service. If you’re interested in the real-time Speech APIs, check out this impressive project by Aymen Furter: “An Interactive Text-to-Podcast Experience with GPT-4’s Real-Time API.”

Video translation unlocks business value across a wide range of scenarios involving authorized video content, such as:

  • TV shows, movies, and documentaries: film studios and production companies can translate movies and TV shows for international distribution, reaching a broader audience and maximizing revenue potential.
  • Education and training videos: educational institutions and training programs can translate and dub learning materials to provide accurate and timely information to audiences worldwide.
  • Advertising and marketing videos: businesses can localize their advertising and marketing videos to resonate with target audiences in different markets, enhancing brand awareness and customer engagement.

Transform a Video Into a Different Language:

For this task, I will use the “What is GitHub” YouTube video.

  • GitHub introduction video: we will translate it into Finnish.

https://www.youtube.com/watch?v=pBy1zgt0XPc

Let’s get started in the Azure portal:

1. Go to Azure AI services and choose Speech service from the catalog. Create a new Speech service resource.

2. After creating the resource, we need Speech Studio to translate the video. Speech Studio is a set of UI-based tools for building and integrating features from Azure AI Speech service in your applications. You create projects in Speech Studio by using a no-code approach, and then reference those assets in your applications by using the Speech SDK, the Speech CLI, or the REST APIs.

3. Select the Video Translation tile from the service catalog.

4. Create a new project and select a voice type. By default, you’ll get a prebuilt neural voice, but you also have the option to use a custom voice by providing a personal voice sample.

After choosing the voice type, upload the video and assign a unique project name. Next, select the original language of the video and the target language for translation.

There’s also an option to add a subtitle file.

5. After you create the project, the service starts processing your video. Once processing completes, you’ll see the translated video within your project. The top bar displays information such as the processing status, source and target languages, voice type, and the date and time of creation and last modification.

6. Just below the top bar, you have options to change settings such as the voice or to add new target languages. There is also a button to download the video.

7. In the main panel, the left-hand side shows the translated video, which you can play to check the results, along with the original video.

8. On the right-hand side, you get the text of the original video in the source language and the translated text.

9. This feature is especially helpful because you can review the translation and make any necessary corrections; your changes will be applied automatically.

10. To download the results, use the Download button on the top panel. This will provide you with three files: the translated video in .mp4 format, the subtitle file, and metadata in a JSON file.
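Once you have the downloaded files, the subtitle file is handy for reviewing the translation outside Speech Studio. The snippet below is a minimal sketch of how it could be read in Python. It assumes the subtitle file uses the WebVTT format with `hh:mm:ss.mmm` timestamps, which you should confirm against your own download. The `Cue` class, the `parse_vtt` function, and the sample Finnish text are hypothetical and only serve as an illustration.

```python
import re
from dataclasses import dataclass

# Matches a WebVTT timing line such as "00:00:01.000 --> 00:00:04.500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


def _seconds(h, m, s, ms):
    """Convert timestamp parts (as strings) to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(content: str) -> list[Cue]:
    """Parse WebVTT text into cues. Assumes hh:mm:ss.mmm timestamps."""
    cues = []
    # Cues are separated by blank lines; blocks without a timing line
    # (such as the "WEBVTT" header) are skipped.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break
    return cues


# Hypothetical sample; a real file would come from the Download button.
SAMPLE = """WEBVTT

00:00:01.000 --> 00:00:04.500
Mikä on GitHub?

00:00:04.500 --> 00:00:08.000
GitHub on alusta,
jossa kehittäjät tekevät yhteistyötä.
"""

for cue in parse_vtt(SAMPLE):
    print(f"{cue.start:7.2f}s -> {cue.end:7.2f}s  {cue.text}")
```

To use it on a real download, read the file with `open(path, encoding="utf-8").read()` and pass the result to `parse_vtt`. Having the cues as plain objects makes it easy to skim the Finnish text or hand it to a native speaker for review.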

Result

  • GitHub video translated to Finnish:

https://www.youtube.com/watch?v=NvjqcK8CyZg

Summary

Azure AI Video Translation offers a seamless solution to overcome language barriers. It’s easy to upload videos, translate them into different languages, and download the results directly through Speech Studio. For those who prefer a programmatic approach, Azure also provides REST APIs, which are described in the official documentation.
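To give a feel for the programmatic route, here is a rough sketch of what creating a translation through the REST API could look like, using the same English-to-Finnish scenario. Please treat every specific in it as an assumption: the URL path, the `2024-05-20-preview` API version, the header names, and the body fields reflect my understanding of the public preview and may have changed, so verify them against the current documentation. The region, key, translation ID, and video URL are placeholders, and `build_translation_request` is a hypothetical helper. The code only builds the request and does not send it.

```python
import json
import urllib.request
import uuid

# ASSUMPTION: preview API version; check the current documentation.
API_VERSION = "2024-05-20-preview"


def build_translation_request(region, key, translation_id, video_url,
                              source_locale="en-US", target_locale="fi-FI"):
    """Build (but do not send) a request that creates a video translation.

    ASSUMPTION: the URL path, header names, and body field names below are
    based on the public preview and are not guaranteed to be current.
    """
    url = (
        f"https://{region}.api.cognitive.microsoft.com/videotranslation/"
        f"translations/{translation_id}?api-version={API_VERSION}"
    )
    body = {
        "displayName": "What is GitHub (Finnish)",
        "input": {
            "sourceLocale": source_locale,
            "targetLocale": target_locale,
            "voiceKind": "PlatformVoice",  # prebuilt neural voice (assumed value)
            "videoFileUrl": video_url,     # must be readable by the service
        },
    }
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Operation-Id": str(uuid.uuid4()),  # assumed to identify this operation
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


# Placeholder values; replace them with your own resource details.
request = build_translation_request(
    region="westeurope",
    key="<your-speech-key>",
    translation_id="github-intro-fi",
    video_url="https://example.com/what-is-github.mp4",
)
print(request.get_method(), request.full_url)
```

Submitting it would be a matter of calling `urllib.request.urlopen(request)` with a real key and a video URL the service can access. Keeping the request construction separate from the network call makes it easy to inspect the payload before anything is sent.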

This exciting service has immense potential to enhance accessibility across various industries. Educational institutions and training programs can localize video materials to support learners globally, while businesses can adapt marketing videos to connect with diverse markets, strengthening customer engagement and brand presence.

Please note that this service is currently in public preview.

About the Author:

Tajinder Singh

My name is Tajinder Singh, and most of my friends and colleagues call me TJ. Currently, I am working at GitHub as a Solutions Engineer based in the beautiful city of Zurich, Switzerland. If you have any feedback regarding this blog, you can reach out to me via LinkedIn or the comments section.

LinkedIn profile: https://www.linkedin.com/in/tajinder-singh-74740115b/

Reference:

Singh, T (2024). Azure AI Speech Service: Breaking Language Barriers with Video Translation. Available at: Azure AI Speech Service: Breaking Language Barriers with Video Translation | by Tajinder Singh | Nov, 2024 | Medium [Accessed: 18th October 2024].
