Google's new AI system can articulate like humans


Context

In a major step towards its “AI first” dream, Google has developed a text-to-speech artificial intelligence (AI) system whose human-like articulation can be hard to tell apart from a real voice.

Tacotron 2

The tech giant’s text-to-speech system, called “Tacotron 2”, produces AI-generated speech that almost matches the human voice.

How does the system work?

  • The system first creates a spectrogram of the text, a visual representation of how the speech should sound
  • That spectrogram is passed to Google’s WaveNet algorithm, which turns it into an audio waveform, bringing AI closer than ever to mimicking human speech. The system can easily learn different voices and even generates artificial breaths
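
To make the idea of a spectrogram concrete, here is a minimal Python sketch. It is not Google's code: Tacotron 2 predicts its spectrogram from text with a neural network, whereas this example goes the other way and computes one from a synthetic tone, only to show what such a time-by-frequency grid contains. The function name, frame size, hop size and sample rate are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window: fades the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Plain discrete Fourier transform, keeping non-negative frequencies.
        bins = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# Hypothetical input: a 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

spec = magnitude_spectrogram(tone, frame_size=FRAME_SIZE, hop=32)

# The strongest bin in the first frame tells us the dominant frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(f"{len(spec)} frames x {len(spec[0])} bins, peak near {peak_hz:.0f} Hz")
```

Each row of `spec` is a slice of time and each column a frequency band. Drawn as an image, this grid is the "visual representation" of sound that the second stage converts into audio.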

Mean Opinion Score (MOS)

“Our model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech,” the researchers were quoted as saying

What is MOS?

It is a numerical method of expressing voice and video quality

  • MOS gives a numerical indication of the perceived quality of the media received after being transmitted and eventually compressed using codecs
  • MOS is expressed as a single number from 1 to 5, 1 being the worst and 5 the best. MOS is quite subjective, as it is based on figures that result from what people perceive during tests. However, there are software applications that measure MOS on networks
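
As its name suggests, a mean opinion score is the arithmetic mean of the ratings that individual listeners give on the 1-to-5 scale. The short Python sketch below shows that calculation. The function name and the listener ratings are hypothetical and are not taken from Google's study.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each on the 1 (worst) to 5 (best) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return mean(ratings)


# Hypothetical ratings from eight listeners for one speech sample.
ratings = [5, 4, 5, 4, 4, 5, 5, 4]
mos = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f}")
```

Because the result is an average over many listeners, it is usually a fraction such as 4.53 rather than a whole number.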

MOS values

Taken as whole numbers, the scores are quite easy to grade.

5 – Perfect. Like face-to-face conversation or radio reception

4 – Fair. Imperfections can be perceived, but sound still clear. This is (supposedly) the range for cell phones.

3 – Annoying

2 – Very annoying. Nearly impossible to communicate.

1 – Impossible to communicate

AI first

At the Google I/O 2017 developer conference, the company’s CEO announced that the internet giant was shifting its focus from mobile-first to “AI first” and launched several products and features, including Google Lens, Smart Reply for Gmail and Google Assistant for iPhone.
