Sunday, February 17, 2019

How does voice to text work?



Introduction



You probably say “OK Google” or “Hey Siri” several times a day, and because these assistants are so easy to use, it is tempting to think the technology behind them is simple. It is not. The technology is called voice recognition.

In the beginning, voice commands were built only for specific machines and specific words, so that a machine could be given instructions without being touched. The commands were limited to words like “ON” and “OFF”. The machine stored a digitized version of each spoken command, compared the voice input against that stored data, and carried out the instruction that matched. At that time, the data had to be stored word by word in every machine. Nowadays, whatever the user says is matched to the closest candidate the system knows; there is no strict right or wrong.

You may be wondering: every language has thousands of words, so is it possible to store every word? Yes, by using cloud computing. Every language has a very large vocabulary, and thousands of words have more than one meaning. To decide which word was meant, the machine looks at the words around it, that is, at its context.

You have probably also used a digital assistant, or the speech option in Windows, where you can type by speaking. On the iPhone, “Hey Siri” revolutionized voice to text. Google then came out with “OK Google”, which also supports languages other than English. This raises the question: how can voice to text handle thousands of languages, millions of commands and billions of words? The answer is cloud computing. If your phone is connected to the internet, it sends your voice data as input to a cloud server, where the data is stored in the form of questions, phrases and so on.

The service works in real time: as you speak, the device sends the audio to the server, and the server tries to predict the next word you are going to say from the context of your earlier words. It looks at the last word you said and guesses what the next one will be. Once you finish your statement, the remaining data is sent to the server and you get the result.

So voice to text focuses on the context of your statement. For example, if you ask “What is the net worth of Jeff Bezos?” and then ask “What is his current age?”, you still get the right result. You did not mention the name the second time, but the question is still about a person, who is none other than Jeff Bezos.
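To make the idea of guessing the next word from the last word more concrete, here is a minimal sketch in Python. It is not how Google or Apple actually do it; it only counts which word most often follows another in a handful of made-up phrases. The names `build_bigram_model`, `predict_next` and `PHRASES` are invented for this illustration.

```python
from collections import Counter, defaultdict


def build_bigram_model(sentences):
    """Count, for every word, which words follow it in the given sentences."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for previous, current in zip(words, words[1:]):
            followers[previous][current] += 1
    return followers


def predict_next(model, last_word):
    """Return the word seen most often after last_word, or None if unseen."""
    counts = model.get(last_word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical phrases standing in for the data stored on the server.
PHRASES = [
    "what is the net worth of jeff bezos",
    "what is the weather today",
    "what is the time",
    "book a table for two",
]

model = build_bigram_model(PHRASES)
print(predict_next(model, "what"))   # is
print(predict_next(model, "is"))     # the
print(predict_next(model, "zebra"))  # None (never seen in the phrases)
```

A real service works with far more data and far richer models, but the principle is the same: the more often a word has followed another, the more likely it is to be the next guess.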



How voice to text works


Whenever you speak, you create vibrations, and these vibrations are nothing but analog waves. An analog-to-digital converter turns these analog waves into digital data that the computer can understand. To digitize the sound, it takes exact measurements of the sound wave at frequent intervals. The system then filters the digitized sound to remove unwanted noise. People do not always speak at the same speed, so the system also has to allow for differences in tempo.

Next, the signal is divided into short segments, on the order of 1/100th of a second each. The program matches these segments against known “phonemes”, the basic sounds of a language. In the last step, the program examines each phoneme in the context of the sounds and words around it, runs the phonemes through a statistical model, and compares them with the words stored in the machine. Once the system has worked out what was said, whether it is a text request or a computer command, it carries it out.
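The first two steps, measuring the wave at regular intervals and cutting it into 1/100th-of-a-second pieces, can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: a sample rate of 16,000 measurements per second, 16-bit samples, and a pure 440 Hz tone standing in for a voice. The function names are made up for this example.

```python
import math

SAMPLE_RATE = 16000  # measurements per second (assumed for this sketch)
FRAME_MS = 10        # 10 milliseconds = 1/100th of a second
MAX_LEVEL = 32767    # largest value of a signed 16-bit sample


def digitize(analog, duration_ms, sample_rate=SAMPLE_RATE):
    """Measure the analog wave at regular intervals and round to integers."""
    count = sample_rate * duration_ms // 1000
    return [round(analog(n / sample_rate) * MAX_LEVEL) for n in range(count)]


def split_into_frames(samples, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Cut the samples into consecutive frames of frame_ms milliseconds."""
    frame_length = sample_rate * frame_ms // 1000
    last_start = len(samples) - frame_length
    # Any incomplete frame at the very end is dropped.
    return [samples[i:i + frame_length]
            for i in range(0, last_start + 1, frame_length)]


def tone(t):
    """Stand-in for a voice: a 440 Hz tone at half of full volume."""
    return 0.5 * math.sin(2 * math.pi * 440 * t)


samples = digitize(tone, duration_ms=100)
frames = split_into_frames(samples)
print(len(samples))    # 1600 measurements for 0.1 seconds of sound
print(len(frames))     # 10 frames
print(len(frames[0]))  # 160 measurements in each frame
```

In a real recognizer, each of these frames would then be turned into features and compared with known phonemes; that part is left out here because it depends heavily on the model being used.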


Google Duplex


Google Duplex is a smartphone technology in which the machine carries out a task for you by holding a conversation. It is a more advanced version of “OK Google”, in which the machine speaks in a human-sounding voice, taking pauses and using fillers like “umm” and “hmm”. For example, if you want to book a table for two tonight at a particular restaurant, all you need to do is ask Google Duplex to reserve a table for two at “XYZ” restaurant. The machine then calls the restaurant, first tells the person who answers that they are talking to a machine, and, after booking the table, informs you.



