Nowadays, computer systems play a major role in our lives. They are used everywhere: in homes, offices, restaurants, gas stations, and so on. Nonetheless, for some people, the computer is still a machine they will never know how to use. Communicating with a computer is done using a keyboard or a mouse, devices many people are not comfortable using. Speech recognition addresses this problem and breaks down the barrier between humans and computers. Using a computer would then be as easy as talking with a friend.
Unfortunately, scientists have discovered that implementing a perfect speech recognition system is no easy task. This report will present the principles and the major approaches to speech recognition systems along with some of their applications.
Overview of the Characteristics of Automatic Speech Recognition Systems
How can we evaluate a speech recognition system? Simply describing it as good or bad isn’t enough, since such a system may perform outstandingly in one application and poorly in another. In fact, speech recognition systems are designed around the application, so their characteristics vary from one system to another. Some of these characteristics are presented below.
Number of Words
The major characteristic of a speech recognition system is the number of words it can recognise. The question that comes to mind is how many words are enough for the performance of a speech recognition system to be acceptable. The answer depends on the application (6, p.98). Some applications may require only a few words, like automated call-type recognition; others may require thousands, like data entry. However, increasing the number of words, that is, the vocabulary, of a speech recognition system increases its complexity and decreases its performance, since the probability of error is higher (6, p.98). Systems with large vocabularies are also slower, since more time is needed to search for a word in a large vocabulary. Moreover, increasing the number of words isn’t enough on its own, because a speech recognition system cannot differentiate between words that sound alike, such as ‘to’ and ‘two’ or ‘right’ and ‘write’ (6, p.98).
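The sound-alike problem can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon that maps each word to an ARPAbet-style phoneme string; the names LEXICON, build_sound_index and candidates are illustrative and do not come from the cited source. Looking a sound up in the lexicon returns every word that shares it, so the sound alone cannot settle which word was spoken.

```python
from collections import defaultdict

# Hypothetical toy lexicon (assumption): word -> ARPAbet-style phoneme string.
LEXICON = {
    "to": "T UW",
    "two": "T UW",
    "right": "R AY T",
    "write": "R AY T",
    "call": "K AO L",
}


def build_sound_index(lexicon):
    """Invert the lexicon: phoneme string -> list of words pronounced that way."""
    index = defaultdict(list)
    for word, phonemes in lexicon.items():
        index[phonemes].append(word)
    return index


def candidates(index, phonemes):
    """Return every word in the vocabulary that matches the given sound."""
    return sorted(index.get(phonemes, []))


if __name__ == "__main__":
    index = build_sound_index(LEXICON)
    print(candidates(index, "T UW"))    # two words share this sound
    print(candidates(index, "K AO L"))  # only one word matches
```

Adding more words to such a lexicon does not remove the ambiguity: the sound-alike pairs stay in the same bucket no matter how large the vocabulary grows.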
Use of Grammar
Using grammar, it becomes possible to differentiate words like ‘to’ and ‘two’ or ‘right’ and ‘write’. Grammar is also used to speed up a speech recognition system by narrowing the range of the search (6, p.98). It also improves the performance of a speech recognition system by eliminating inappropriate word sequences. However, grammar doesn’t allow random dictation, which is a problem for some applications (6, p.98).
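As a rough illustration of how a grammar narrows the search, the sketch below assumes a hypothetical toy grammar that lists, for each word, the words allowed to follow it. The names GRAMMAR and pick_word are assumptions made for this example; real systems use richer grammars, and this is not the method of the cited source. Candidates that sound alike are kept only if the grammar permits them after the previous word.

```python
# Hypothetical toy grammar (assumption): word -> set of words allowed to follow it.
GRAMMAR = {
    "<start>": {"turn", "write", "call"},
    "turn": {"right", "left"},
    "write": {"to", "two"},
    "to": {"john"},
    "two": {"letters"},
    "call": {"john"},
}


def pick_word(previous, acoustic_candidates):
    """Keep only the sound-alike candidates that the grammar allows after `previous`."""
    allowed = GRAMMAR.get(previous, set())
    return sorted(w for w in acoustic_candidates if w in allowed)


if __name__ == "__main__":
    # After 'turn', only 'right' is a valid continuation.
    print(pick_word("turn", ["right", "write"]))
    # At the start of a sentence, only 'write' is valid.
    print(pick_word("<start>", ["right", "write"]))
    # The grammar does not always resolve the ambiguity by itself.
    print(pick_word("write", ["to", "two"]))
    # A word outside the grammar has no allowed successors at all.
    print(pick_word("hello", ["to", "two"]))
```

The last call shows the cost mentioned above: any word sequence the grammar does not anticipate is rejected, which is why random dictation is not possible under such a scheme.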
Continuous vs. Discrete Speech
When speaking to each other, we don’t pause between words. In other words, we use continuous speech. However, speech recognition systems have difficulty dealing with continuous speech (6, p.98). The easy way out is to use discrete speech, where we pause between words (6, p.100). With discrete speech input, the silent gap between words is used to determine the...