Asr - Automatic Speech Recognition

Only available on StudyMode
  • Topic: Speech recognition, Speech synthesis, Speech processing
  • Pages : 3 (571 words )
  • Download(s) : 26
  • Published : May 13, 2013
Open Document
Text Preview
ASR - Automatic Speech Recognition

Automatic speech recognition transformations of acoustic micro structure of speech signal into its implicit phonetic macro-structure. In other words, a speech recognition system is a speech-to-text conversion wherein the output of the system displays text corresponding to the recognized speech.

Typology of ASR systems

Several ASR systems can be developed, depending on:

• Speaker-dependent vs. independent

• Language constraints:

o Isolated word recognition

o Connected word recognition

o Continuous speech recognition

o Keyword spotting

Approaches to ASR

Pattern recognition approach

Pattern training and pattern comparison are the two essential steps in this approach. First feature measurement is done through Filter Bnk, LPC, DFT. Then pattern training is done by creation of a reference pattern derived from an averaging technique. Next step is comparing speech patterns with a local distance measure and a global time alignment procedure (DTW). Similarity scores are used to decide which the best reference pattern is.

Acoustic-Phonetic approach

This is also known as rule-based approach. Here we use knowledge of phonetics and linguistics to guide search process. Usually some rules are defined expressing everything (anything) that might help to decode. At each decision point, lay out the possibilities and apply rules to determine which sequences are permitted.

Template bases Approach

In this approach, a collection of prototypical speech patterns are stored as reference patterns which represents the dictionary of candidate words. An unknown spoken utterance is matched with each of these reference templates and a category of the best matching pattern is selected. DTW is used to find best possible alignment.

Stochastic Approach

This approach is based on the use of probabilistic models so that uncertain or incomplete information, such as confusable...
tracking img