  • Published: March 11, 2013
B Tech Project

MACHINE UNDERSTANDING

OF

INDIAN SPOKEN LANGUAGES

By

Abhishek Agarwal
200101229

Dhirubhai Ambani Institute of Information &
Communication Technology
Gandhinagar, GUJARAT

April 2005

CERTIFICATE

This is to certify that the Project Report titled “Machine Understanding of Indian Spoken Languages”, submitted by Abhishek Agarwal (ID 200101229) in partial fulfillment of the requirements for the B Tech (ICT) degree of the institute, embodies the work done by him off campus under my supervision.

Date: _____________ Signature: _________________

(Dr. Sandeep Sibal)

Date: _____________ Signature: _________________

(Prof. Prabhat Ranjan)

ACKNOWLEDGEMENTS

This project involved the collection and analysis of information from a wide variety of sources and the efforts of many people besides me. It would not have been possible to achieve the results reported in this document without their help, support and encouragement.

I would like to express my gratitude to the following people for their help in the work leading to this report:

➢ Dr. Sandeep Sibal, Dr. Prabhat Ranjan & Dr. Jitendra Ajmera; Project supervisors: for their useful comments on the subject matter and for the knowledge I gained by sharing ideas with them.
➢ Prof. Minal Bhise; Project Coordinator: for organizing and coordinating the BTech Projects’ 2001.

ABSTRACT

Language identification is the process of identifying the language being spoken from a sample of speech by an unknown speaker. Most previous work in this field is based on the fact that phoneme sequences have different occurrence probabilities in different languages, and the systems designed so far have tried to exploit this fact.

The language identification process consists of two sub-systems. The first converts speech into an intermediate form called phoneme sequences. The second models the language through a probabilistic analysis of these sequences. This project targets both sub-systems. First, some algorithms for designing language models are discussed. Then an attempt is made to design an algorithm for extracting phoneme sequences in the form of more abstract classes derived with statistical tools such as Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM).
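The second sub-system can be illustrated with a small sketch. It is not the algorithm developed in this report, only a minimal example of the general idea, assuming that phoneme sequences are already available as lists of labels. The function names, the toy phoneme labels, the interpolation weight and the probability floor are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def train_bigram_model(sequences):
    """Count unigrams, bigram contexts and bigrams over phoneme sequences."""
    unigrams, contexts, bigrams = Counter(), Counter(), Counter()
    for seq in sequences:
        unigrams.update(seq)
        contexts.update(seq[:-1])          # phonemes that have a successor
        bigrams.update(zip(seq, seq[1:]))  # adjacent phoneme pairs
    return unigrams, contexts, bigrams


def log_likelihood(seq, model, bigram_weight=0.7, floor=1e-6):
    """Score a sequence with interpolated unigram and bigram probabilities.

    bigram_weight and floor are placeholder values (assumptions); the floor
    keeps unseen phonemes from producing log(0).
    """
    unigrams, contexts, bigrams = model
    total = sum(unigrams.values())
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p_uni = unigrams[cur] / total
        p_bi = bigrams[(prev, cur)] / contexts[prev] if contexts[prev] else 0.0
        p = bigram_weight * p_bi + (1.0 - bigram_weight) * p_uni
        score += math.log(max(p, floor))
    return score


def identify(seq, models):
    """Return the language whose model gives the highest log-likelihood."""
    return max(models, key=lambda lang: log_likelihood(seq, models[lang]))


# Toy training data with made-up phoneme labels "a" and "b".
models = {
    "lang_a": train_bigram_model(["a b a b a b".split(), "b a b a".split()]),
    "lang_b": train_bigram_model(["a a b b a a".split(), "b b a a b b".split()]),
}

print(identify("a b a b".split(), models))
```

Working in log space avoids numerical underflow on long utterances, and the floor acts as a simple penalty for phoneme pairs that never occurred in the training data of a given language.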

TABLE OF CONTENTS

CERTIFICATE

ACKNOWLEDGEMENTS

ABSTRACT

TABLE OF CONTENTS

1. Introduction

2. Background Work

2.1. Distinct Characteristics of Language
2.2. Overview on Language Identifiers
2.2.1. Front-End Processing
2.2.2. Phoneme Recognizer
2.2.3. Language Models
2.3. Example: Phonetic Recognition/Language Modeling

3. Objectives

4. Discussion – on language models

4.1. Language Model – Training Phase
4.2. Language Model – Tuning Phase
4.2.1. Penalty for zero probability phonemes
4.2.2. Weights for Unigram, Bigram, and Trigram probabilities
4.2.3. Utterance Duration
4.3. Language Model – Testing Phase

5. Discussion – Other approaches

5.1. Recognition based on distinct phonemes
5.1.1. Graphs for Unigram and Bigram probabilities
5.1.2. Observations
5.1.3. Conclusion
5.2. Using Gaussian Mixture Models for Front end processing

6. Discussion – Tools based on language ID

6.1. Content Verification System (CVS)
6.2. Call routing for customer...