Speech synthesis is the artificial production of human speech. It is the process by which a computer replicates the human voice, ideally with no change in frequency, decibel level, and so on. “A Speech synthesizer is a computer-based system that should be able to read any text aloud, whether it was directly introduced in the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system” (Dutoit, 1996).
Joseph P. Olive’s essay “The Talking Computer” summarizes how the talking computer evolved from the mechanical age to today’s cutting-edge speech synthesis technology. Speech synthesis has been a key area of interest for many research scholars in Artificial Intelligence. Olive traces the evolution of the talking computer from mechanical models, in which a person’s voice was recorded and presented to an audience during a movie as though the idols could speak. “HAL was definitely ahead of its time. For most of the people in the audience a computer was something out of science fiction. Its typical embodiment was an array of tall cases containing spinning tapes, a large box for the computer's memory and CPU (central processing unit), and machines that printed out pages and pages of wide sheets filled with numbers and obscure symbols.” (Olive, 1997)
The toughest problem likely to be encountered when machines synthesize human speech is that humans have great flexibility to vary the shape of their instrument, the vocal tract. “Speech understanding is much more difficult than natural language understanding. Words are often spoken in a run-on manner that makes it difficult to recognize where one word ends and another begins. Great variations in pronunciations exist among people: regional accents can make extreme differences in pronunciation; women generally have higher pitched voices than men; stress and intonation pattern vary; some people stammer and some drop endings of words; and even the same person's speech may vary from day to day as a result of stress, mood or something else.” (Ghosh, 2000)
1.2 The thesis
Speech synthesis plays a major role in how machines communicate with the world. The aim is to produce a system that understands human feelings, ideas, and behavior, and that communicates at the same pitch as a human.
My first source
Sneaky tricks for speech synthesizers.
Battino, D. (October, 2004). Sneaky tricks for speech synthesizers. Retrieved April 13, 2005 from
2.2 Summary of key ideas
This article describes a speech synthesis tool called Spell Catcher, which includes an excellent system-wide spell-checking program. It monitors every word and, if it encounters a mistake, either corrects it automatically or announces the spelling mistake. Its auto-correction feature is also invaluable: whenever an abbreviation is typed, for example “adr,” the program dutifully types out the author’s address.
It also describes special features, including a spelling pane that ignores mixed-case words during checking and an interactive pane in which the spell checker makes replacements directly, without any warning. The article also introduces a recent development in music synthesizers that can actually sing lyrics. VocalWriter is also a full-featured MIDI file editor in which adding and editing lyric tracks is fully automated. “According to the AT&T site, the files it generates are for your own amusement, not commercial use. But there’s a ton of amusement potential once you start making the foreign voices speak English phrases. I don’t feel so bad about making capitalization errors when I'm reprimanded by a French maid” (Battino, 2004)
My second source
Tomorrow’s web Today