Automatic Speech Recognition Systems

Automatic Speech Recognition Systems
ITM 5000
Kevin Curtis

Week 9
December 14th, 2009
Mike Sticksel

This paper evaluates several automatic speech recognition (ASR) software packages. For each ASR system, the author addresses the following questions: the price point of the software; whether the system is speaker independent or speaker dependent; whether it supports continuous or discrete speech recognition; and whether add-on vocabularies are available for purchase. In addition, the author evaluates his level of comfort in speaking the contents of a term paper as opposed to typing one and, lastly, the level of organization required to use speech recognition as opposed to typing.

The first automated speech recognition system the author analyzes is produced by Application Technology, or AppTek. AppTek is located in McLean, Virginia, and has been in the Human Language Technology field for over 20 years. AppTek’s ASR product is called PlainSpeech and is used for speech dictation, broadcast, and telephony. The program can handle anything from a simple chain of numbers to vocabularies of up to 100,000 words. PlainSpeech recognizes continuous speech and offers gender-independent recognition in both speaker-dependent and speaker-independent modes. It also offers a scalable vocabulary and a scalable number of recognized languages. At this time, however, the author was unable to locate a price for this product on the manufacturer’s website (apptek.com, 2009).

The next product is Dragon NaturallySpeaking, from a company called Nuance. Nuance offers several versions, from a basic edition to a more advanced version for legal professionals. It also offers several accent features, from Spanish to Southern, as well as several vocabulary options. Nuance allows for custom creation



References:
http://www.apptek.com/index.php/plainspeech
http://www.nuance.com/naturallyspeaking/products/product-matrix.asp
http://www.macspeech.com/pages.php?pID=149
